Create LATENCY-OPTIMIZED-MODE.md #877

Merged (2 commits) on Nov 18, 2024
61 changes: 61 additions & 0 deletions doc/LATENCY-OPTIMIZED-MODE.md
@@ -0,0 +1,61 @@
# Latency Optimized Mode in Intel® Xeon® 6 Processors

Intel® Xeon® 6 Processors (previously codenamed Granite Rapids and Sierra Forest/Birch Stream platform) introduce a new power management mechanism called Efficiency Latency Control (ELC), designed to optimize performance per watt. This feature allows hardware power management algorithms to balance the trade-off between latency and power consumption. For latency-sensitive workloads, further tuning can be performed to achieve the desired performance.

The hardware monitors the average CPU utilization across all cores at regular intervals to determine an appropriate uncore frequency. While this approach generally results in optimal performance per watt, some workloads may achieve higher performance at the expense of increased power consumption. For instance, an application that intermittently performs memory reads on an otherwise idle system may experience delays if the hardware lowers the uncore frequency, causing a lag in ramping up to the required performance levels. To verify this, the uncore frequencies can be monitored using the pcm utility:

![Uncore Frequency Statistics DEFAULT](https://github.com/user-attachments/assets/108c7350-3fc2-4056-aeaf-ecc7c25da6bc)

The screenshot above presents real-time data on uncore frequency statistics, measured in GHz, from a dual-socket platform (represented by two rows). Each socket includes five dies (organized into five columns). The first three dies contain CORes (COR), Last Level Cache (LLC), and Memory controllers (M), collectively referred to as CORLLCM. The final two dies are IO dies.

The ELC control has parameters that can be adjusted either through BIOS or software tools. The default parameter configuration is optimized for performance per watt, ensuring power efficiency. The alternative configuration, known as Latency Optimized Mode, prioritizes maximum performance.
Below are the PCM statistics from a system operating in Latency Optimized Mode:

![Uncore Frequency Statistics Latency Optimized Mode](https://github.com/user-attachments/assets/70310bbc-725b-4450-af7a-1db2c04291dd)

## BIOS Options for Latency Optimized Mode

The BIOS option for selecting Default or Latency Optimized Mode is typically found in one of the following menus, depending on the BIOS version and OEM vendor:
- **Socket Configuration** -> **Advanced Power Management** -> **CPU – Advanced PM Tuning** -> **Latency Optimized Mode** (Disabled or Enabled)
- **System Utilities** -> **System Configuration** -> **BIOS/Platform Configuration (RBSU)** -> **Power and Performance Options** -> **Advanced Power Options** -> **Efficiency Latency Control** (Default or Latency Optimized mode)

If this BIOS option is unavailable, or if you prefer to change the mode at runtime, the PCM repository provides scripts for switching modes.

|Platform |Script Type| URL |
|------------------|-----------|---------------------------------------------------------------------|
|Linux/FreeBSD/UNIX|bash | https://github.com/intel/pcm/blob/master/scripts/bhs-power-mode.sh |
|Windows |powershell | https://github.com/intel/pcm/blob/master/scripts/bhs-power-mode.ps1 |

The scripts require the pcm-tpmi utility. There are several methods to obtain this utility:
- **Download or install precompiled PCM binaries:** Please refer to the following link: [Downloading Pre-Compiled PCM Tools](https://github.com/intel/pcm?tab=readme-ov-file#downloading-pre-compiled-pcm-tools)
- **Compile the utility:** Follow the instructions in the "Building PCM Tools" section available at: [Building PCM Tools](https://github.com/intel/pcm?tab=readme-ov-file#building-pcm-tools)

After obtaining the utility, place it where the scripts can find it:
- For Linux/FreeBSD: Copy the pcm-tpmi utility from PCM's source `build/bin` directory to `/usr/local/bin/`, or execute `make install` in the `build` directory.
- For Windows: Copy the pcm-tpmi utility to the current directory.

Once the pcm-tpmi binary is correctly placed, you can set the Latency Optimized Mode.
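
Before running the scripts on Linux/FreeBSD, it can help to confirm that the shell can actually find `pcm-tpmi`. The following is a minimal sketch of such a check; `require_tool` is a hypothetical helper name introduced here for illustration and is not shipped with PCM.

```bash
#!/usr/bin/env bash
# require_tool NAME: report whether NAME can be found on PATH.
# Hypothetical helper for illustration; not part of PCM.
require_tool() {
    local tool="$1"
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found: $(command -v "$tool")"
        return 0
    fi
    echo "missing: $tool (copy it to /usr/local/bin/ or run 'make install')" >&2
    return 1
}

# Check for the utility the power-mode scripts depend on.
require_tool pcm-tpmi || echo "pcm-tpmi is not installed yet"
```

If the check reports the tool as missing, revisit the placement steps above before setting the mode.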

### Setting Latency Optimized Mode

Linux/FreeBSD/UNIX:
```
bash bhs-power-mode.sh --latency-optimized-mode
```
Windows:
```
.\bhs-power-mode.ps1 --latency-optimized-mode
```

### Restoring the Default Mode

Linux/FreeBSD/UNIX:
```
bash bhs-power-mode.sh --default
```

Windows:
```
.\bhs-power-mode.ps1 --default
```
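
When Latency Optimized Mode is needed only for a single run, the commands above can be wrapped so that the default mode is restored afterwards. This is a minimal sketch, assuming `bhs-power-mode.sh` is in the current directory; `run_latency_optimized` and the `MODE_CMD` override are hypothetical names introduced here for illustration, not part of PCM.

```bash
#!/usr/bin/env bash
# MODE_CMD is an assumed override hook so the sketch can be tried
# without touching real hardware; by default it calls the PCM script.
MODE_CMD="${MODE_CMD:-bash bhs-power-mode.sh}"

# run_latency_optimized CMD [ARGS...]
# Hypothetical wrapper: switch mode, run the workload, restore the default.
run_latency_optimized() {
    # Switch to Latency Optimized Mode; give up if the switch fails.
    $MODE_CMD --latency-optimized-mode || return 1

    # Run the workload and remember its exit status.
    local status=0
    "$@" || status=$?

    # Restore the default mode regardless of how the workload ended.
    $MODE_CMD --default

    return "$status"
}
```

A call such as `run_latency_optimized ./my_benchmark` (where `./my_benchmark` is a placeholder for your own workload) returns the workload's exit status, so it can be used in scripts that check for failure.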


132 changes: 132 additions & 0 deletions scripts/bhs-power-mode.ps1
@@ -0,0 +1,132 @@
Write-Output "Intel(r) Performance Counter Monitor"
Write-Output "Birch Stream Power Mode Utility"
Write-Output ""

Write-Output " Options:"
Write-Output " --default : set default power mode"
Write-Output " --latency-optimized-mode : set latency optimized mode"
Write-Output ""

# Run the pcm-tpmi command to determine I/O and compute dies
$output = pcm-tpmi 2 0x10 -d -b 26:26

# Parse the output to build lists of I/O and compute dies
$io_dies = @()
$compute_dies = @()
$die_types = @{}

$output -split "`n" | ForEach-Object {
    $line = $_
    if ($line -match "instance 0") {
        # Capture the die (entry) number; a successful -match fills $matches.
        # (Piping -match to Out-Null inside the assignment would leave $die empty.)
        if ($line -match 'entry (\d+)') {
            $die = $matches[1]
            if ($line -match "value 1") {
                $die_types[$die] = "IO"
                $io_dies += $die
            } elseif ($line -match "value 0") {
                $die_types[$die] = "Compute"
                $compute_dies += $die
            }
        }
    }
}

if ($args[0] -eq "--default") {
    Write-Output "Setting default mode..."

    foreach ($die in $io_dies) {
        # EFFICIENCY_LATENCY_CTRL_RATIO (Uncore IO)
        pcm-tpmi 2 0x18 -d -e $die -b 28:22 -w 8

        # EFFICIENCY_LATENCY_CTRL_LOW_THRESHOLD (Uncore IO)
        pcm-tpmi 2 0x18 -d -e $die -b 38:32 -w 13

        # EFFICIENCY_LATENCY_CTRL_HIGH_THRESHOLD (Uncore IO)
        pcm-tpmi 2 0x18 -d -e $die -b 46:40 -w 120

        # EFFICIENCY_LATENCY_CTRL_HIGH_THRESHOLD_ENABLE (Uncore IO)
        pcm-tpmi 2 0x18 -d -e $die -b 39:39 -w 1
    }

    foreach ($die in $compute_dies) {
        # EFFICIENCY_LATENCY_CTRL_RATIO (Uncore Compute)
        pcm-tpmi 2 0x18 -d -e $die -b 28:22 -w 12
    }
}

if ($args[0] -eq "--latency-optimized-mode") {
    Write-Output "Setting latency optimized mode..."

    foreach ($die in $io_dies) {
        # EFFICIENCY_LATENCY_CTRL_RATIO (Uncore IO)
        pcm-tpmi 2 0x18 -d -e $die -b 28:22 -w 0

        # EFFICIENCY_LATENCY_CTRL_LOW_THRESHOLD (Uncore IO)
        pcm-tpmi 2 0x18 -d -e $die -b 38:32 -w 0

        # EFFICIENCY_LATENCY_CTRL_HIGH_THRESHOLD (Uncore IO)
        pcm-tpmi 2 0x18 -d -e $die -b 46:40 -w 0

        # EFFICIENCY_LATENCY_CTRL_HIGH_THRESHOLD_ENABLE (Uncore IO)
        pcm-tpmi 2 0x18 -d -e $die -b 39:39 -w 1
    }

    foreach ($die in $compute_dies) {
        # EFFICIENCY_LATENCY_CTRL_RATIO (Uncore Compute)
        pcm-tpmi 2 0x18 -d -e $die -b 28:22 -w 0
    }
}

Write-Output "Dumping TPMI Power control register states..."
Write-Output ""

# Function to extract and calculate metrics from the value
function ExtractAndPrintMetrics {
    param (
        [long]$value,      # 64-bit: the ELC fields extend up to bit 46, beyond a 32-bit [int]
        [int]$socket_id,
        [string]$die       # string, to match the keys stored in $die_types
    )

    $die_type = $die_types[$die]

    # Extract bits and calculate metrics
    $min_ratio = ($value -shr 15) -band 0x7F
    $max_ratio = ($value -shr 8) -band 0x7F
    $eff_latency_ctrl_ratio = ($value -shr 22) -band 0x7F
    $eff_latency_ctrl_low_threshold = ($value -shr 32) -band 0x7F
    $eff_latency_ctrl_high_threshold = ($value -shr 40) -band 0x7F
    $eff_latency_ctrl_high_threshold_enable = ($value -shr 39) -band 0x1

    # Convert ratios to MHz (units of 100 MHz) and thresholds to a percentage of 127
    $min_ratio = $min_ratio * 100
    $max_ratio = $max_ratio * 100
    $eff_latency_ctrl_ratio = $eff_latency_ctrl_ratio * 100
    $eff_latency_ctrl_low_threshold = ($eff_latency_ctrl_low_threshold * 100) / 127
    $eff_latency_ctrl_high_threshold = ($eff_latency_ctrl_high_threshold * 100) / 127

    # Print metrics; the threshold fields are reported for I/O dies only
    Write-Output "Socket ID: $socket_id, Die: $die, Type: $die_type"
    Write-Output "MIN_RATIO: $min_ratio MHz"
    Write-Output "MAX_RATIO: $max_ratio MHz"
    Write-Output "EFFICIENCY_LATENCY_CTRL_RATIO: $eff_latency_ctrl_ratio MHz"
    if ($die_type -eq "IO") {
        Write-Output "EFFICIENCY_LATENCY_CTRL_LOW_THRESHOLD: $eff_latency_ctrl_low_threshold%"
        Write-Output "EFFICIENCY_LATENCY_CTRL_HIGH_THRESHOLD: $eff_latency_ctrl_high_threshold%"
        Write-Output "EFFICIENCY_LATENCY_CTRL_HIGH_THRESHOLD_ENABLE: $eff_latency_ctrl_high_threshold_enable"
    }
    Write-Output ""
}

# Iterate over all dies and run pcm-tpmi for each to get the metrics
foreach ($die in $die_types.Keys) {
    $output = pcm-tpmi 2 0x18 -d -e $die

    # Parse the output and extract metrics for each socket
    $output -split "`n" | ForEach-Object {
        $line = $_
        if ($line -match "Read value") {
            # Read each capture immediately: every -match overwrites $matches
            if ($line -match 'value (\d+)') { $value = $matches[1] }
            if ($line -match 'instance (\d+)') { $socket_id = $matches[1] }
            ExtractAndPrintMetrics -value $value -socket_id $socket_id -die $die
        }
    }
}
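
The bit-field arithmetic in `ExtractAndPrintMetrics` can be checked by hand with a short worked example. The sketch below is in bash rather than PowerShell. It builds a register image from the default I/O-die values that the script writes (ratio 8, low threshold 13, high threshold 120, enable 1) and decodes it again using the same bit ranges. The `field` helper is a hypothetical name introduced here, and the register image is constructed locally rather than read from hardware.

```bash
#!/usr/bin/env bash
# field VALUE MSB LSB: extract bits MSB..LSB (inclusive) from a 64-bit value.
# Hypothetical helper for illustration; not part of PCM.
field() {
    local value=$1 msb=$2 lsb=$3
    local width=$(( msb - lsb + 1 ))
    echo $(( (value >> lsb) & ((1 << width) - 1) ))
}

# Register image assembled from the script's default I/O-die settings:
#   bits 28:22 ratio = 8, bits 38:32 low threshold = 13,
#   bit 39 high-threshold enable = 1, bits 46:40 high threshold = 120.
reg=$(( (8 << 22) | (13 << 32) | (1 << 39) | (120 << 40) ))

# Ratios are in units of 100 MHz; thresholds are fractions of 127.
ratio_mhz=$(( $(field "$reg" 28 22) * 100 ))
low_pct=$(( $(field "$reg" 38 32) * 100 / 127 ))
high_pct=$(( $(field "$reg" 46 40) * 100 / 127 ))
enable=$(field "$reg" 39 39)

echo "ratio=${ratio_mhz}MHz low=${low_pct}% high=${high_pct}% enable=${enable}"
# prints: ratio=800MHz low=10% high=94% enable=1
```

Shell arithmetic truncates to integers, so the thresholds print as 10% and 94%. The PowerShell script uses `/`, which keeps the fractional part (about 10.24% and 94.49%).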