-
Notifications
You must be signed in to change notification settings - Fork 97
RDPMC
RDPMC — Read Performance-Monitoring Counters
Opcode* | Instruction | Op/ En | 64-Bit Mode | Compat/ Leg Mode | Description |
0F 33 | RDPMC | ZO | Valid | Valid | Read performance-monitoring counter specified by ECX into EDX:EAX. |
Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4 |
ZO | NA | NA | NA | NA |
The EAX register is loaded with the low-order 32 bits. The EDX register is loaded with the supported high-order bits of the counter. The number of high-order bits loaded into EDX is implementation specific on processors that do no support architectural performance monitoring. The width of fixed-function and general-purpose performance coun- ters on processors supporting architectural performance monitoring are reported by CPUID 0AH leaf. See below for the treatment of the EDX register for “fast” reads.
The ECX register specifies the counter type (if the processor supports architectural performance monitoring) and counter index. Counter type is specified in ECX[30] to select one of two type of performance counters. If the processor does not support architectural performance monitoring, ECX[30:0] specifies the counter index; other- wise ECX[29:0] specifies the index relative to the base of each counter type. ECX[31] selects “fast” read mode if supported. The two counter types are: • General-purpose or special-purpose performance counters are specified with ECX[30] = 0: The number of
general-purpose performance counters on processor supporting architectural performance monitoring are reported by CPUID 0AH leaf. The number of general-purpose counters is model specific if the processor does not support architectural performance monitoring, see Chapter 18, “Performance Monitoring” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B. Special-purpose counters are available only in selected processor members, see Table 4-16. Fixed-function performance counters are specified with ECX[30] = 1. The number fixed-function performance •
counters is enumerated by CPUID 0AH leaf. See Chapter 18, “Performance Monitoring” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B. This counter type is selected if ECX[30] is set.
The width of fixed-function performance counters and general-purpose performance counters on processor supporting architectural performance monitoring are reported by CPUID 0AH leaf. The width of general-purpose performance counters are 40-bits for processors that do not support architectural performance monitoring coun- ters. The width of special-purpose performance counters are implementation specific.
Table 4-16 lists valid indices of the general-purpose and special-purpose performance counters according to the DisplayFamily_DisplayModel values of CPUID encoding for each processor family (see CPUID instruction in Chapter 3, “Instruction Set Reference, A-L” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A).
Table 4-16. Valid General and Special Purpose Performance Counter Index Range for RDPMC
Processor Family | DisplayFamily_DisplayModel/ Other Signatures | Valid PMC Index Range | General-purpose Counters |
P6 | 06H_01H, 06H_03H, 06H_05H, 06H_06H, 06H_07H, 06H_08H, 06H_0AH, 06H_0BH | 0, 1 | 0, 1 |
Processors Based on Intel NetBurst microarchitecture (No L3) | 0FH_00H, 0FH_01H, 0FH_02H, 0FH_03H, 0FH_04H, 0FH_06H | ≥ 0 and ≤ 17 | ≥ 0 and ≤ 17 |
Pentium M processors | 06H_09H, 06H_0DH | 0, 1 | 0, 1 |
Processors Based on Intel NetBurst microarchitecture (No L3) | 0FH_03H, 0FH_04H) and (L3 is present) | ≥ 0 and ≤ 25 | ≥ 0 and ≤ 17 |
Table 4-16. Valid General and Special Purpose Performance Counter Index Range for RDPMC (Contd.)
Processor Family | DisplayFamily_DisplayModel/ Other Signatures | Valid PMC Index Range | General-purpose Counters |
Intel® Core™ Solo and Intel® Core™ Duo processors, Dual-core Intel® Xeon® processor LV | 06H_0EH | 0, 1 | 0, 1 |
Intel® Core™2 Duo processor, Intel Xeon processor 3000, 5100, 5300, 7300 Series - general-purpose PMC | 06H_0FH | 0, 1 | 0, 1 |
Intel® Core™2 Duo processor family, Intel Xeon processor 3100, 3300, 5200, 5400 series - general-purpose PMC | 06H_17H | 0, 1 | 0, 1 |
Intel Xeon processors 7400 series | (06H_1DH) | ≥ 0 and ≤ 9 | 0, 1 |
45 nm and 32 nm Intel® Atom™ processors | 06H_1CH, 06_26H, 06_27H, 06_35H, 06_36H | 0, 1 | 0, 1 |
Intel® Atom™ processors based on Silvermont or Airmont microarchitectures | 06H_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_4CH | 0, 1 | 0, 1 |
Next Generation Intel® Atom™ processors based on Goldmont microarchitecture | 06H_5CH, 06_5FH | 0-3 | 0-3 |
Intel® processors based on the Nehalem, Westmere microarchitectures | 06H_1AH, 06H_1EH, 06H_1FH, 06_25H, 06_2CH, 06H_2EH, 06_2FH | 0-3 | 0-3 |
Intel® processors based on the Sandy Bridge, Ivy Bridge microarchitecture | 06H_2AH, 06H_2DH, 06H_3AH, 06H_3EH | 0-3 (0-7 if HyperThreading is off) | 0-3 (0-7 if HyperThreading is off) |
Intel® processors based on the Haswell, Broadwell, SkyLake microarchitectures | 06H_3CH, 06H_45H, 06H_46H, 06H_3FH, 06_3DH, 06_47H, 4FH, 06_56H, 06_4EH, 06_5EH | 0-3 (0-7 if HyperThreading is off) | 0-3 (0-7 if HyperThreading is off) |
Processors based on Intel NetBurst microarchitecture support “fast” (32-bit) and “slow” (40-bit) reads on the first 18 performance counters. Selected this option using ECX[31]. If bit 31 is set, RDPMC reads only the low 32 bits of the selected performance counter. If bit 31 is clear, all 40 bits are read. A 32-bit result is returned in EAX and EDX is set to 0. A 32-bit read executes faster on these processors than a full 40-bit read.
On processors based on Intel NetBurst microarchitecture with L3, performance counters with indices 18-25 are 32- bit counters. EDX is cleared after executing RDPMC for these counters.
In Intel Core 2 processor family, Intel Xeon processor 3000, 5100, 5300 and 7400 series, the fixed-function perfor- mance counters are 40-bits wide; they can be accessed by RDMPC with ECX between from 4000_0000H and 4000_0002H.
On Intel Xeon processor 7400 series, there are eight 32-bit special-purpose counters addressable with indices 2-9, ECX[30]=0.
When in protected or virtual 8086 mode, the performance-monitoring counters enabled (PCE) flag in register CR4 restricts the use of the RDPMC instruction as follows. When the PCE flag is set, the RDPMC instruction can be executed at any privilege level; when the flag is clear, the instruction can only be executed at privilege level 0. (When in real-address mode, the RDPMC instruction is always enabled.)
The performance-monitoring counters can also be read with the RDMSR instruction, when executing at privilege level 0.
The performance-monitoring counters are event counters that can be programmed to count events such as the number of instructions decoded, number of interrupts received, or number of cache loads. Chapter 19, “Perfor- mance Monitoring Events,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B, lists the events that can be counted for various processors in the Intel 64 and IA-32 architecture families.
The RDPMC instruction is not a serializing instruction; that is, it does not imply that all the events caused by the preceding instructions have been completed or that events caused by subsequent instructions have not begun. If an exact event count is desired, software must insert a serializing instruction (such as the CPUID instruction) before and/or after the RDPMC instruction.
Performing back-to-back fast reads are not guaranteed to be monotonic. To guarantee monotonicity on back-to- back reads, a serializing instruction must be placed between the two RDPMC instructions.
The RDPMC instruction can execute in 16-bit addressing mode or virtual-8086 mode; however, the full contents of the ECX register are used to select the counter, and the event count is stored in the full EAX and EDX registers. The RDPMC instruction was introduced into the IA-32 Architecture in the Pentium Pro processor and the Pentium processor with MMX technology. The earlier Pentium processors have performance-monitoring counters, but they must be read with the RDMSR instruction.
(* Intel processors that support architectural performance monitoring *)
Most significant counter bit (MSCB) = 47
IF ((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0))
THEN IF (ECX[30] = 1 and ECX[29:0] in valid fixed-counter range)
EAX ← IA32_FIXED_CTR(ECX)[30:0];
EDX ← IA32_FIXED_CTR(ECX)[MSCB:32];
ELSE IF (ECX[30] = 0 and ECX[29:0] in valid general-purpose counter range)
EAX ← PMC(ECX[30:0])[31:0];
EDX ← PMC(ECX[30:0])[MSCB:32];
ELSE (* ECX is not valid or CR4.PCE is 0 and CPL is 1, 2, or 3 and CR0.PE is 1 *)
#GP(0);
FI;
(* Intel Core 2 Duo processor family and Intel Xeon processor 3000, 5100, 5300, 7400 series*)
Most significant counter bit (MSCB) = 39
IF ((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0))
THEN IF (ECX[30] = 1 and ECX[29:0] in valid fixed-counter range)
EAX ← IA32_FIXED_CTR(ECX)[30:0];
EDX ← IA32_FIXED_CTR(ECX)[MSCB:32];
ELSE IF (ECX[30] = 0 and ECX[29:0] in valid general-purpose counter range)
EAX ← PMC(ECX[30:0])[31:0];
EDX ← PMC(ECX[30:0])[MSCB:32];
ELSE IF (ECX[30] = 0 and ECX[29:0] in valid special-purpose counter range)
EAX ← PMC(ECX[30:0])[31:0]; (* 32-bit read *)
ELSE (* ECX is not valid or CR4.PCE is 0 and CPL is 1, 2, or 3 and CR0.PE is 1 *)
#GP(0);
FI;
(* P6 family processors and Pentium processor with MMX technology *)
IF (ECX = 0 or 1) and ((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0))
THEN
EAX ← PMC(ECX)[31:0];
EDX ← PMC(ECX)[39:32];
ELSE (* ECX is not 0 or 1 or CR4.PCE is 0 and CPL is 1, 2, or 3 and CR0.PE is 1 *)
#GP(0);
FI;
(* Processors based on Intel NetBurst microarchitecture *)
IF ((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0))
THEN IF (ECX[30:0] = 0:17)
THEN IF ECX[31] = 0
THEN
EAX ← PMC(ECX[30:0])[31:0]; (* 40-bit read *)
EDX ← PMC(ECX[30:0])[39:32];
ELSE (* ECX[31] = 1*)
THEN
EAX ← PMC(ECX[30:0])[31:0]; (* 32-bit read *)
EDX ← 0;
FI;
ELSE IF (*64-bit Intel processor based on Intel NetBurst microarchitecture with L3 *)
THEN IF (ECX[30:0] = 18:25 )
EAX ← PMC(ECX[30:0])[31:0]; (* 32-bit read *)
EDX ← 0;
FI;
ELSE (* Invalid PMC index in ECX[30:0], see Table 4-19. *)
GP(0);
FI;
ELSE (* CR4.PCE = 0 and (CPL = 1, 2, or 3) and CR0.PE = 1 *)
#GP(0);
FI;
None.
\#GP(0) If the current privilege level is not 0 and the PCE flag in the CR4 register is clear. If an invalid performance counter index is specified (see Table 4-16).
\#UD If the LOCK prefix is used.
\#GP If an invalid performance counter index is specified (see Table 4-16).
\#UD If the LOCK prefix is used.
\#GP(0) If the PCE flag in the CR4 register is clear. If an invalid performance counter index is specified (see Table 4-16).
\#UD If the LOCK prefix is used.
Same exceptions as in protected mode.
\#GP(0) If the current privilege level is not 0 and the PCE flag in the CR4 register is clear. If an invalid performance counter index is specified (see Table 4-16).
\#UD If the LOCK prefix is used.
Source: Intel Architecture Software Developer's Manual (July 2017)
Generated at: 08/14/17 13:51:18