From 7cec7d8dd2a50af43ed42a80f410dd44e740d026 Mon Sep 17 00:00:00 2001
From: Beeman Strong <97133824+bcstrongx@users.noreply.github.com>
Date: Tue, 11 Jun 2024 16:26:19 -0700
Subject: [PATCH] Update charter.adoc

Incorporate feedback from the first Performance Event Sampling TG meeting

Signed-off-by: Beeman Strong <97133824+bcstrongx@users.noreply.github.com>
---
 charter.adoc | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/charter.adoc b/charter.adoc
index 8af25f7..a64d7ea 100644
--- a/charter.adoc
+++ b/charter.adoc
@@ -1,12 +1,12 @@
-= Preliminary Performance Sampling TG Charter
+= Preliminary Performance Event Sampling TG Charter
 
-RISC-V hardware performance monitoring counters (Zihpm) provide support for counting performance events, and, with Sscofpmf, support for basic, interrupt-based performance sampling. However, on most implementations sampling interrupts will skid, such that the resulting trap is taken some number of cycles and/or instructions after the instruction that caused the overflow retires. As a result the PC collected by the profiler will rarely match that of the causal instruction, since the PC will typically advance during the skid period. Other state that a profiler may want to collect (registers, call-stack, counter values, etc) is likely to be overwritten or modified as well.
+RISC-V hardware performance monitoring counters (Zihpm) provide support for counting performance events, and, with Sscofpmf, support for basic, interrupt-based performance event sampling. However, on most implementations sampling interrupts will skid, such that the resulting trap is taken some number of cycles and/or instructions after the instruction that caused the overflow retires. As a result the PC collected by the profiler will rarely match that of the causal instruction, since the PC will typically advance during the skid period. Other state that a profiler may want to collect (registers, call-stack, counter values, etc) is likely to be overwritten or modified as well.
 
-The Performance Sampling TG aims to address these limitations by defining two new ISA extensions:
+The Performance Event Sampling TG aims to address these limitations by defining two new ISA extensions:
 
-* An extension that enables precise attribution of samples based on select (non-speculative) events to the instruction that caused the counter overflow, despite implementations where the associated sampling interrupt may skid. This will provide more directly actionable information to the user, by precisely identifying the instructions that are most often experiencing performance events.
-* An extension that enables sampling of instructions and/or uops, selected at dispatch, with collection of runtime event occurrences and latencies incurred by the instruction/uop. Such samples can be filtered based on instruction/uop type, events incurred, or latencies observed, allowing the user to focus on samples of interest. Further, associated sampling interrupts can be skidless, allowing the user to collect additional sample state (call-stack, register values) reliably.
+* An extension that enables precise attribution of samples based on select events (e.g., instruction/uop retirement events) to the instruction that caused the counter overflow, despite implementations where the associated sampling interrupt may skid. This will provide more directly actionable information to the user, by precisely identifying the instructions that are most often experiencing performance events.
+* An extension that enables sampling of instructions and/or uops, with collection of runtime event occurrences and latencies incurred by the instruction/uop. Such samples can be filtered based on instruction/uop type, events incurred, or latencies observed, allowing the user to focus on samples of interest. Further, associated sampling interrupts can be skidless, allowing the user to collect additional sample state (call-stack, register values) reliably.
 
-Each extension will be crafted to be implementation-friendly even for high-performance, out-of-order microarchitectures, aiming to require no additional performance overhead beyond that resulting from the handling of sampling interrupts.
+Each extension will be crafted to be implementation-friendly even for high-performance, out-of-order microarchitectures, aiming to require no additional performance overhead beyond that resulting from the handling of sampling interrupts. The extensions will be compatible with the H extension, and supoprt RISC-V security objectives.
 
-The TG will prototype the new extensions in Spike or Qemu, along with Linux perf support to demonstrate the end-to-end solution.
+The TG will prototype support for the new extensions in Qemu and Linux perf, to demonstrate the usability of the ISA for kernels and tools.