[L0] Enable Counter Based Events by default for immediate command lists #1897
nrspruit commented Jul 25, 2024
- Remove the PVC limitation for enabling counter based events.
Compute Benchmarks level_zero run (with params: ):
Apologies, but all things requiring SYCL (like perf or e2e tests) are broken, waiting for this to merge.
… lists -pre-commit PR for oneapi-src/unified-runtime#1897 Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
Compute Benchmarks level_zero run (with params: ):
Summary
Benchmark Results

api_overhead_benchmark_sycl, mean execution time per 10 kernels

| Configuration | This PR | Baseline |
| --- | --- | --- |
| SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0) Imm-CmdLists-OFF | 20.838 μs | 22.705 μs |
| SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0) Imm-CmdLists-OFF | 23.536 μs | 23.606 μs |
| SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0) | 26.697 μs | 23.62 μs |
| SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0) | 26.543 μs | 25.476 μs |

Velocity-Bench

| Benchmark | Configuration | This PR | Baseline |
| --- | --- | --- | --- |
| Hashtable | hashtable Imm-CmdLists-OFF | 356.39765 M keys/sec | 306.262877 M keys/sec |
| Hashtable | hashtable | 358.792274 M keys/sec | 360.15055 M keys/sec |
| Bitcracker | bitcracker Imm-CmdLists-OFF | 35.5768 s | 39.0378 s |
| Bitcracker | bitcracker | 35.6101 s | 35.6105 s |
| Easywave | easywave Imm-CmdLists-OFF | 478 ms | 606.0 ms |
| Easywave | easywave | 239 ms | 241.0 ms |
| QuickSilver | QuickSilver | 117.74 MMS/CTT | 110.88 MMS/CTT |
| Sobel Filter | sobel_filter Imm-CmdLists-OFF | 559.986 ms | 609.227 ms |
| Sobel Filter | sobel_filter | 556.645 ms | 548.773 ms |
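To make the comparison easier to read, the relative change between "This PR" and "baseline" can be computed from the figures reported above. The following is a minimal sketch, not part of the benchmark tooling: the names `RESULTS`, `relative_change`, `is_improvement`, and `summarize` are hypothetical, and the "higher is better" flags for the hashtable (M keys/sec) and QuickSilver (MMS/CTT) metrics are assumptions based on their units.

```python
# Values copied from the benchmark results above.
# Each entry: (this_pr, baseline, higher_is_better).
# The higher_is_better flags are an assumption inferred from the units.
RESULTS = {
    "SubmitKernel Ioq=0 Imm-CmdLists-OFF (us)": (20.838, 22.705, False),
    "SubmitKernel Ioq=1 Imm-CmdLists-OFF (us)": (23.536, 23.606, False),
    "SubmitKernel Ioq=0 (us)": (26.697, 23.62, False),
    "SubmitKernel Ioq=1 (us)": (26.543, 25.476, False),
    "hashtable Imm-CmdLists-OFF (M keys/sec)": (356.39765, 306.262877, True),
    "hashtable (M keys/sec)": (358.792274, 360.15055, True),
    "bitcracker Imm-CmdLists-OFF (s)": (35.5768, 39.0378, False),
    "bitcracker (s)": (35.6101, 35.6105, False),
    "easywave Imm-CmdLists-OFF (ms)": (478.0, 606.0, False),
    "easywave (ms)": (239.0, 241.0, False),
    "QuickSilver (MMS/CTT)": (117.74, 110.88, True),
    "sobel_filter Imm-CmdLists-OFF (ms)": (559.986, 609.227, False),
    "sobel_filter (ms)": (556.645, 548.773, False),
}


def relative_change(this_pr, baseline):
    """Signed change of this_pr relative to baseline, in percent."""
    return (this_pr - baseline) / baseline * 100.0


def is_improvement(this_pr, baseline, higher_is_better):
    """True when this_pr beats baseline in the metric's preferred direction."""
    return this_pr > baseline if higher_is_better else this_pr < baseline


def summarize(results):
    """Return (name, percent_change, improved) tuples for every entry."""
    return [
        (name, relative_change(pr, base), is_improvement(pr, base, hib))
        for name, (pr, base, hib) in results.items()
    ]


if __name__ == "__main__":
    for name, pct, improved in summarize(RESULTS):
        verdict = "better" if improved else "worse"
        print(f"{name}: {pct:+.2f}% ({verdict})")
```

These are single-run numbers, so small differences (for example bitcracker with immediate command lists enabled) may be within run-to-run noise.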
Details

SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0) Imm-CmdLists-OFF
Environment Variables: UR_L0_USE_IMMEDIATE_COMMANDLISTS=0
Command: /home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0) Imm-CmdLists-OFF
Environment Variables: UR_L0_USE_IMMEDIATE_COMMANDLISTS=0
Command: /home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)
Environment Variables: UR_L0_USE_IMMEDIATE_COMMANDLISTS=1
Command: /home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)
Environment Variables: UR_L0_USE_IMMEDIATE_COMMANDLISTS=1
Command: /home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

hashtable Imm-CmdLists-OFF
Environment Variables: UR_L0_USE_IMMEDIATE_COMMANDLISTS=0
Command: /home/test-user/bench_workdir/hashtable/hashtable_sycl --no-verify
Output: hashtable - total time for whole calculation: 0.376595 s

hashtable
Environment Variables: UR_L0_USE_IMMEDIATE_COMMANDLISTS=1
Command: /home/test-user/bench_workdir/hashtable/hashtable_sycl --no-verify
Output: hashtable - total time for whole calculation: 0.374082 s

bitcracker Imm-CmdLists-OFF
Environment Variables: UR_L0_USE_IMMEDIATE_COMMANDLISTS=0
Command: /home/test-user/bench_workdir/bitcracker/bitcracker -f /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt -d /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt -b 60000
Output: ---------> BitCracker: BitLocker password cracking tool <--------- ==================================
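Each benchmark above is run twice, once with `UR_L0_USE_IMMEDIATE_COMMANDLISTS=0` (the "Imm-CmdLists-OFF" variant) and once with it set to `1`. A minimal sketch of reproducing one SubmitKernel pair locally might look like the following. The helper names `build_env`, `submit_kernel_command`, and `run_submit_kernel` are hypothetical and are not the benchmark scripts used by the bot; the flags and the environment variable are copied from the details above, and the binary path must point to a local build.

```python
import os
import subprocess


def build_env(immediate_cmdlists, base_env=None):
    """Copy the environment and set the immediate command list toggle."""
    env = dict(os.environ if base_env is None else base_env)
    env["UR_L0_USE_IMMEDIATE_COMMANDLISTS"] = "1" if immediate_cmdlists else "0"
    return env


def submit_kernel_command(binary, ioq):
    """Build the SubmitKernel command line shown in the details above."""
    return [
        binary,
        "--test=SubmitKernel",
        f"--Ioq={ioq}",
        "--DiscardEvents=0",
        "--MeasureCompletion=0",
        "--iterations=100000",
        "--Profiling=0",
        "--NumKernels=10",
        "--KernelExecTime=1",
        "--csv",
        "--noHeaders",
    ]


def run_submit_kernel(binary, ioq, immediate_cmdlists):
    """Run the benchmark once and return its raw CSV output as text."""
    result = subprocess.run(
        submit_kernel_command(binary, ioq),
        env=build_env(immediate_cmdlists),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical local path; replace with the location of your own build.
    binary = "./api_overhead_benchmark_sycl"
    for immediate in (False, True):
        print(f"immediate command lists = {immediate}")
        print(run_submit_kernel(binary, ioq=0, immediate_cmdlists=immediate))
```

Passing a copy of the environment to `subprocess.run` keeps the toggle local to the child process, so the two variants cannot leak settings into each other.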
- Remove the PVC limitation for enabling counter based events. Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
3d7c787 to 5b1d9b5