sycl-bench added & ascii bar chart #2115

Merged: 25 commits, Sep 30, 2024


mateuszpn
Contributor

No description provided.


Compute Benchmarks level_zero run (with params: ):
https://github.com/oneapi-src/unified-runtime/actions/runs/10993876288

This comment was marked as outdated.


Compute Benchmarks level_zero run (with params: ):
https://github.com/oneapi-src/unified-runtime/actions/runs/10994798635

This comment was marked as outdated.

github-actions bot added the ci/cd (Continuous integration/delivery) label on Sep 23, 2024

Compute Benchmarks level_zero run (with params: ):
https://github.com/oneapi-src/unified-runtime/actions/runs/11014516706


Compute Benchmarks level_zero run ():
https://github.com/oneapi-src/unified-runtime/actions/runs/11014516706
Job status: success. Test status: success.

Summary

result is better

Performance change in benchmark groups

"Relative perf in group api: 81.379%"

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024 | 4.594 | 4.657 | 101.37% | 1.37% | . |
| api_overhead_benchmark_ur SubmitKernel in order | 28.693 | 28.869 | 100.61% | 0.61% | . |
| api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024 | 3.532 | 3.526 | 99.83% | -0.17% | . |
| api_overhead_benchmark_sycl SubmitKernel out of order | 48.233 | 47.825 | 99.15% | -0.85% | . |
| api_overhead_benchmark_ur SubmitKernel out of order | 31.633 | 18.474 | 58.40% | -41.60% | -------- |
| api_overhead_benchmark_sycl SubmitKernel in order | 48.107 | 23.699 | 49.26% | -50.74% | ---------- |
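
The per-row "Relative perf" values and the group figure in the api table above can be reproduced with a short sketch. It assumes that, for these time-based rows, relative perf is baseline divided by this PR, and that the group figure is the geometric mean of the row ratios. Both assumptions are inferred from the numbers shown, not confirmed by the benchmark scripts, and the function names are illustrative only.

```python
import math

# (this_pr, baseline) pairs copied from the api group rows above.
# Assumption: all six rows are times, so lower is better.
api_rows = [
    (4.594, 4.657),
    (28.693, 28.869),
    (3.532, 3.526),
    (48.233, 47.825),
    (31.633, 18.474),
    (48.107, 23.699),
]


def relative_perf(this_pr: float, baseline: float) -> float:
    """Ratio above 1.0 means this PR is faster on a lower-is-better metric."""
    return baseline / this_pr


def geometric_mean(ratios: list[float]) -> float:
    """Geometric mean computed through logs to avoid a long running product."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


ratios = [relative_perf(pr, base) for pr, base in api_rows]
group = geometric_mean(ratios)

for (pr, base), ratio in zip(api_rows, ratios):
    print(f"{pr:>8} {base:>8} {ratio:8.2%} {ratio - 1:+8.2%}")
print(f"group: {group:.3%}")
```

Under these assumptions the two SubmitKernel rows with large negative changes dominate the group figure, which is why the api group lands near 81% even though four of the six rows are within about 1.4% of the baseline.
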
"Relative perf in group memory: 97.833%"

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024 | 9.776 | 9.909 | 101.36% | 1.36% | . |
| memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240 | 1.799 | 1.804 | 100.28% | 0.28% | . |
| memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024 | 467.233 | 466.567 | 99.86% | -0.14% | . |
| memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024 | 278.122 | 251.026 | 90.26% | -9.74% | -- |
"Relative perf in group miscellaneous: 100.131%"

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| miscellaneous_benchmark_sycl VectorSum | 861.522 | 862.654 | 100.13% | 0.13% | . |
"Relative perf in group Velocity-Bench: 102.832%"

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| Velocity-Bench CudaSift | 244.437 | 289.313 | 118.36% | 18.36% | ++++ |
| Velocity-Bench QuickSilver | 91.57 | 89.65 | 102.14% | 2.14% | . |
| Velocity-Bench Easywave | 452 | 457.0 | 101.11% | 1.11% | . |
| Velocity-Bench Bitcracker | 35.6364 | 35.7716 | 100.38% | 0.38% | . |
| Velocity-Bench Hashtable | 203.485247 | 204.618721 | 99.45% | -0.55% | . |
| Velocity-Bench Sobel Filter | 1000.38 | 969.446 | 96.91% | -3.09% | - |
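
The last column of these tables is the ASCII bar from the PR title. The following is a minimal sketch of one rule that reproduces every bar shown in the rows above: one character per 5 percentage points of change, rounded to the nearest whole character, with `.` when the bar would be empty. The 5-point scale, the rounding, and the function name `ascii_bar` are assumptions inferred from the displayed rows, not taken from the PR's code.

```python
def ascii_bar(change_pct: float, pct_per_char: float = 5.0) -> str:
    """Render a signed percentage change as a run of '+' or '-' characters.

    Assumed scale: one character per `pct_per_char` percentage points,
    rounded half up; changes too small for one character render as '.'.
    """
    length = int(abs(change_pct) / pct_per_char + 0.5)
    if length == 0:
        return "."
    return ("+" if change_pct > 0 else "-") * length


# Change values copied from the tables above.
for change in (18.36, 2.14, -3.09, -9.74, -41.60, -50.74):
    print(f"{change:+7.2f}% {ascii_bar(change)}")
```

A fixed scale keeps bars comparable across groups, at the cost of hiding anything below roughly 2.5%, which shows up only as `.`.
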
"Relative perf in group Runtime: 100.000%"
Benchmark This PR baseline Relative perf Change -
Runtime_BlockedTransform_iter_64_blocksize_256 0.341 -
Runtime_BlockedTransform_iter_512_blocksize_256 0.081 -
Runtime_BlockedTransform_iter_256_blocksize_256 0.084 -
Runtime_BlockedTransform_iter_128_blocksize_256 0.16 -
Runtime_IndependentDAGTaskThroughput_BasicParallelFor 5.795 -
Runtime_IndependentDAGTaskThroughput_HierarchicalParallelFor 5.576 -
Runtime_IndependentDAGTaskThroughput_NDRangeParallelFor 5.589 -
Runtime_IndependentDAGTaskThroughput_SingleTask 6.476 -
Runtime_DAGTaskThroughput_SingleTask 6.513 -
Runtime_DAGTaskThroughput_BasicParallelFor 5.984 -
Runtime_DAGTaskThroughput_NDRangeParallelFor 4.861 -
Runtime_DAGTaskThroughput_HierarchicalParallelFor 5.252 -
"Relative perf in group MicroBench: 100.000%"
Benchmark This PR baseline Relative perf Change -
MicroBench_LocalMem_int32_4096 0.229 -
MicroBench_LocalMem_fp32_4096 0.2 -
MicroBench_L2_int32_4 0.026 -
MicroBench_L2_int32_1 0.033 -
MicroBench_L2_int32_16 0.026 -
MicroBench_L2_int32_2 0.027 -
MicroBench_L2_fp32_2 0.026 -
MicroBench_L2_int32_8 0.026 -
MicroBench_L2_fp32_4 0.026 -
MicroBench_L2_fp32_8 0.025 -
MicroBench_L2_fp32_16 0.025 -
MicroBench_L2_fp32_1 0.025 -
MicroBench_Arith_int32_512 0.073 -
MicroBench_Arith_fp32_512 0.032 -
MicroBench_sf_fp32_16 0.026 -
"Relative perf in group Pattern: 100.000%"
Benchmark This PR baseline Relative perf Change -
Pattern_Reduction_Hierarchical_int32 0.052 -
Pattern_Reduction_NDRange_fp32 0.026 -
Pattern_Reduction_NDRange_int64 0.052 -
Pattern_Reduction_NDRange_int32 0.075 -
Pattern_Reduction_Hierarchical_int64 0.051 -
Pattern_Reduction_Hierarchical_fp32 0.052 -
Pattern_SegmentedReduction_NDRange_fp32 0.014 -
Pattern_SegmentedReduction_Hierarchical_int32 0.028 -
Pattern_SegmentedReduction_Hierarchical_int64 0.029 -
Pattern_SegmentedReduction_NDRange_int16 0.045 -
Pattern_SegmentedReduction_NDRange_int32 0.026 -
Pattern_SegmentedReduction_Hierarchical_fp32 0.03 -
Pattern_SegmentedReduction_Hierarchical_int16 0.03 -
Pattern_SegmentedReduction_NDRange_int64 0.016 -
"Relative perf in group ScalarProduct: 100.000%"
Benchmark This PR baseline Relative perf Change -
ScalarProduct_NDRange_int32 0.151 -
ScalarProduct_Hierarchical_int32 0.062 -
ScalarProduct_NDRange_fp32 0.04 -
ScalarProduct_Hierarchical_int64 0.063 -
ScalarProduct_NDRange_int64 0.098 -
ScalarProduct_Hierarchical_fp32 0.059 -
"Relative perf in group SYCL2020: 100.000%"
Benchmark This PR baseline Relative perf Change -
SYCL2020_Accessors_Latency_fp32_in_order__ 68.7 -
SYCL2020_Accessors_Latency_fp32_out_of_order__ 70.866 -
"Relative perf in group USM: 100.000%"
Benchmark This PR baseline Relative perf Change -
USM_Latency_fp32_in_order__ 33.709 -
USM_Latency_fp32_out_of_order__ 46.684 -
USM_Allocation_latency_fp32_device 0.008 -
USM_Allocation_latency_fp32_shared 0.119 -
USM_Allocation_latency_fp32_host 0.002 -
USM_Instr_Mix_fp32_host_1:1mix_with_init_no_prefetch 1.762 -
USM_Instr_Mix_fp32_shared_1:1mix_no_init_with_prefetch 15.392 -
USM_Instr_Mix_fp32_shared_1:1mix_with_init_no_prefetch 14.144 -
USM_Instr_Mix_fp32_shared_1:1mix_with_init_with_prefetch 15.307 -
USM_Instr_Mix_fp32_device_1:1mix_with_init_no_prefetch 3.225 -
USM_Instr_Mix_fp32_shared_1:1mix_no_init_no_prefetch 13.72 -
USM_Instr_Mix_fp32_host_1:1mix_no_init_no_prefetch 1.868 -
USM_Instr_Mix_fp32_device_1:1mix_no_init_no_prefetch 3.098 -
USM_Pinned_Overhead_fp32_DeviceHost_NonPinned_Init_1 0.438 -
USM_Pinned_Overhead_fp32_HostDevice_NonPinned_Init_1 0.015 -
USM_Pinned_Overhead_fp32_DeviceHost_Pinned_Init_1 0.019 -
USM_Pinned_Overhead_fp32_HostDevice_Pinned_Init_1 0.011 -
"Relative perf in group VectorAddition: 100.000%"
Benchmark This PR baseline Relative perf Change -
VectorAddition_fp32 0.032 -
VectorAddition_int64 0.04 -
VectorAddition_int32 0.037 -
"Relative perf in group Polybench: 100.000%"
Benchmark This PR baseline Relative perf Change -
Polybench_2DConvolution 0.23 -
Polybench_2mm 1.239 -
Polybench_3mm 1.745 -
Polybench_Atax 6.903 -
Polybench_Bicg 5.123 -
Polybench_Correlation 94.324 -
Polybench_Covariance 94.097 -
Polybench_Gemm 3.965 -
Polybench_Gesummv 7.303 -
Polybench_Gramschmidt 285.066 -
Polybench_Mvt 3.633 -
Polybench_Syr2k 6.32 -
Polybench_Syrk 3.209 -
"Relative perf in group ReductionAtomic: 100.000%"
Benchmark This PR baseline Relative perf Change -
ReductionAtomic_int32 0.042 -
ReductionAtomic_fp32 0.041 -
ReductionAtomic_fp64 0.043 -
ReductionAtomic_int64 0.041 -
"Relative perf in group Kmeans: 100.000%"
Benchmark This PR baseline Relative perf Change -
Kmeans_fp32 1.792 -
"Relative perf in group LinearRegressionCoeff: 100.000%"
Benchmark This PR baseline Relative perf Change -
LinearRegressionCoeff_fp32 1.161 -
"Relative perf in group LinearRegression: 100.000%"
Benchmark This PR baseline Relative perf Change -
LinearRegression_fp32 0.357 -
"Relative perf in group MatmulChain: 100.000%"
Benchmark This PR baseline Relative perf Change -
MatmulChain 11.031 -
"Relative perf in group MolecularDynamics: 100.000%"
Benchmark This PR baseline Relative perf Change -
MolecularDynamics 0.066 -

Details

api_overhead_benchmark_sycl SubmitKernel out of order

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),48.233,47.859,7.11%,45.985,512.959,[CPU],[us]
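
A result row like the one above can be split into named fields with the standard `csv` module. This is a hedged sketch, not the harness's own parser: the helper name `parse_result` and the dictionary keys are hypothetical. Note that the printed header names seven columns while the row carries eight fields, so the sketch treats the trailing field as a separate unit.

```python
import csv
import io

# Output row copied from the benchmark output above.
ROW = (
    "SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 "
    "KernelExecTime=1 MeasureCompletion=0),"
    "48.233,47.859,7.11%,45.985,512.959,[CPU],[us]"
)


def parse_result(row: str) -> dict:
    """Split one --csv --noHeaders result row into typed fields."""
    fields = next(csv.reader(io.StringIO(row)))
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    name, mean, median, stddev, lowest, highest, kind, unit = fields
    return {
        "test_case": name,
        "mean": float(mean),
        "median": float(median),
        "stddev_pct": float(stddev.rstrip("%")),  # "7.11%" -> 7.11
        "min": float(lowest),
        "max": float(highest),
        "type": kind.strip("[]"),  # "[CPU]" -> "CPU"
        "unit": unit,  # kept verbatim, e.g. "[us]" or "bw [GB/s]"
    }


result = parse_result(ROW)
print(result["mean"], result["unit"])
```

The mean (48.233 here) is the value that appears in the "This PR" column of the summary for this benchmark.
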

api_overhead_benchmark_sycl SubmitKernel in order

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),48.107,47.652,7.17%,45.441,509.205,[CPU],[us]

api_overhead_benchmark_ur SubmitKernel out of order

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=ur Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),31.633,31.232,7.50%,29.732,499.360,[CPU],[us]

api_overhead_benchmark_ur SubmitKernel in order

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=ur Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),28.693,28.302,8.66%,26.635,488.900,[CPU],[us]

memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Device --destinationPlacement=Device --size=1024 --count=100

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Device destinationPlacement=Device size=1KB count=100),467.233,463.591,5.87%,449.449,943.777,[CPU],[us]

memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Host --destinationPlacement=Device --size=1024 --count=100

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Host destinationPlacement=Device size=1KB count=100),278.122,246.739,24.78%,238.868,727.412,[CPU],[us]

memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueMemcpy --csv --noHeaders --iterations=10000 --sourcePlacement=Device --destinationPlacement=Device --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueMemcpy(api=sycl sourcePlacement=Device destinationPlacement=Device size=1KB),9.776,9.580,18.42%,7.730,127.964,[CPU],[us]

memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=StreamMemory --csv --noHeaders --iterations=10000 --type=Triad --size=10240 --memoryPlacement=Device --useEvents=0 --contents=Zeros

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
StreamMemory(api=sycl type=Triad size=10KB useEvents=0 contents=Zeros memoryPlacement=Device),1.799,1.903,11.31%,0.204,2.047,[CPU],[GB/s]

api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=0 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Device --dst=Device --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Device dst=Device size=1KB ioq=0),4.594,4.433,17.23%,3.986,125.348,[CPU],[us]

api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=1 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Host --dst=Host --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Host dst=Host size=1KB ioq=1),3.532,3.402,15.09%,3.189,66.332,[CPU],[us]

miscellaneous_benchmark_sycl VectorSum

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/miscellaneous_benchmark_sycl --test=VectorSum --csv --noHeaders --iterations=1000 --numberOfElementsX=512 --numberOfElementsY=256 --numberOfElementsZ=256

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
VectorSum(api=sycl numberOfElementsX=512 numberOfElementsY=256 numberOfElementsZ=256),861.522,861.843,0.45%,820.001,887.997,[GPU],bw [GB/s]

Velocity-Bench Hashtable

Environment Variables:

Command:

/home/test-user/bench_workdir/hashtable/hashtable_sycl --no-verify

Output:

hashtable - total time for whole calculation: 0.659594 s
203.485247 million keys/second

Velocity-Bench Bitcracker

Environment Variables:

Command:

/home/test-user/bench_workdir/bitcracker/bitcracker -f /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt -d /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt -b 60000

Output:

---------> BitCracker: BitLocker password cracking tool <---------

==================================
Retrieving Info

Reading hash file "/home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt"

              Attack

================================================
Type of attack: User Password
Psw per thread: 1
max_num_pswd_per_read: 60000
Dictionary: /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt
MAC Comparison (-m): Yes

Iter: 1, num passwords read: 60000
Kernel execution:
Effective passwords: 60000
Passwords Range:
npknpByH7N2m3OnLNH1X9DJxLrzIFWk
.....
dL_7uuf3QCz-c6K3xDu0

================================================
Bitcracker attack completed
Total passwords evaluated: 60000
Password not found!

time to subtract from total: 0.0153216 s
bitcracker - total time for whole calculation: 35.6364 s

Velocity-Bench CudaSift

Environment Variables:

Command:

/home/test-user/bench_workdir/cudaSift/cudaSift

Output:

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1229 1265 33.3695% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1105 1255 30.0027% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1236 1274 33.5596% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1123 1264 30.4914% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1215 1258 32.9894% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1226 1258 33.2881% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1125 1268 30.5458% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1224 1260 33.2338% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1116 1247 30.3014% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1245 1277 33.804% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1228 1264 33.3424% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1110 1262 30.1385% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1228 1262 33.3424% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1234 1270 33.5053% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1225 1260 33.2609% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1212 1264 32.908% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1113 1273 30.2199% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1219 1264 33.098% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1103 1260 29.9484% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1110 1261 30.1385% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1081 1268 29.3511% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1224 1261 33.2338% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1197 1252 32.5007% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1233 1270 33.4781% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1097 1257 29.7855% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1136 1266 30.8444% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1227 1263 33.3152% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1225 1260 33.2609% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1230 1262 33.3967% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1144 1266 31.0616% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1072 1269 29.1067% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1220 1257 33.1252% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1224 1260 33.2338% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1227 1261 33.3152% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1224 1258 33.2338% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1117 1273 30.3285% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1072 1269 29.1067% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1129 1262 30.6544% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1091 1268 29.6226% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1232 1272 33.451% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1114 1268 30.2471% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1231 1264 33.4238% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1227 1264 33.3152% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1094 1256 29.704% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1241 1275 33.6954% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1224 1255 33.2338% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1234 1270 33.5053% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1106 1269 30.0299% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1240 1271 33.6682% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1227 1260 33.3152% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Avg workload time = 244.437 ms

Velocity-Bench Easywave

Environment Variables:

Command:

/home/test-user/bench_workdir/easywave/easyWave_sycl -grid /home/test-user/bench_workdir/data/easywave/examples/e2Asean.grd -source /home/test-user/bench_workdir/data/easywave/examples/BengkuluSept2007.flt -time 120

Output:

MAIN: Starting SYCL main program
SYCL: SYCL Queue initialization successful
SYCL: Using SYCL device : Intel(R) Data Center GPU Max 1100 (Driver version 1.3.29735+27)
SYCL: Platform : Intel(R) oneAPI Unified Runtime over Level-Zero
MAIN: Program successfully completed

Velocity-Bench QuickSilver

Environment Variables:

QS_DEVICE=GPU

Command:

/home/test-user/bench_workdir/QuickSilver/qs -i /home/test-user/bench_workdir/velocity-bench-repo/QuickSilver/Examples/AllScattering/scatteringOnly.inp

Output:

Copyright (c) 2016
Lawrence Livermore National Security, LLC
All Rights Reserved
Quicksilver Version :
Quicksilver Git Hash :
MPI Version : 3.0
Number of MPI ranks : 1
Number of OpenMP Threads: 1
Number of OpenMP CPUs : 1

Loading params
Finished loading params
Simulation:
dt: 1e-08
fMax: 0.1
inputFile: /home/test-user/bench_workdir/velocity-bench-repo/QuickSilver/Examples/AllScattering/scatteringOnly.inp
energySpectrum:
boundaryCondition: octant
loadBalance: 1
cycleTimers: 0
debugThreads: 0
lx: 100
ly: 100
lz: 100
nParticles: 10000000
batchSize: 0
nBatches: 10
nSteps: 10
nx: 10
ny: 10
nz: 10
seed: 1029384756
xDom: 0
yDom: 0
zDom: 0
eMax: 20
eMin: 1e-09
nGroups: 230
lowWeightCutoff: 0.001
bTally: 1
fTally: 1
cTally: 1
coralBenchmark: 0
crossSectionsOut:

Geometry:
material: sourceMaterial
shape: brick
xMax: 100
xMin: 0
yMax: 100
yMin: 0
zMax: 100
zMin: 0

Material:
name: sourceMaterial
mass: 1000
nIsotopes: 10
nReactions: 9
sourceRate: 1e+10
totalCrossSection: 0.1
absorptionCrossSection: flat
fissionCrossSection: flat
scatteringCrossSection: flat
absorptionCrossSectionRatio: 0
fissionCrossSectionRatio: 0
scatteringCrossSectionRatio: 1

CrossSection:
name: flat
A: 0
B: 0
C: 0
D: 0
E: 1
nuBar: 2.4
setting GPU
setting parameters
Building partition 0
Building partition 1
Building partition 2
Building partition 3
Building MC_Domain 0
Building MC_Domain 1
Building MC_Domain 2
Building MC_Domain 3
Starting Consistency Check
Finished Consistency Check
Finished initMesh
Started copyMaterialDatabase_device
Finished copyMaterialDatabase_device
Finished copyNuclearData_device
Finished copyDomainDevice
cycle start source rr split absorb scatter fission produce collisn escape census num_seg scalar_flux cycleInit cycleTracking cycleFinalize
0 0 1000000 0 9000000 0 18533189 0 0 18533189 1151780 8848220 55527935 1.854923e+09 4.450710e-01 8.290370e-01 0.000000e+00
1 8848220 1000000 0 151478 0 34281997 0 0 34281997 1664159 8335539 94633679 5.047651e+09 3.736490e-01 9.709170e-01 0.000000e+00
2 8335539 1000000 0 663717 0 34354432 0 0 34354432 1366771 8632485 95010375 7.705930e+09 3.759800e-01 9.782640e-01 0.000000e+00
3 8632485 1000000 0 367978 0 34302727 0 0 34302727 1242216 8758247 94953591 9.992076e+09 3.757500e-01 1.062952e+00 0.000000e+00
4 8758247 1000000 0 242076 0 34141236 0 0 34141236 1168452 8831871 94599337 1.199834e+10 3.379670e-01 1.027549e+00 0.000000e+00
5 8831871 1000000 0 168070 0 33948724 0 0 33948724 1121156 8878785 94148236 1.377636e+10 3.367450e-01 9.797030e-01 0.000000e+00
6 8878785 1000000 0 120572 0 33760567 0 0 33760567 1089103 8910254 93689264 1.535668e+10 3.443810e-01 9.786410e-01 0.000000e+00
7 8910254 1000000 0 89810 0 33552179 0 0 33552179 1065203 8934861 93216931 1.676993e+10 3.367370e-01 1.019806e+00 0.000000e+00
8 8934861 1000000 0 65491 0 33384605 0 0 33384605 1047720 8952632 92768273 1.804559e+10 3.358520e-01 1.019163e+00 0.000000e+00
9 8952632 1000000 0 47165 0 33198494 0 0 33198494 1033968 8965829 92324678 1.920208e+10 3.344560e-01 9.724040e-01 0.000000e+00

Timer Cumulative Cumulative Cumulative Cumulative Cumulative Cumulative
Name number microSecs microSecs microSecs microSecs Efficiency
of calls min avg max stddev Rating
main 1 1.344e+07 1.344e+07 1.344e+07 0.000e+00 100.00
cycleInit 10 3.597e+06 3.597e+06 3.597e+06 0.000e+00 100.00
cycleTracking 10 9.838e+06 9.838e+06 9.838e+06 0.000e+00 100.00
cycleTracking_Kernel 104 4.941e+06 4.941e+06 4.941e+06 0.000e+00 100.00
cycleTracking_MPI 117 2.156e+05 2.156e+05 2.156e+05 0.000e+00 100.00
cycleTracking_Test_Done 0 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.00
cycleFinalize 20 4.040e+02 4.040e+02 4.040e+02 0.000e+00 100.00
Figure Of Merit 91.57 [Num Mega Segments / Cycle Tracking Time]

Velocity-Bench Sobel Filter

Environment Variables:

OPENCV_IO_MAX_IMAGE_PIXELS=1677721600

Command:

/home/test-user/bench_workdir/sobel_filter/sobel_filter -i /home/test-user/bench_workdir/data/sobel_filter/sobel_filter_data/silverfalls_32Kx32K.png -n 5

Output:

SYMN: Welcome to the SYCL version of Sobel filter workload.
SYMN: Input image file: /home/test-user/bench_workdir/data/sobel_filter/sobel_filter_data/silverfalls_32Kx32K.png
SYMN: Launching SYCL kernel with # of iterations: 5
time to subtract from total: 14.9514 s
sobelfilter - total time for whole calculation: 1.00038 s
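
For Velocity-Bench workloads the reported figure has to be pulled out of free-form program output. The sketch below extracts the "total time for whole calculation" value with a regular expression and converts it to milliseconds. The conversion is an assumption based on the summary showing 1000.38 for Sobel Filter; the Bitcracker row appears to keep seconds, so the unit handling is likely per-benchmark. The pattern and the helper name `total_time_ms` are illustrative, not taken from the PR.

```python
import re

# Lines copied from the Sobel Filter output above.
SOBEL_OUTPUT = """\
SYMN: Launching SYCL kernel with # of iterations: 5
time to subtract from total: 14.9514 s
sobelfilter - total time for whole calculation: 1.00038 s
"""

# Matches "<name> - total time for whole calculation: <seconds> s".
TOTAL_TIME = re.compile(r"total time for whole calculation:\s*([0-9.]+)\s*s")


def total_time_ms(output: str) -> float:
    """Return the reported total time in milliseconds (assumed target unit)."""
    match = TOTAL_TIME.search(output)
    if match is None:
        raise ValueError("no 'total time for whole calculation' line found")
    return float(match.group(1)) * 1000.0


sobel_ms = total_time_ms(SOBEL_OUTPUT)
print(f"{sobel_ms:.2f} ms")
```

Raising on a missing line rather than returning a default makes a crashed or truncated benchmark run visible instead of silently recording a zero.
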

Runtime_BlockedTransform_iter_64_blocksize_256

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=512

Output:

Runtime_BlockedTransform_iter_512_blocksize_256

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=512

Output:

Runtime_BlockedTransform_iter_256_blocksize_256

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=512

Output:

Runtime_BlockedTransform_iter_128_blocksize_256

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=512

Output:

Runtime_IndependentDAGTaskThroughput_BasicParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=512

Output:

Runtime_IndependentDAGTaskThroughput_HierarchicalParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=512

Output:

Runtime_IndependentDAGTaskThroughput_NDRangeParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=512

Output:

Runtime_IndependentDAGTaskThroughput_SingleTask

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=512

Output:

Runtime_DAGTaskThroughput_SingleTask

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/DAGTaskThroughput_multi.csv --size=512

Output:

Runtime_DAGTaskThroughput_BasicParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/DAGTaskThroughput_multi.csv --size=512

Output:

Runtime_DAGTaskThroughput_NDRangeParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/DAGTaskThroughput_multi.csv --size=512

Output:

Runtime_DAGTaskThroughput_HierarchicalParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/DAGTaskThroughput_multi.csv --size=512

Output:

MicroBench_LocalMem_int32_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/local_mem --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/LocalMem_multi.csv --size=512

Output:

MicroBench_LocalMem_fp32_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/local_mem --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/LocalMem_multi.csv --size=512

Output:

MicroBench_L2_int32_4

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

MicroBench_L2_int32_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

MicroBench_L2_int32_16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

MicroBench_L2_int32_2

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

MicroBench_L2_fp32_2

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

MicroBench_L2_int32_8

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

MicroBench_L2_fp32_4

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

MicroBench_L2_fp32_8

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

MicroBench_L2_fp32_16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

MicroBench_L2_fp32_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

Pattern_Reduction_Hierarchical_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

Pattern_Reduction_NDRange_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

Pattern_Reduction_NDRange_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

Pattern_Reduction_NDRange_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

Pattern_Reduction_Hierarchical_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

Pattern_Reduction_Hierarchical_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

ScalarProduct_NDRange_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

ScalarProduct_Hierarchical_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

ScalarProduct_NDRange_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

ScalarProduct_Hierarchical_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

ScalarProduct_NDRange_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

ScalarProduct_Hierarchical_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

Pattern_SegmentedReduction_NDRange_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

Pattern_SegmentedReduction_Hierarchical_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

Pattern_SegmentedReduction_Hierarchical_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

Pattern_SegmentedReduction_NDRange_int16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

Pattern_SegmentedReduction_NDRange_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

Pattern_SegmentedReduction_Hierarchical_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

Pattern_SegmentedReduction_Hierarchical_int16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

Pattern_SegmentedReduction_NDRange_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

SYCL2020_Accessors_Latency_fp32_in_order__

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_accessors_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Latency_multi.csv

Output:

USM_Latency_fp32_in_order__

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_accessors_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Latency_multi.csv

Output:

SYCL2020_Accessors_Latency_fp32_out_of_order__

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_accessors_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Latency_multi.csv

Output:

USM_Latency_fp32_out_of_order__

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_accessors_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Latency_multi.csv

Output:

USM_Allocation_latency_fp32_device

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_allocation_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Allocation_latency_multi.csv

Output:

USM_Allocation_latency_fp32_shared

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_allocation_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Allocation_latency_multi.csv

Output:

USM_Allocation_latency_fp32_host

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_allocation_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Allocation_latency_multi.csv

Output:

USM_Instr_Mix_fp32_host_1:1mix_with_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

USM_Instr_Mix_fp32_shared_1:1mix_no_init_with_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

USM_Instr_Mix_fp32_shared_1:1mix_with_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

USM_Instr_Mix_fp32_shared_1:1mix_with_init_with_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

USM_Instr_Mix_fp32_device_1:1mix_with_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

USM_Instr_Mix_fp32_shared_1:1mix_no_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

USM_Instr_Mix_fp32_host_1:1mix_no_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

USM_Instr_Mix_fp32_device_1:1mix_no_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

USM_Pinned_Overhead_fp32_DeviceHost_NonPinned_Init_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_pinned_overhead --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Pinned_Overhead_multi.csv

Output:

USM_Pinned_Overhead_fp32_HostDevice_NonPinned_Init_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_pinned_overhead --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Pinned_Overhead_multi.csv

Output:

USM_Pinned_Overhead_fp32_DeviceHost_Pinned_Init_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_pinned_overhead --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Pinned_Overhead_multi.csv

Output:

USM_Pinned_Overhead_fp32_HostDevice_Pinned_Init_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_pinned_overhead --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Pinned_Overhead_multi.csv

Output:

VectorAddition_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/vec_add --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/VectorAddition_multi.csv

Output:

VectorAddition_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/vec_add --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/VectorAddition_multi.csv

Output:

VectorAddition_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/vec_add --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/VectorAddition_multi.csv

Output:

Polybench_2DConvolution

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/2DConvolution --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/2DConvolution.csv

Output:

Polybench_2mm

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/2mm --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/2mm.csv --size=512

Output:

Polybench_3mm

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/3mm --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/3mm.csv --size=512

Output:

MicroBench_Arith_int32_512

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/arith --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Arith_int32_512.csv --size=16384

Output:

MicroBench_Arith_fp32_512

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/arith --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Arith_int32_512.csv --size=16384

Output:

Polybench_Atax

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atax --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Atax.csv --size=8192

Output:

ReductionAtomic_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atomic_reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ReductionAtomic_fp64.csv

Output:

ReductionAtomic_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atomic_reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ReductionAtomic_fp64.csv

Output:

ReductionAtomic_fp64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atomic_reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ReductionAtomic_fp64.csv

Output:

ReductionAtomic_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atomic_reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ReductionAtomic_fp64.csv

Output:

Polybench_Bicg

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/bicg --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Bicg.csv --size=8192

Output:

Polybench_Correlation

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/correlation --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Correlation.csv --size=512

Output:

Polybench_Covariance

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/covariance --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Covariance.csv --size=512

Output:

Polybench_Gemm

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/gemm --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Gemm.csv --size=1024

Output:

Polybench_Gesummv

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/gesummv --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Gesummv.csv --size=8192

Output:

Polybench_Gramschmidt

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/gramschmidt --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Gramschmidt.csv --size=512

Output:

Kmeans_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/kmeans --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Kmeans.csv --size=67108864

Output:

LinearRegressionCoeff_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/lin_reg_coeff --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/LinearRegressionCoeff.csv

Output:

LinearRegression_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/lin_reg_error --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/LinearRegression.csv

Output:

MatmulChain

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/matmulchain --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/MatmulChain.csv --size=1024

Output:

MolecularDynamics

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/mol_dyn --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/MolecularDynamics.csv

Output:

Polybench_Mvt

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/mvt --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Mvt.csv --size=16384

Output:

MicroBench_sf_fp32_16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/sf --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/sf_16.csv --size=--size=100000000

Output:

Polybench_Syr2k

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/syr2k --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Syr2k.csv --size=1024

Output:

Polybench_Syrk

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/syrk --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Syrk.csv --size=1024

Output:

Compute Benchmarks level_zero run (with params: --save baseline):
https://github.com/oneapi-src/unified-runtime/actions/runs/11016474513

@oneapi-src oneapi-src deleted a comment from github-actions bot Sep 24, 2024
@oneapi-src oneapi-src deleted a comment from github-actions bot Sep 24, 2024

This comment was marked as outdated.

Compute Benchmarks level_zero run (with params: ):
https://github.com/oneapi-src/unified-runtime/actions/runs/11017285607

This comment was marked as outdated.

Compute Benchmarks level_zero run (with params: ):
https://github.com/oneapi-src/unified-runtime/actions/runs/11030990355

Compute Benchmarks level_zero run (with params: ):
https://github.com/oneapi-src/unified-runtime/actions/runs/11053175157

Compute Benchmarks level_zero run ():
https://github.com/oneapi-src/unified-runtime/actions/runs/11053175157
Job status: cancelled. Test status: cancelled.

Compute Benchmarks level_zero run (with params: ):
https://github.com/oneapi-src/unified-runtime/actions/runs/11053480333

Compute Benchmarks level_zero run ():
https://github.com/oneapi-src/unified-runtime/actions/runs/11053480333
Job status: success. Test status: success.

Summary

Total of 106 benchmarks included in the mean.
Geomean 99.610%.
Improved 12 Regressed 26 (threshold 0.50%)

(result is better)
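
The summary figures above (geomean, improved and regressed counts against a 0.50% threshold) can be reproduced in spirit with a few lines of Python. This is a minimal sketch, not the PR's actual script: the function name `summarize` and the sample list are hypothetical, and the sample reuses only a handful of "Relative perf" values from the tables in this comment rather than all 106.

```python
from math import exp, log

def summarize(relative_perf, threshold=0.5):
    """Return (geomean, improved, regressed) for relative-perf percentages.

    A benchmark counts as improved when it is more than `threshold`
    percentage points above 100%, and as regressed when it is more than
    `threshold` points below. This rule is an assumption inferred from
    the summary text, not taken from the benchmark scripts.
    """
    geomean = exp(sum(log(p) for p in relative_perf) / len(relative_perf))
    improved = sum(1 for p in relative_perf if p - 100.0 > threshold)
    regressed = sum(1 for p in relative_perf if p - 100.0 < -threshold)
    return geomean, improved, regressed

# Hypothetical sample: the four "memory" rows plus VectorSum and one api row.
sample = [98.35, 96.15, 95.17, 95.01, 100.08, 113.22]
geomean, improved, regressed = summarize(sample)
print(f"Geomean {geomean:.3f}%. Improved {improved} Regressed {regressed}")
```

A geometric mean is the usual choice for averaging ratios, since a 2x speedup and a 2x slowdown then cancel out to 100%.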

Performance change in benchmark groups

Relative perf in group api: 100.073%
Benchmark This PR baseline Relative perf Change -
api_overhead_benchmark_ur SubmitKernel out of order 27.542 μs 31.184 μs 113.22% 13.22% ++++++++
api_overhead_benchmark_sycl SubmitKernel out of order 48.114 μs 47.709 μs 99.16% -0.84% -
api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024 3.648 μs 3.569 μs 97.83% -2.17% -
api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024 4.726 μs 4.6 μs 97.33% -2.67% --
api_overhead_benchmark_sycl SubmitKernel in order 48.34 μs 47.033 μs 97.30% -2.70% --
api_overhead_benchmark_ur SubmitKernel in order 29.776 μs 28.75 μs 96.55% -3.45% --
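
The "Relative perf" column in the table above is consistent with dividing the baseline by this PR's value for time-based metrics, so values above 100% mean the PR is faster. A minimal sketch under that assumption follows; `relative_perf` and `group_geomean` are hypothetical names, and the `lower_is_better` flag is an illustration of how throughput metrics (such as the Hashtable row in M keys/sec) would need the ratio inverted.

```python
from math import exp, log

def relative_perf(this_pr, baseline, lower_is_better=True):
    """Relative performance in percent; above 100 means this PR is better."""
    ratio = baseline / this_pr if lower_is_better else this_pr / baseline
    return ratio * 100.0

def group_geomean(values):
    """Geometric mean of a list of relative-perf percentages."""
    return exp(sum(log(v) for v in values) / len(values))

# (this PR, baseline) pairs in microseconds, copied from the api group above.
api_rows = [
    (27.542, 31.184),
    (48.114, 47.709),
    (3.648, 3.569),
    (4.726, 4.6),
    (48.34, 47.033),
    (29.776, 28.75),
]
perfs = [relative_perf(pr, base) for pr, base in api_rows]
print(round(perfs[0], 2))  # 113.22
print(f"group: {group_geomean(perfs):.3f}%")
```

The group value printed by this sketch should land close to the 100.073% reported for the api group; small differences can come from rounding in the displayed times.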
Relative perf in group memory: 96.162%
Benchmark This PR baseline Relative perf Change -
memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024 476.541 μs 468.699 μs 98.35% -1.65% -
memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240 1.794 μs 1.725 μs 96.15% -3.85% --
memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024 262.023 μs 249.365 μs 95.17% -4.83% ---
memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024 10.297 μs 9.783 μs 95.01% -4.99% ---
Relative perf in group miscellaneous: 100.082%
Benchmark This PR baseline Relative perf Change -
miscellaneous_benchmark_sycl VectorSum 862.046 μs 862.755 μs 100.08% 0.08% .
Relative perf in group Velocity-Bench: 97.533%
Benchmark This PR baseline Relative perf Change -
Velocity-Bench Sobel Filter 973.367 ms 997.159 ms 102.44% 2.44% ++
Velocity-Bench Hashtable 205.529796 M keys/sec 204.65781 M keys/sec 100.43% 0.43% .
Velocity-Bench QuickSilver 89.75 MMS/CTT 89.68 MMS/CTT 100.08% 0.08% .
Velocity-Bench Bitcracker 35.5909 s 35.4695 s 99.66% -0.34% .
Velocity-Bench CudaSift 285.429 ms 239.448 ms 83.89% -16.11% ----------
Velocity-Bench Easywave - 453.0 ms
Relative perf in group Runtime: 99.969%
Benchmark This PR baseline Relative perf Change -
Runtime_BlockedTransform_iter_128_blocksize_256 0.156 ms 0.161 ms 103.21% 3.21% ++
Runtime_BlockedTransform_iter_512_blocksize_256 0.079 ms 0.081 ms 102.53% 2.53% ++
Runtime_BlockedTransform_iter_256_blocksize_256 0.084 ms 0.085 ms 101.19% 1.19% +
Runtime_BlockedTransform_iter_64_blocksize_256 0.341 ms 0.341 ms 100.00% 0.00% .
Runtime_IndependentDAGTaskThroughput_BasicParallelFor 5.802 ms 5.792 ms 99.83% -0.17% .
Runtime_IndependentDAGTaskThroughput_HierarchicalParallelFor 5.588 ms 5.576 ms 99.79% -0.21% .
Runtime_IndependentDAGTaskThroughput_NDRangeParallelFor 5.603 ms 5.588 ms 99.73% -0.27% .
Runtime_DAGTaskThroughput_SingleTask 6.548 ms 6.459 ms 98.64% -1.36% -
Runtime_DAGTaskThroughput_NDRangeParallelFor 4.927 ms 4.856 ms 98.56% -1.44% -
Runtime_DAGTaskThroughput_HierarchicalParallelFor 5.331 ms 5.25 ms 98.48% -1.52% -
Runtime_IndependentDAGTaskThroughput_SingleTask 6.574 ms 6.467 ms 98.37% -1.63% -
Runtime_DAGTaskThroughput_BasicParallelFor 6.071 ms 5.963 ms 98.22% -1.78% -
Runtime_BlockedTransform_iter_64_blocksize_4096 0.395 ms -
Runtime_BlockedTransform_iter_128_blocksize_2048 0.755 ms -
Runtime_BlockedTransform_iter_128_blocksize_16384 2.288 ms -
Runtime_BlockedTransform_iter_256_blocksize_32768 2.364 ms -
Runtime_BlockedTransform_iter_512_blocksize_32768 2.421 ms -
Runtime_BlockedTransform_iter_64_blocksize_262144 2.367 ms -
Runtime_BlockedTransform_iter_128_blocksize_65536 2.543 ms -
Runtime_BlockedTransform_iter_512_blocksize_4096 0.548 ms -
Runtime_BlockedTransform_iter_64_blocksize_65536 2.591 ms -
Runtime_BlockedTransform_iter_512_blocksize_16384 2.686 ms -
Runtime_BlockedTransform_iter_512_blocksize_65536 2.608 ms -
Runtime_BlockedTransform_iter_256_blocksize_2048 0.758 ms -
Runtime_BlockedTransform_iter_256_blocksize_8192 0.525 ms -
Runtime_BlockedTransform_iter_128_blocksize_4096 0.469 ms -
Runtime_BlockedTransform_iter_128_blocksize_262144 2.469 ms -
Runtime_BlockedTransform_iter_512_blocksize_131072 2.559 ms -
Runtime_BlockedTransform_iter_64_blocksize_8192 0.519 ms -
Runtime_BlockedTransform_iter_256_blocksize_1024 1.109 ms -
Runtime_BlockedTransform_iter_64_blocksize_32768 2.421 ms -
Runtime_BlockedTransform_iter_64_blocksize_524288 2.492 ms -
Runtime_BlockedTransform_iter_256_blocksize_262144 2.519 ms -
Runtime_BlockedTransform_iter_256_blocksize_524288 2.578 ms -
Runtime_BlockedTransform_iter_512_blocksize_262144 2.58 ms -
Runtime_BlockedTransform_iter_128_blocksize_1024 1.129 ms -
Runtime_BlockedTransform_iter_64_blocksize_131072 2.519 ms -
Runtime_BlockedTransform_iter_64_blocksize_2048 0.758 ms -
Runtime_BlockedTransform_iter_512_blocksize_8192 0.524 ms -
Runtime_BlockedTransform_iter_64_blocksize_16384 2.241 ms -
Runtime_BlockedTransform_iter_256_blocksize_4096 0.411 ms -
Runtime_BlockedTransform_iter_128_blocksize_8192 0.522 ms -
Runtime_BlockedTransform_iter_256_blocksize_65536 2.513 ms -
Runtime_BlockedTransform_iter_256_blocksize_16384 2.551 ms -
Runtime_BlockedTransform_iter_128_blocksize_131072 2.424 ms -
Runtime_BlockedTransform_iter_512_blocksize_1024 1.31 ms -
Runtime_BlockedTransform_iter_512_blocksize_2048 0.746 ms -
Runtime_BlockedTransform_iter_512_blocksize_524288 2.748 ms -
Runtime_BlockedTransform_iter_128_blocksize_524288 2.573 ms -
Runtime_BlockedTransform_iter_64_blocksize_1024 1.525 ms -
Runtime_BlockedTransform_iter_128_blocksize_32768 2.45 ms -
Runtime_BlockedTransform_iter_256_blocksize_131072 2.469 ms -
Relative perf in group MicroBench: 100.262%
Benchmark This PR baseline Relative perf Change -
MicroBench_sf_fp32_16 0.025 ms 0.026 ms 104.00% 4.00% ++
MicroBench_LocalMem_fp32_4096 0.2 ms 0.2 ms 100.00% 0.00% .
MicroBench_LocalMem_int32_4096 0.229 ms 0.229 ms 100.00% 0.00% .
MicroBench_L2_fp32_16 0.025 ms 0.025 ms 100.00% 0.00% .
MicroBench_L2_int32_2 0.027 ms 0.027 ms 100.00% 0.00% .
MicroBench_L2_int32_1 0.033 ms 0.033 ms 100.00% 0.00% .
MicroBench_L2_int32_16 0.026 ms 0.026 ms 100.00% 0.00% .
MicroBench_L2_fp32_2 0.026 ms 0.026 ms 100.00% 0.00% .
MicroBench_L2_fp32_1 0.025 ms 0.025 ms 100.00% 0.00% .
MicroBench_L2_fp32_4 0.026 ms 0.026 ms 100.00% 0.00% .
MicroBench_L2_fp32_8 0.025 ms 0.025 ms 100.00% 0.00% .
MicroBench_L2_int32_8 0.026 ms 0.026 ms 100.00% 0.00% .
MicroBench_L2_int32_4 0.026 ms 0.026 ms 100.00% 0.00% .
MicroBench_Arith_fp32_512 0.032 ms 0.032 ms 100.00% 0.00% .
MicroBench_Arith_int32_512 0.073 ms 0.073 ms 100.00% 0.00% .
Relative perf in group Pattern: 100.025%
Benchmark This PR baseline Relative perf Change -
Pattern_Reduction_Hierarchical_int64 0.05 ms 0.051 ms 102.00% 2.00% +
Pattern_Reduction_Hierarchical_fp32 0.051 ms 0.052 ms 101.96% 1.96% +
Pattern_Reduction_NDRange_int64 0.052 ms 0.052 ms 100.00% 0.00% .
Pattern_Reduction_NDRange_fp32 0.026 ms 0.026 ms 100.00% 0.00% .
Pattern_Reduction_Hierarchical_int32 0.052 ms 0.052 ms 100.00% 0.00% .
Pattern_SegmentedReduction_Hierarchical_int16 0.03 ms 0.03 ms 100.00% 0.00% .
Pattern_SegmentedReduction_Hierarchical_int64 0.029 ms 0.029 ms 100.00% 0.00% .
Pattern_SegmentedReduction_NDRange_fp32 0.014 ms 0.014 ms 100.00% 0.00% .
Pattern_SegmentedReduction_Hierarchical_fp32 0.03 ms 0.03 ms 100.00% 0.00% .
Pattern_SegmentedReduction_NDRange_int64 0.016 ms 0.016 ms 100.00% 0.00% .
Pattern_SegmentedReduction_Hierarchical_int32 0.028 ms 0.028 ms 100.00% 0.00% .
Pattern_SegmentedReduction_NDRange_int32 0.027 ms 0.027 ms 100.00% 0.00% .
Pattern_Reduction_NDRange_int32 0.076 ms 0.075 ms 98.68% -1.32% -
Pattern_SegmentedReduction_NDRange_int16 0.045 ms 0.044 ms 97.78% -2.22% -
Relative perf in group ScalarProduct: 99.774%
Benchmark This PR baseline Relative perf Change -
ScalarProduct_NDRange_int32 0.15 ms 0.151 ms 100.67% 0.67% .
ScalarProduct_Hierarchical_fp32 0.059 ms 0.059 ms 100.00% 0.00% .
ScalarProduct_Hierarchical_int64 0.063 ms 0.063 ms 100.00% 0.00% .
ScalarProduct_Hierarchical_int32 0.062 ms 0.062 ms 100.00% 0.00% .
ScalarProduct_NDRange_fp32 0.04 ms 0.04 ms 100.00% 0.00% .
ScalarProduct_NDRange_int64 0.1 ms 0.098 ms 98.00% -2.00% -
Relative perf in group SYCL2020: 100.102%
Benchmark This PR baseline Relative perf Change -
SYCL2020_Accessors_Latency_fp32_in_order__ 68.472 ms 68.626 ms 100.22% 0.22% .
SYCL2020_Accessors_Latency_fp32_out_of_order__ 70.855 ms 70.84 ms 99.98% -0.02% .
Relative perf in group USM: 100.093%
Benchmark This PR baseline Relative perf Change -
USM_Pinned_Overhead_fp32_DeviceHost_NonPinned_Init_1 0.438 ms 0.46 ms 105.02% 5.02% +++
USM_Allocation_latency_fp32_shared 0.118 ms 0.12 ms 101.69% 1.69% +
USM_Latency_fp32_in_order__ 33.718 ms 33.718 ms 100.00% 0.00% .
USM_Allocation_latency_fp32_host 0.002 ms 0.002 ms 100.00% 0.00% .
USM_Allocation_latency_fp32_device 0.008 ms 0.008 ms 100.00% 0.00% .
USM_Pinned_Overhead_fp32_HostDevice_Pinned_Init_1 0.011 ms 0.011 ms 100.00% 0.00% .
USM_Pinned_Overhead_fp32_HostDevice_NonPinned_Init_1 0.015 ms 0.015 ms 100.00% 0.00% .
USM_Pinned_Overhead_fp32_DeviceHost_Pinned_Init_1 0.019 ms 0.019 ms 100.00% 0.00% .
USM_Instr_Mix_fp32_shared_1:1mix_with_init_with_prefetch 15.334 ms 15.307 ms 99.82% -0.18% .
USM_Instr_Mix_fp32_shared_1:1mix_with_init_no_prefetch 14.174 ms 14.144 ms 99.79% -0.21% .
USM_Instr_Mix_fp32_shared_1:1mix_no_init_no_prefetch 13.752 ms 13.72 ms 99.77% -0.23% .
USM_Latency_fp32_out_of_order__ 46.829 ms 46.684 ms 99.69% -0.31% .
USM_Instr_Mix_fp32_shared_1:1mix_no_init_with_prefetch 15.483 ms 15.415 ms 99.56% -0.44% .
USM_Instr_Mix_fp32_host_1:1mix_no_init_no_prefetch 1.877 ms 1.868 ms 99.52% -0.48% .
USM_Instr_Mix_fp32_device_1:1mix_no_init_no_prefetch 3.111 ms 3.09 ms 99.32% -0.68% .
USM_Instr_Mix_fp32_host_1:1mix_with_init_no_prefetch 1.793 ms 1.778 ms 99.16% -0.84% -
USM_Instr_Mix_fp32_device_1:1mix_with_init_no_prefetch 3.278 ms 3.225 ms 98.38% -1.62% -
Relative perf in group VectorAddition: 99.115%
Benchmark This PR baseline Relative perf Change -
VectorAddition_fp32 0.033 ms 0.033 ms 100.00% 0.00% .
VectorAddition_int64 0.043 ms 0.043 ms 100.00% 0.00% .
VectorAddition_int32 0.038 ms 0.037 ms 97.37% -2.63% --
Relative perf in group Polybench: 99.808%
Benchmark This PR baseline Relative perf Change -
Polybench_Gramschmidt 285.066 ms 285.08 ms 100.00% 0.00% .
Polybench_2DConvolution 0.229 ms 0.229 ms 100.00% 0.00% .
Polybench_2mm 1.239 ms 1.239 ms 100.00% 0.00% .
Polybench_Atax 6.904 ms 6.903 ms 99.99% -0.01% .
Polybench_Bicg 5.127 ms 5.126 ms 99.98% -0.02% .
Polybench_3mm 1.747 ms 1.745 ms 99.89% -0.11% .
Polybench_Gesummv 7.317 ms 7.303 ms 99.81% -0.19% .
Polybench_Covariance 94.478 ms 94.253 ms 99.76% -0.24% .
Polybench_Mvt 3.642 ms 3.633 ms 99.75% -0.25% .
Polybench_Syrk 3.218 ms 3.206 ms 99.63% -0.37% .
Polybench_Syr2k 6.341 ms 6.302 ms 99.38% -0.62% .
Polybench_Correlation 94.979 ms 94.324 ms 99.31% -0.69% .
Polybench_Gemm - 3.965 ms
Relative perf in group ReductionAtomic: 100.619%
Benchmark This PR baseline Relative perf Change -
ReductionAtomic_fp32 0.04 ms 0.041 ms 102.50% 2.50% ++
ReductionAtomic_fp64 0.043 ms 0.043 ms 100.00% 0.00% .
ReductionAtomic_int32 0.042 ms 0.042 ms 100.00% 0.00% .
ReductionAtomic_int64 0.041 ms 0.041 ms 100.00% 0.00% .
Relative perf in group Kmeans: 99.889%
Benchmark This PR baseline Relative perf Change -
Kmeans_fp32 1.794 ms 1.792 ms 99.89% -0.11% .
Relative perf in group LinearRegressionCoeff: 90.163%
Benchmark This PR baseline Relative perf Change -
LinearRegressionCoeff_fp32 1.352 ms 1.219 ms 90.16% -9.84% ------
Relative perf in group LinearRegression: 98.892%
Benchmark This PR baseline Relative perf Change -
LinearRegression_fp32 0.361 ms 0.357 ms 98.89% -1.11% -
Relative perf in group MatmulChain: 99.918%
Benchmark This PR baseline Relative perf Change -
MatmulChain 11.038 ms 11.029 ms 99.92% -0.08% .
Relative perf in group MolecularDynamics: 100.000%
Benchmark This PR baseline Relative perf Change -
MolecularDynamics 0.066 ms 0.066 ms 100.00% 0.00% .
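
The numbers in the tables above appear consistent with a simple rule: for time-based results, relative perf is the baseline time divided by this PR's time, and each group figure is the geometric mean of its rows. The sketch below reproduces the VectorAddition and ReductionAtomic group values under that assumption. The rule is inferred from the reported numbers, and the function names are hypothetical rather than taken from the benchmark scripts.

```python
import math

def relative_perf(this_pr_ms: float, baseline_ms: float) -> float:
    """Ratio above 1.0 means this PR is faster (lower time is better)."""
    return baseline_ms / this_pr_ms

def group_relative_perf(rows) -> float:
    """Geometric mean of the per-benchmark ratios, as a percentage."""
    ratios = [relative_perf(pr, base) for pr, base in rows]
    return 100.0 * math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# (this PR, baseline) pairs in ms, copied from the tables above.
vector_addition = [(0.033, 0.033), (0.043, 0.043), (0.038, 0.037)]
reduction_atomic = [(0.040, 0.041), (0.043, 0.043), (0.042, 0.042), (0.041, 0.041)]

vector_addition_pct = group_relative_perf(vector_addition)
reduction_atomic_pct = group_relative_perf(reduction_atomic)
print(f"VectorAddition:  {vector_addition_pct:.3f}%")
print(f"ReductionAtomic: {reduction_atomic_pct:.3f}%")
```

An arithmetic mean would give slightly different values (about 99.123% and 100.625%), which is why the geometric mean looks like the better fit for the reported 99.115% and 100.619%.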

Details

Benchmark details - environment, command, output...
api_overhead_benchmark_sycl SubmitKernel out of order

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),48.114,47.709,7.34%,45.585,523.753,[CPU],[us]
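
Note that the data row carries eight comma-separated fields while the header names seven, because the trailing `Type` column is followed by a unit such as `[us]`. A minimal parsing sketch, assuming the test-case name is always the leftmost field and the remaining seven fields never contain commas (the `BenchRow` type and `parse_row` helper are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class BenchRow:
    test_case: str
    mean: float
    median: float
    stddev_pct: float
    minimum: float
    maximum: float
    kind: str   # e.g. "[CPU]" or "[GPU]"
    unit: str   # e.g. "[us]" or "[GB/s]", kept as printed

def parse_row(line: str) -> BenchRow:
    # Split from the right so the test-case name stays intact.
    name, mean, median, stddev, lo, hi, kind, unit = line.strip().rsplit(",", 7)
    return BenchRow(name, float(mean), float(median),
                    float(stddev.rstrip("%")), float(lo), float(hi), kind, unit)

row = parse_row(
    "SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 "
    "KernelExecTime=1 MeasureCompletion=0),48.114,47.709,7.34%,45.585,523.753,[CPU],[us]"
)
print(row.median, row.unit)
```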

api_overhead_benchmark_sycl SubmitKernel in order

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),48.340,47.901,6.96%,45.573,559.868,[CPU],[us]

api_overhead_benchmark_ur SubmitKernel out of order

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=ur Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),27.542,31.584,29.44%,13.791,454.004,[CPU],[us]

api_overhead_benchmark_ur SubmitKernel in order

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=ur Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),29.776,29.826,7.41%,18.398,469.800,[CPU],[us]

memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Device --destinationPlacement=Device --size=1024 --count=100

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Device destinationPlacement=Device size=1KB count=100),476.541,473.683,5.10%,419.090,969.528,[CPU],[us]

memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Host --destinationPlacement=Device --size=1024 --count=100

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Host destinationPlacement=Device size=1KB count=100),262.023,230.721,25.26%,224.741,925.082,[CPU],[us]

memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueMemcpy --csv --noHeaders --iterations=10000 --sourcePlacement=Device --destinationPlacement=Device --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueMemcpy(api=sycl sourcePlacement=Device destinationPlacement=Device size=1KB),10.297,9.379,24.02%,7.638,190.649,[CPU],[us]

memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=StreamMemory --csv --noHeaders --iterations=10000 --type=Triad --size=10240 --memoryPlacement=Device --useEvents=0 --contents=Zeros

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
StreamMemory(api=sycl type=Triad size=10KB useEvents=0 contents=Zeros memoryPlacement=Device),1.794,1.900,11.75%,0.203,2.062,[CPU],[GB/s]

api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=0 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Device --dst=Device --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Device dst=Device size=1KB ioq=0),4.726,4.595,14.43%,4.068,155.288,[CPU],[us]

api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=1 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Host --dst=Host --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Host dst=Host size=1KB ioq=1),3.648,3.613,6.62%,3.396,25.893,[CPU],[us]

miscellaneous_benchmark_sycl VectorSum

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/miscellaneous_benchmark_sycl --test=VectorSum --csv --noHeaders --iterations=1000 --numberOfElementsX=512 --numberOfElementsY=256 --numberOfElementsZ=256

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
VectorSum(api=sycl numberOfElementsX=512 numberOfElementsY=256 numberOfElementsZ=256),862.046,862.730,0.52%,818.667,888.310,[GPU],bw [GB/s]

Velocity-Bench Hashtable

Environment Variables:

Command:

/home/test-user/bench_workdir/hashtable/hashtable_sycl --no-verify

Output:

hashtable - total time for whole calculation: 0.653033 s
205.529796 million keys/second
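
The two output lines can be cross-checked against each other. The sketch below pulls both values out with regular expressions and multiplies them. The patterns are written for this exact output and are an assumption about its stability, not something taken from the benchmark scripts.

```python
import re

output = """hashtable - total time for whole calculation: 0.653033 s
205.529796 million keys/second"""

total_s = float(re.search(r"total time for whole calculation: ([\d.]+) s", output).group(1))
mkeys_per_s = float(re.search(r"([\d.]+) million keys/second", output).group(1))

# Time multiplied by throughput gives the number of keys implied by the output.
keys_processed = total_s * mkeys_per_s * 1e6
print(f"{keys_processed:.4g} keys implied by the output")
```

The product comes out at roughly 134 million keys, which is close to 2^27. The workload size is not stated in the output, so treat that as an inference.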

Velocity-Bench Bitcracker

Environment Variables:

Command:

/home/test-user/bench_workdir/bitcracker/bitcracker -f /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt -d /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt -b 60000

Output:

---------> BitCracker: BitLocker password cracking tool <---------

==================================
Retrieving Info

Reading hash file "/home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt"

              Attack

================================================
Type of attack: User Password
Psw per thread: 1
max_num_pswd_per_read: 60000
Dictionary: /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt
MAC Comparison (-m): Yes

Iter: 1, num passwords read: 60000
Kernel execution:
Effective passwords: 60000
Passwords Range:
npknpByH7N2m3OnLNH1X9DJxLrzIFWk
.....
dL_7uuf3QCz-c6K3xDu0

================================================
Bitcracker attack completed
Total passwords evaluated: 60000
Password not found!

time to subtract from total: 0.0148724 s
bitcracker - total time for whole calculation: 35.5909 s

Velocity-Bench CudaSift

Environment Variables:

Command:

/home/test-user/bench_workdir/cudaSift/cudaSift

Output:

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1115 1262 30.2742% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1117 1259 30.3285% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1129 1249 30.6544% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1232 1266 33.451% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1101 1266 29.8941% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1222 1256 33.1795% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1166 1261 31.659% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1233 1269 33.4781% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1230 1265 33.3967% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1240 1275 33.6682% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1240 1272 33.6682% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1224 1260 33.2338% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1221 1255 33.1523% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1217 1268 33.0437% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1238 1272 33.6139% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1231 1266 33.4238% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1226 1263 33.2881% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1138 1269 30.8987% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1222 1261 33.1795% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1232 1266 33.451% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1231 1266 33.4238% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1154 1269 31.3332% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1230 1268 33.3967% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1135 1268 30.8173% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1170 1258 31.7676% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1228 1262 33.3424% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1234 1268 33.5053% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1227 1262 33.3152% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1113 1255 30.2199% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1217 1267 33.0437% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1223 1259 33.2066% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1229 1268 33.3695% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1229 1263 33.3695% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1098 1258 29.8127% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1224 1256 33.2338% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1220 1254 33.1252% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1234 1267 33.5053% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1225 1262 33.2609% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1231 1269 33.4238% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1228 1265 33.3424% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1219 1252 33.098% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1233 1266 33.4781% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1222 1257 33.1795% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1232 1267 33.451% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1106 1277 30.0299% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1226 1259 33.2881% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1229 1261 33.3695% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1061 1261 28.808% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1225 1279 33.2609% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1225 1258 33.2609% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Avg workload time = 285.429 ms
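
In the runs above, the reported percentage appears to be the first matching-feature count divided by the first original-feature count. A small check on the first run's lines, with that interpretation stated as an assumption:

```python
import re

run_output = """Number of original features: 3683 3933
Number of matching features: 1115 1262 30.2742% 1 2"""

orig = re.search(r"original features: (\d+) (\d+)", run_output)
match = re.search(r"matching features: (\d+) (\d+) ([\d.]+)%", run_output)

# Assumed definition: matches in the first image over features in the first image.
computed_pct = 100.0 * int(match.group(1)) / int(orig.group(1))
reported_pct = float(match.group(3))
print(f"computed {computed_pct:.4f}%, reported {reported_pct}%")
```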

Velocity-Bench QuickSilver

Environment Variables:

QS_DEVICE=GPU

Command:

/home/test-user/bench_workdir/QuickSilver/qs -i /home/test-user/bench_workdir/velocity-bench-repo/QuickSilver/Examples/AllScattering/scatteringOnly.inp

Output:

Copyright (c) 2016
Lawrence Livermore National Security, LLC
All Rights Reserved
Quicksilver Version :
Quicksilver Git Hash :
MPI Version : 3.0
Number of MPI ranks : 1
Number of OpenMP Threads: 1
Number of OpenMP CPUs : 1

Loading params
Finished loading params
Simulation:
dt: 1e-08
fMax: 0.1
inputFile: /home/test-user/bench_workdir/velocity-bench-repo/QuickSilver/Examples/AllScattering/scatteringOnly.inp
energySpectrum:
boundaryCondition: octant
loadBalance: 1
cycleTimers: 0
debugThreads: 0
lx: 100
ly: 100
lz: 100
nParticles: 10000000
batchSize: 0
nBatches: 10
nSteps: 10
nx: 10
ny: 10
nz: 10
seed: 1029384756
xDom: 0
yDom: 0
zDom: 0
eMax: 20
eMin: 1e-09
nGroups: 230
lowWeightCutoff: 0.001
bTally: 1
fTally: 1
cTally: 1
coralBenchmark: 0
crossSectionsOut:

Geometry:
material: sourceMaterial
shape: brick
xMax: 100
xMin: 0
yMax: 100
yMin: 0
zMax: 100
zMin: 0

Material:
name: sourceMaterial
mass: 1000
nIsotopes: 10
nReactions: 9
sourceRate: 1e+10
totalCrossSection: 0.1
absorptionCrossSection: flat
fissionCrossSection: flat
scatteringCrossSection: flat
absorptionCrossSectionRatio: 0
fissionCrossSectionRatio: 0
scatteringCrossSectionRatio: 1

CrossSection:
name: flat
A: 0
B: 0
C: 0
D: 0
E: 1
nuBar: 2.4
setting GPU
setting parameters
Building partition 0
Building partition 1
Building partition 2
Building partition 3
Building MC_Domain 0
Building MC_Domain 1
Building MC_Domain 2
Building MC_Domain 3
Starting Consistency Check
Finished Consistency Check
Finished initMesh
Started copyMaterialDatabase_device
Finished copyMaterialDatabase_device
Finished copyNuclearData_device
Finished copyDomainDevice
cycle start source rr split absorb scatter fission produce collisn escape census num_seg scalar_flux cycleInit cycleTracking cycleFinalize
0 0 1000000 0 9000000 0 18533189 0 0 18533189 1151780 8848220 55527935 1.854923e+09 6.943580e-01 8.482460e-01 0.000000e+00
1 8848220 1000000 0 151478 0 34281997 0 0 34281997 1664159 8335539 94633679 5.047651e+09 5.738730e-01 9.902290e-01 0.000000e+00
2 8335539 1000000 0 663717 0 34354432 0 0 34354432 1366771 8632485 95010375 7.705930e+09 5.675980e-01 9.974600e-01 0.000000e+00
3 8632485 1000000 0 367978 0 34302727 0 0 34302727 1242216 8758247 94953591 9.992076e+09 5.877140e-01 1.109485e+00 1.000000e-06
4 8758247 1000000 0 242076 0 34141236 0 0 34141236 1168452 8831871 94599337 1.199834e+10 5.303710e-01 1.041517e+00 0.000000e+00
5 8831871 1000000 0 168070 0 33948724 0 0 33948724 1121156 8878785 94148236 1.377636e+10 5.287790e-01 9.964440e-01 0.000000e+00
6 8878785 1000000 0 120572 0 33760567 0 0 33760567 1089103 8910254 93689264 1.535668e+10 5.219520e-01 9.962480e-01 0.000000e+00
7 8910254 1000000 0 89810 0 33552179 0 0 33552179 1065203 8934861 93216931 1.676993e+10 5.218450e-01 1.036917e+00 0.000000e+00
8 8934861 1000000 0 65491 0 33384605 0 0 33384605 1047720 8952632 92768273 1.804559e+10 3.404530e-01 1.030386e+00 0.000000e+00
9 8952632 1000000 0 47165 0 33198494 0 0 33198494 1033968 8965829 92324678 1.920208e+10 5.217800e-01 9.910850e-01 0.000000e+00

Timer Cumulative Cumulative Cumulative Cumulative Cumulative Cumulative
Name number microSecs microSecs microSecs microSecs Efficiency
of calls min avg max stddev Rating
main 1 1.543e+07 1.543e+07 1.543e+07 0.000e+00 100.00
cycleInit 10 5.389e+06 5.389e+06 5.389e+06 0.000e+00 100.00
cycleTracking 10 1.004e+07 1.004e+07 1.004e+07 0.000e+00 100.00
cycleTracking_Kernel 104 4.943e+06 4.943e+06 4.943e+06 0.000e+00 100.00
cycleTracking_MPI 117 2.910e+05 2.910e+05 2.910e+05 0.000e+00 100.00
cycleTracking_Test_Done 0 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.00
cycleFinalize 20 7.940e+02 7.940e+02 7.940e+02 0.000e+00 100.00
Figure Of Merit 89.75 [Num Mega Segments / Cycle Tracking Time]
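
The Figure Of Merit can be reproduced from the per-cycle table, assuming `num_seg` is the segment count per cycle and the `cycleTracking` column is in seconds (its sum, about 10.04 s, matches the 1.004e+07 microseconds in the timer report). This is a sketch of the arithmetic only, not QuickSilver's own code.

```python
# Per-cycle values copied from the num_seg and cycleTracking columns above.
num_seg = [55527935, 94633679, 95010375, 94953591, 94599337,
           94148236, 93689264, 93216931, 92768273, 92324678]
cycle_tracking_s = [0.848246, 0.990229, 0.997460, 1.109485, 1.041517,
                    0.996444, 0.996248, 1.036917, 1.030386, 0.991085]

mega_segments = sum(num_seg) / 1e6
tracking_time_s = sum(cycle_tracking_s)

# Num Mega Segments / Cycle Tracking Time
figure_of_merit = mega_segments / tracking_time_s
print(f"{figure_of_merit:.2f}")
```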

Velocity-Bench Sobel Filter

Environment Variables:

OPENCV_IO_MAX_IMAGE_PIXELS=1677721600

Command:

/home/test-user/bench_workdir/sobel_filter/sobel_filter -i /home/test-user/bench_workdir/data/sobel_filter/sobel_filter_data/silverfalls_32Kx32K.png -n 5

Output:

SYMN: Welcome to the SYCL version of Sobel filter workload.
SYMN: Input image file: /home/test-user/bench_workdir/data/sobel_filter/sobel_filter_data/silverfalls_32Kx32K.png
SYMN: Launching SYCL kernel with # of iterations: 5
time to subtract from total: 10.6564 s
sobelfilter - total time for whole calculation: 0.973367 s

Runtime_BlockedTransform_iter_64_blocksize_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.000378', '0.000395', '0.000341', '0.000341 0.000395 0.000399', '0.000032', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']
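
Each sycl-bench result is printed as a Python list literal, so it can be read back with `ast.literal_eval`. The field positions used below (mean, median, minimum, individual samples, sample standard deviation at indices 11 to 15) are inferred from the values in this row and are not confirmed by the sycl-bench documentation.

```python
import ast
import statistics

line = (
    "['Runtime_BlockedTransform_iter_64_blocksize_4096', 'PASS', "
    "'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', "
    "'1024', '65536', '0.000378', '0.000395', '0.000341', "
    "'0.000341 0.000395 0.000399', '0.000032', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', "
    "'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']"
)

row = ast.literal_eval(line)
name, status = row[0], row[1]
samples = [float(s) for s in row[14].split()]  # assumed: per-run times in seconds

# Recompute the summary statistics from the samples to check the inferred layout.
print(name, status)
print("mean  ", statistics.mean(samples), "reported", row[11])
print("median", statistics.median(samples), "reported", row[12])
print("min   ", min(samples), "reported", row[13])
print("stdev ", statistics.stdev(samples), "reported", row[15])
```

The recomputed values agree with the reported fields to the printed precision, and the standard deviation matches the sample (n-1) form rather than the population form.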

Runtime_BlockedTransform_iter_128_blocksize_2048

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_2048', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.000728', '0.000755', '0.000663', '0.000663 0.000755 0.000767', '0.000057', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_16384

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_16384', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002376', '0.002288', '0.002252', '0.002252 0.002288 0.002587', '0.000184', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_256

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_256', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.002352', '0.000341', '0.000186', '0.000186 0.000341 0.006528', '0.003618', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_32768

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_32768', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002597', '0.002364', '0.002245', '0.002245 0.002364 0.003182', '0.000510', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_32768

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_32768', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002499', '0.002421', '0.002141', '0.002141 0.002421 0.002934', '0.000402', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_262144

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_262144', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002654', '0.002367', '0.002305', '0.002305 0.002367 0.003289', '0.000551', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_65536

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_65536', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002574', '0.002543', '0.002440', '0.002440 0.002543 0.002739', '0.000152', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.000529', '0.000548', '0.000429', '0.000429 0.000548 0.000610', '0.000092', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_256

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_256', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.000207', '0.000156', '0.000122', '0.000122 0.000156 0.000345', '0.000120', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_65536

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_65536', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002824', '0.002591', '0.002512', '0.002512 0.002591 0.003370', '0.000474', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_16384

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_16384', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002771', '0.002686', '0.002300', '0.002300 0.002686 0.003328', '0.000519', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_65536

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_65536', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002577', '0.002608', '0.002157', '0.002157 0.002608 0.002967', '0.000406', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_2048

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_2048', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.000762', '0.000758', '0.000755', '0.000755 0.000758 0.000772', '0.000009', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_8192

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_8192', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.000524', '0.000525', '0.000505', '0.000505 0.000525 0.000541', '0.000018', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.000489', '0.000469', '0.000453', '0.000453 0.000469 0.000544', '0.000048', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_262144

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_262144', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002465', '0.002469', '0.002358', '0.002358 0.002469 0.002569', '0.000105', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_256

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_256', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.000116', '0.000079', '0.000076', '0.000076 0.000079 0.000192', '0.000066', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_131072

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_131072', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002730', '0.002559', '0.002296', '0.002296 0.002559 0.003335', '0.000540', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_8192

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_8192', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.000556', '0.000519', '0.000500', '0.000500 0.000519 0.000650', '0.000082', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_1024

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_1024', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.001132', '0.001109', '0.001085', '0.001085 0.001109 0.001203', '0.000063', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_256

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_256', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.000095', '0.000084', '0.000077', '0.000077 0.000084 0.000123', '0.000025', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_32768

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_32768', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002445', '0.002421', '0.002410', '0.002410 0.002421 0.002506', '0.000052', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_524288

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_524288', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002515', '0.002492', '0.002465', '0.002465 0.002492 0.002589', '0.000065', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_262144

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_262144', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002657', '0.002519', '0.002191', '0.002191 0.002519 0.003261', '0.000548', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_524288

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_524288', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002673', '0.002578', '0.002186', '0.002186 0.002578 0.003254', '0.000540', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_262144

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_262144', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002683', '0.002580', '0.002427', '0.002427 0.002580 0.003041', '0.000320', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_1024

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_1024', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.001159', '0.001129', '0.001076', '0.001076 0.001129 0.001273', '0.000102', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_131072

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_131072', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002705', '0.002519', '0.002329', '0.002329 0.002519 0.003268', '0.000496', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_2048

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_2048', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.000765', '0.000758', '0.000752', '0.000752 0.000758 0.000785', '0.000018', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_8192

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_8192', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.000596', '0.000524', '0.000506', '0.000506 0.000524 0.000759', '0.000141', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_16384

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_16384', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002506', '0.002241', '0.002218', '0.002218 0.002241 0.003058', '0.000479', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.000431', '0.000411', '0.000399', '0.000399 0.000411 0.000485', '0.000047', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_8192

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_8192', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.000524', '0.000522', '0.000507', '0.000507 0.000522 0.000541', '0.000017', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_65536

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_65536', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002559', '0.002513', '0.002472', '0.002472 0.002513 0.002691', '0.000116', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_16384

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_16384', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002923', '0.002551', '0.002523', '0.002523 0.002551 0.003697', '0.000670', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_131072

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_131072', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002449', '0.002424', '0.002306', '0.002306 0.002424 0.002617', '0.000157', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_1024

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_1024', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.001319', '0.001310', '0.001308', '0.001308 0.001310 0.001337', '0.000016', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_2048

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_2048', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.000735', '0.000746', '0.000685', '0.000685 0.000746 0.000775', '0.000046', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_524288

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_524288', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002712', '0.002748', '0.002529', '0.002529 0.002748 0.002858', '0.000167', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_524288

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_524288', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002655', '0.002573', '0.002562', '0.002562 0.002573 0.002831', '0.000152', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_1024

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_1024', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.003568', '0.001525', '0.001289', '0.001289 0.001525 0.007889', '0.003744', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_32768

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_32768', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002386', '0.002450', '0.002212', '0.002212 0.002450 0.002496', '0.000153', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_131072

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_131072', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002720', '0.002469', '0.002258', '0.002258 0.002469 0.003434', '0.000627', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_IndependentDAGTaskThroughput_HierarchicalParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=32768

Output:

['Runtime_IndependentDAGTaskThroughput_HierarchicalParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.005567', '0.005588', '0.005516', '0.005516 0.005588 0.005596', '0.000044', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_IndependentDAGTaskThroughput_BasicParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=32768

Output:

['Runtime_IndependentDAGTaskThroughput_BasicParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.005789', '0.005802', '0.005761', '0.005761 0.005802 0.005805', '0.000024', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_IndependentDAGTaskThroughput_SingleTask

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=32768

Output:

['Runtime_IndependentDAGTaskThroughput_SingleTask', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.007185', '0.006574', '0.005956', '0.005956 0.006574 0.009027', '0.001625', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_IndependentDAGTaskThroughput_NDRangeParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=32768

Output:

['Runtime_IndependentDAGTaskThroughput_NDRangeParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.005608', '0.005603', '0.005570', '0.005570 0.005603 0.005650', '0.000040', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_DAGTaskThroughput_NDRangeParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/DAGTaskThroughput_multi.csv --size=327680

Output:

['Runtime_DAGTaskThroughput_NDRangeParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.004917', '0.004927', '0.004804', '0.004804 0.004927 0.005020', '0.000108', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_DAGTaskThroughput_SingleTask

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/DAGTaskThroughput_multi.csv --size=327680

Output:

['Runtime_DAGTaskThroughput_SingleTask', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.006828', '0.006548', '0.006255', '0.006255 0.006548 0.007682', '0.000754', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_DAGTaskThroughput_HierarchicalParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/DAGTaskThroughput_multi.csv --size=327680

Output:

['Runtime_DAGTaskThroughput_HierarchicalParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.005281', '0.005331', '0.005095', '0.005095 0.005331 0.005416', '0.000166', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_DAGTaskThroughput_BasicParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/DAGTaskThroughput_multi.csv --size=327680

Output:

['Runtime_DAGTaskThroughput_BasicParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.006040', '0.006071', '0.005787', '0.005787 0.006071 0.006261', '0.000239', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_LocalMem_fp32_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/local_mem --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/LocalMem_multi.csv --size=512

Output:

['MicroBench_LocalMem_fp32_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.000213', '0.000200', '0.000198', '0.000198 0.000200 0.000243', '0.000025', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000000']

MicroBench_LocalMem_int32_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/local_mem --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/LocalMem_multi.csv --size=512

Output:

['MicroBench_LocalMem_int32_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.000244', '0.000229', '0.000210', '0.000210 0.000229 0.000292', '0.000043', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000000']

MicroBench_L2_fp32_16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_fp32_16', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000032', '0.000025', '0.000024', '0.000024 0.000025 0.000047', '0.000013', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_int32_2

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_int32_2', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000039', '0.000027', '0.000027', '0.000027 0.000027 0.000063', '0.000021', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_int32_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_int32_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000051', '0.000033', '0.000026', '0.000026 0.000033 0.000093', '0.000036', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_int32_16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_int32_16', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000033', '0.000026', '0.000024', '0.000024 0.000026 0.000048', '0.000014', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_fp32_2

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_fp32_2', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000033', '0.000026', '0.000025', '0.000025 0.000026 0.000046', '0.000012', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_fp32_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_fp32_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000030', '0.000025', '0.000025', '0.000025 0.000025 0.000040', '0.000009', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_fp32_4

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_fp32_4', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000030', '0.000026', '0.000024', '0.000024 0.000026 0.000042', '0.000010', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_fp32_8

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_fp32_8', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000029', '0.000025', '0.000024', '0.000024 0.000025 0.000039', '0.000008', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_int32_8

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_int32_8', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000031', '0.000026', '0.000025', '0.000025 0.000026 0.000041', '0.000009', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_int32_4

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_int32_4', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000031', '0.000026', '0.000025', '0.000025 0.000026 0.000043', '0.000010', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_NDRange_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_NDRange_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000061', '0.000052', '0.000048', '0.000048 0.000052 0.000083', '0.000019', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_NDRange_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_NDRange_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000090', '0.000076', '0.000058', '0.000058 0.000076 0.000136', '0.000041', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_NDRange_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_NDRange_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000035', '0.000026', '0.000023', '0.000023 0.000026 0.000056', '0.000018', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_Hierarchical_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_Hierarchical_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000058', '0.000051', '0.000050', '0.000050 0.000051 0.000074', '0.000013', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_Hierarchical_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_Hierarchical_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000059', '0.000050', '0.000050', '0.000050 0.000050 0.000076', '0.000015', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_Hierarchical_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_Hierarchical_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000063', '0.000052', '0.000052', '0.000052 0.000052 0.000087', '0.000020', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_NDRange_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_NDRange_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000124', '0.000100', '0.000089', '0.000089 0.000100 0.000184', '0.000052', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_Hierarchical_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_Hierarchical_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000094', '0.000059', '0.000057', '0.000057 0.000059 0.000164', '0.000061', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_Hierarchical_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_Hierarchical_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000082', '0.000063', '0.000061', '0.000061 0.000063 0.000123', '0.000035', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_Hierarchical_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_Hierarchical_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000083', '0.000062', '0.000060', '0.000060 0.000062 0.000126', '0.000038', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_NDRange_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_NDRange_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000211', '0.000150', '0.000125', '0.000125 0.000150 0.000357', '0.000128', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_NDRange_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_NDRange_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000057', '0.000040', '0.000038', '0.000038 0.000040 0.000094', '0.000032', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_Hierarchical_int16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_Hierarchical_int16', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000032', '0.000030', '0.000029', '0.000029 0.000030 0.000037', '0.000004', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_Hierarchical_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_Hierarchical_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000031', '0.000029', '0.000028', '0.000028 0.000029 0.000035', '0.000004', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_NDRange_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_NDRange_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000018', '0.000014', '0.000013', '0.000013 0.000014 0.000026', '0.000007', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_Hierarchical_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_Hierarchical_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000033', '0.000030', '0.000029', '0.000029 0.000030 0.000039', '0.000005', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_NDRange_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_NDRange_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000023', '0.000016', '0.000015', '0.000015 0.000016 0.000040', '0.000014', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_NDRange_int16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_NDRange_int16', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000073', '0.000045', '0.000031', '0.000031 0.000045 0.000144', '0.000062', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_Hierarchical_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_Hierarchical_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000030', '0.000028', '0.000027', '0.000027 0.000028 0.000034', '0.000003', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_NDRange_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_NDRange_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000036', '0.000027', '0.000025', '0.000025 0.000027 0.000056', '0.000017', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

SYCL2020_Accessors_Latency_fp32_out_of_order__

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_accessors_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Latency_multi.csv

Output:

['SYCL2020_Accessors_Latency_fp32_out_of_order__', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.082756', '0.070855', '0.070644', '0.070644 0.070855 0.106770', '0.020797', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Latency_fp32_out_of_order__

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_accessors_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Latency_multi.csv

Output:

['USM_Latency_fp32_out_of_order__', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.045856', '0.046829', '0.043549', '0.043549 0.046829 0.047190', '0.002006', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

SYCL2020_Accessors_Latency_fp32_in_order__

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_accessors_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Latency_multi.csv

Output:

['SYCL2020_Accessors_Latency_fp32_in_order__', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.070019', '0.068472', '0.068085', '0.068085 0.068472 0.073499', '0.003020', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Latency_fp32_in_order__

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_accessors_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Latency_multi.csv

Output:

['USM_Latency_fp32_in_order__', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.032507', '0.033718', '0.029613', '0.029613 0.033718 0.034190', '0.002517', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Allocation_latency_fp32_shared

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_allocation_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Allocation_latency_multi.csv

Output:

['USM_Allocation_latency_fp32_shared', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000147', '0.000118', '0.000087', '0.000087 0.000118 0.000235', '0.000078', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Allocation_latency_fp32_host

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_allocation_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Allocation_latency_multi.csv

Output:

['USM_Allocation_latency_fp32_host', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000003', '0.000002', '0.000002', '0.000002 0.000002 0.000004', '0.000001', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Allocation_latency_fp32_device

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_allocation_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Allocation_latency_multi.csv

Output:

['USM_Allocation_latency_fp32_device', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000039', '0.000008', '0.000002', '0.000002 0.000008 0.000108', '0.000060', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_device_1:1mix_no_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_device_1:1mix_no_init_no_prefetch', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.003148', '0.003111', '0.003108', '0.003108 0.003111 0.003226', '0.000068', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_shared_1:1mix_no_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_shared_1:1mix_no_init_no_prefetch', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.013752', '0.013752', '0.013752', '0.013752', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_host_1:1mix_with_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_host_1:1mix_with_init_no_prefetch', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.001795', '0.001793', '0.001766', '0.001766 0.001793 0.001826', '0.000030', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_shared_1:1mix_with_init_with_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_shared_1:1mix_with_init_with_prefetch', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.015334', '0.015334', '0.015334', '0.015334', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_shared_1:1mix_with_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_shared_1:1mix_with_init_no_prefetch', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.014174', '0.014174', '0.014174', '0.014174', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_shared_1:1mix_no_init_with_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_shared_1:1mix_no_init_with_prefetch', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.015483', '0.015483', '0.015483', '0.015483', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_host_1:1mix_no_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_host_1:1mix_no_init_no_prefetch', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.001890', '0.001877', '0.001869', '0.001869 0.001877 0.001924', '0.000030', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_device_1:1mix_with_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_device_1:1mix_with_init_no_prefetch', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.004768', '0.003278', '0.003229', '0.003229 0.003278 0.007798', '0.002624', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Pinned_Overhead_fp32_DeviceHost_NonPinned_Init_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_pinned_overhead --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Pinned_Overhead_multi.csv

Output:

['USM_Pinned_Overhead_fp32_DeviceHost_NonPinned_Init_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000486', '0.000438', '0.000162', '0.000162 0.000438 0.000858', '0.000351', '0.070616', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000011']

USM_Pinned_Overhead_fp32_HostDevice_Pinned_Init_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_pinned_overhead --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Pinned_Overhead_multi.csv

Output:

['USM_Pinned_Overhead_fp32_HostDevice_Pinned_Init_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000046', '0.000011', '0.000009', '0.000009 0.000011 0.000118', '0.000062', '1.222137', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000011']

USM_Pinned_Overhead_fp32_HostDevice_NonPinned_Init_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_pinned_overhead --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Pinned_Overhead_multi.csv

Output:

['USM_Pinned_Overhead_fp32_HostDevice_NonPinned_Init_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000109', '0.000015', '0.000009', '0.000009 0.000015 0.000303', '0.000168', '1.273830', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000011']

USM_Pinned_Overhead_fp32_DeviceHost_Pinned_Init_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_pinned_overhead --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Pinned_Overhead_multi.csv

Output:

['USM_Pinned_Overhead_fp32_DeviceHost_Pinned_Init_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000027', '0.000019', '0.000018', '0.000018 0.000019 0.000045', '0.000015', '0.644847', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000011']

VectorAddition_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/vec_add --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/VectorAddition_multi.csv

Output:

['VectorAddition_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000037', '0.000033', '0.000030', '0.000030 0.000033 0.000050', '0.000011', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

VectorAddition_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/vec_add --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/VectorAddition_multi.csv

Output:

['VectorAddition_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000051', '0.000043', '0.000040', '0.000040 0.000043 0.000069', '0.000016', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

VectorAddition_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/vec_add --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/VectorAddition_multi.csv

Output:

['VectorAddition_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000109', '0.000038', '0.000030', '0.000030 0.000038 0.000259', '0.000130', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_2DConvolution

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/2DConvolution --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/2DConvolution.csv

Output:

['Polybench_2DConvolution', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000235', '0.000229', '0.000213', '0.000213 0.000229 0.000262', '0.000025', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_2mm

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/2mm --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/2mm.csv --size=512

Output:

['Polybench_2mm', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.001250', '0.001239', '0.001235', '0.001235 0.001239 0.001276', '0.000022', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_3mm

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/3mm --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/3mm.csv --size=512

Output:

['Polybench_3mm', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.001748', '0.001747', '0.001737', '0.001737 0.001747 0.001759', '0.000011', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_Arith_fp32_512

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/arith --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Arith_int32_512.csv --size=16384

Output:

['MicroBench_Arith_fp32_512', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '16384', '0.000037', '0.000032', '0.000029', '0.000029 0.000032 0.000051', '0.000011', '1067.646054', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.031250']

MicroBench_Arith_int32_512

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/arith --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Arith_int32_512.csv --size=16384

Output:

['MicroBench_Arith_int32_512', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '16384', '0.000151', '0.000073', '0.000060', '0.000060 0.000073 0.000321', '0.000147', '521.485190', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.031250']

Polybench_Atax

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atax --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Atax.csv --size=8192

Output:

['Polybench_Atax', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '8192', '0.006896', '0.006904', '0.006877', '0.006877 0.006904 0.006907', '0.000017', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ReductionAtomic_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atomic_reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ReductionAtomic_fp64.csv

Output:

['ReductionAtomic_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000047', '0.000040', '0.000039', '0.000039 0.000040 0.000062', '0.000013', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ReductionAtomic_fp64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atomic_reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ReductionAtomic_fp64.csv

Output:

['ReductionAtomic_fp64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000258', '0.000043', '0.000040', '0.000040 0.000043 0.000691', '0.000375', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ReductionAtomic_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atomic_reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ReductionAtomic_fp64.csv

Output:

['ReductionAtomic_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000078', '0.000042', '0.000037', '0.000037 0.000042 0.000155', '0.000067', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ReductionAtomic_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atomic_reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ReductionAtomic_fp64.csv

Output:

['ReductionAtomic_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000274', '0.000041', '0.000035', '0.000035 0.000041 0.000748', '0.000410', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Bicg

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/bicg --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Bicg.csv --size=20480

Output:

['Polybench_Bicg', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '8192', '0.005127', '0.005127', '0.005127', '0.005127', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Correlation

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/correlation --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Correlation.csv --size=2048

Output:

['Polybench_Correlation', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.094979', '0.094979', '0.094979', '0.094979', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Covariance

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/covariance --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Covariance.csv --size=2048

Output:

['Polybench_Covariance', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.094478', '0.094478', '0.094478', '0.094478', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Gesummv

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/gesummv --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Gesummv.csv --size=8192

Output:

['Polybench_Gesummv', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '8192', '0.007317', '0.007317', '0.007317', '0.007317', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Gramschmidt

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/gramschmidt --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Gramschmidt.csv --size=512

Output:

['Polybench_Gramschmidt', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.285066', '0.285066', '0.285066', '0.285066', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Kmeans_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/kmeans --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Kmeans.csv --size=700000000

Output:

['Kmeans_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '67108864', '0.001805', '0.001794', '0.001785', '0.001785 0.001794 0.001837', '0.000028', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

LinearRegressionCoeff_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/lin_reg_coeff --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/LinearRegressionCoeff.csv --size=1638400000

Output:

['LinearRegressionCoeff_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.001526', '0.001352', '0.001075', '0.001075 0.001352 0.002152', '0.000559', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

LinearRegression_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/lin_reg_error --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/LinearRegression.csv --size=640000

Output:

['LinearRegression_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000412', '0.000361', '0.000347', '0.000347 0.000361 0.000529', '0.000101', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MatmulChain

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/matmulchain --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/MatmulChain.csv --size=2048

Output:

['MatmulChain', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '1024', '0.011038', '0.011038', '0.011038', '0.011038', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MolecularDynamics

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/mol_dyn --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/MolecularDynamics.csv --size=8196

Output:

['MolecularDynamics', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000081', '0.000066', '0.000055', '0.000055 0.000066 0.000120', '0.000035', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Mvt

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/mvt --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Mvt.csv --size=32767

Output:

['Polybench_Mvt', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '16384', '0.003642', '0.003642', '0.003642', '0.003642', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_sf_fp32_16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/sf --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/sf_16.csv --size=--size=100000000

Output:

['MicroBench_sf_fp32_16', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '0', '0.000043', '0.000025', '0.000021', '0.000021 0.000025 0.000082', '0.000034', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000000']

Polybench_Syr2k

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/syr2k --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Syr2k.csv --size=6144

Output:

['Polybench_Syr2k', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '1024', '0.006341', '0.006341', '0.006341', '0.006341', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Syrk

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/syrk --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Syrk.csv --size=4096

Output:

['Polybench_Syrk', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '1024', '0.003218', '0.003218', '0.003218', '0.003218', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Compute Benchmarks level_zero run (with params: ):
https://github.com/oneapi-src/unified-runtime/actions/runs/11055576783

Compute Benchmarks level_zero run ():
https://github.com/oneapi-src/unified-runtime/actions/runs/11055576783
Job status: success. Test status: success.

Summary

Total 106 benchmarks in mean.
Geomean 99.511%.
Improved 14 Regressed 28 (threshold 0.50%)

(result is better)
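
The improved and regressed counts in this summary are consistent with a simple rule: a benchmark counts as improved when its "Change" value is above the threshold and as regressed when it is below the negative threshold. The sketch below applies that assumed rule to the "Change" column of the api group from this comment. The function name `classify` and the "unchanged" label are illustrative and are not taken from the benchmark scripts.

```python
from collections import Counter

THRESHOLD = 0.50  # percent, as printed in the summary line

# "Change" column of the api group table in this comment, in percent.
api_changes = [4.80, 1.48, -0.31, -2.46, -2.55, -3.27]


def classify(change, threshold=THRESHOLD):
    """Assumed rule: compare the signed change against +/- threshold."""
    if change > threshold:
        return "improved"
    if change < -threshold:
        return "regressed"
    return "unchanged"


counts = Counter(classify(change) for change in api_changes)
print(counts["improved"], counts["regressed"], counts["unchanged"])
```

Applying the same rule to every row that has both a "This PR" and a "baseline" value reproduces the 14 improved and 28 regressed reported above.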

Performance change in benchmark groups

Relative perf in group api (6): 99.577%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| api_overhead_benchmark_ur SubmitKernel out of order | 29.755000 μs | 31.184 μs | 104.80% | 4.80% | +++ |
| api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024 | 4.533000 μs | 4.600 μs | 101.48% | 1.48% | + |
| api_overhead_benchmark_ur SubmitKernel in order | 28.840 μs | 28.750000 μs | 99.69% | -0.31% | . |
| api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024 | 3.659 μs | 3.569000 μs | 97.54% | -2.46% | - |
| api_overhead_benchmark_sycl SubmitKernel out of order | 48.956 μs | 47.709000 μs | 97.45% | -2.55% | - |
| api_overhead_benchmark_sycl SubmitKernel in order | 48.621 μs | 47.033000 μs | 96.73% | -3.27% | -- |

Relative perf in group memory (4): 97.855%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024 | 462.777000 μs | 468.699 μs | 101.28% | 1.28% | + |
| memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024 | 9.864 μs | 9.783000 μs | 99.18% | -0.82% | . |
| memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024 | 252.672 μs | 249.365000 μs | 98.69% | -1.31% | - |
| memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240 | 1.865 μs | 1.725000 μs | 92.49% | -7.51% | ---- |

Relative perf in group miscellaneous (1): 99.992%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| miscellaneous_benchmark_sycl VectorSum | 862.825 μs | 862.755000 μs | 99.99% | -0.01% | . |

Relative perf in group Velocity-Bench (6): 96.205%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| Velocity-Bench Sobel Filter | 993.075000 ms | 997.159 ms | 100.41% | 0.41% | . |
| Velocity-Bench Bitcracker | 35.489 s | 35.469500 s | 99.95% | -0.05% | . |
| Velocity-Bench QuickSilver | 89.470 MMS/CTT | 89.680000 MMS/CTT | 99.77% | -0.23% | . |
| Velocity-Bench Hashtable | 203.070 M keys/sec | 204.657810 M keys/sec | 99.22% | -0.78% | . |
| Velocity-Bench CudaSift | 288.647 ms | 239.448000 ms | 82.96% | -17.04% | ---------- |
| Velocity-Bench Easywave | - | 453.000000 ms | | | |

Relative perf in group Runtime (52): 99.310%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| Runtime_BlockedTransform_iter_128_blocksize_256 | 0.156000 ms | 0.161 ms | 103.21% | 3.21% | ++ |
| Runtime_BlockedTransform_iter_512_blocksize_256 | 0.079000 ms | 0.081 ms | 102.53% | 2.53% | + |
| Runtime_BlockedTransform_iter_256_blocksize_256 | 0.084000 ms | 0.085 ms | 101.19% | 1.19% | + |
| Runtime_BlockedTransform_iter_64_blocksize_256 | 0.341000 ms | 0.341 ms | 100.00% | 0.00% | . |
| Runtime_IndependentDAGTaskThroughput_NDRangeParallelFor | 5.607 ms | 5.588000 ms | 99.66% | -0.34% | . |
| Runtime_IndependentDAGTaskThroughput_HierarchicalParallelFor | 5.596 ms | 5.576000 ms | 99.64% | -0.36% | . |
| Runtime_IndependentDAGTaskThroughput_BasicParallelFor | 5.814 ms | 5.792000 ms | 99.62% | -0.38% | . |
| Runtime_DAGTaskThroughput_NDRangeParallelFor | 4.965 ms | 4.856000 ms | 97.80% | -2.20% | - |
| Runtime_DAGTaskThroughput_HierarchicalParallelFor | 5.375 ms | 5.250000 ms | 97.67% | -2.33% | - |
| Runtime_DAGTaskThroughput_BasicParallelFor | 6.114 ms | 5.963000 ms | 97.53% | -2.47% | - |
| Runtime_DAGTaskThroughput_SingleTask | 6.674 ms | 6.459000 ms | 96.78% | -3.22% | -- |
| Runtime_IndependentDAGTaskThroughput_SingleTask | 6.712 ms | 6.467000 ms | 96.35% | -3.65% | -- |
| Runtime_BlockedTransform_iter_512_blocksize_4096 | 0.370000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_65536 | 2.543000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_1024 | 0.761000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_8192 | 0.517000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_16384 | 2.551000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_65536 | 2.591000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_1024 | 0.516000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_4096 | 0.308000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_16384 | 2.686000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_32768 | 2.421000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_262144 | 2.519000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_131072 | 2.469000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_65536 | 2.608000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_4096 | 0.313000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_65536 | 2.513000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_8192 | 0.514000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_8192 | 0.496000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_16384 | 2.241000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_2048 | 0.406000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_2048 | 0.350000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_2048 | 0.410000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_524288 | 2.578000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_16384 | 2.288000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_1024 | 0.563000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_131072 | 2.519000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_32768 | 2.421000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_262144 | 2.469000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_32768 | 2.364000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_131072 | 2.424000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_524288 | 2.748000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_262144 | 2.367000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_32768 | 2.450000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_131072 | 2.559000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_262144 | 2.580000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_1024 | 0.518000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_2048 | 0.354000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_4096 | 0.367000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_8192 | 0.456000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_524288 | 2.492000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_524288 | 2.573000 ms | - | | | |

Relative perf in group MicroBench (15): 100.262%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| MicroBench_sf_fp32_16 | 0.025000 ms | 0.026 ms | 104.00% | 4.00% | ++ |
| MicroBench_LocalMem_fp32_4096 | 0.200000 ms | 0.200 ms | 100.00% | 0.00% | . |
| MicroBench_LocalMem_int32_4096 | 0.229000 ms | 0.229 ms | 100.00% | 0.00% | . |
| MicroBench_L2_int32_2 | 0.027000 ms | 0.027 ms | 100.00% | 0.00% | . |
| MicroBench_L2_fp32_1 | 0.025000 ms | 0.025 ms | 100.00% | 0.00% | . |
| MicroBench_L2_fp32_4 | 0.026000 ms | 0.026 ms | 100.00% | 0.00% | . |
| MicroBench_L2_int32_4 | 0.026000 ms | 0.026 ms | 100.00% | 0.00% | . |
| MicroBench_L2_int32_1 | 0.033000 ms | 0.033 ms | 100.00% | 0.00% | . |
| MicroBench_L2_fp32_8 | 0.025000 ms | 0.025 ms | 100.00% | 0.00% | . |
| MicroBench_L2_fp32_16 | 0.025000 ms | 0.025 ms | 100.00% | 0.00% | . |
| MicroBench_L2_fp32_2 | 0.026000 ms | 0.026 ms | 100.00% | 0.00% | . |
| MicroBench_L2_int32_16 | 0.026000 ms | 0.026 ms | 100.00% | 0.00% | . |
| MicroBench_L2_int32_8 | 0.026000 ms | 0.026 ms | 100.00% | 0.00% | . |
| MicroBench_Arith_int32_512 | 0.073000 ms | 0.073 ms | 100.00% | 0.00% | . |
| MicroBench_Arith_fp32_512 | 0.032000 ms | 0.032 ms | 100.00% | 0.00% | . |

Relative perf in group Pattern (14): 100.169%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| Pattern_Reduction_NDRange_fp32 | 0.025000 ms | 0.026 ms | 104.00% | 4.00% | ++ |
| Pattern_Reduction_Hierarchical_int64 | 0.050000 ms | 0.051 ms | 102.00% | 2.00% | + |
| Pattern_Reduction_Hierarchical_fp32 | 0.051000 ms | 0.052 ms | 101.96% | 1.96% | + |
| Pattern_Reduction_Hierarchical_int32 | 0.052000 ms | 0.052 ms | 100.00% | 0.00% | . |
| Pattern_SegmentedReduction_NDRange_int64 | 0.016000 ms | 0.016 ms | 100.00% | 0.00% | . |
| Pattern_SegmentedReduction_Hierarchical_int16 | 0.030000 ms | 0.030 ms | 100.00% | 0.00% | . |
| Pattern_SegmentedReduction_NDRange_int32 | 0.027000 ms | 0.027 ms | 100.00% | 0.00% | . |
| Pattern_SegmentedReduction_Hierarchical_int64 | 0.029000 ms | 0.029 ms | 100.00% | 0.00% | . |
| Pattern_SegmentedReduction_NDRange_fp32 | 0.014000 ms | 0.014 ms | 100.00% | 0.00% | . |
| Pattern_SegmentedReduction_Hierarchical_int32 | 0.028000 ms | 0.028 ms | 100.00% | 0.00% | . |
| Pattern_SegmentedReduction_Hierarchical_fp32 | 0.030000 ms | 0.030 ms | 100.00% | 0.00% | . |
| Pattern_Reduction_NDRange_int32 | 0.076 ms | 0.075000 ms | 98.68% | -1.32% | - |
| Pattern_Reduction_NDRange_int64 | 0.053 ms | 0.052000 ms | 98.11% | -1.89% | - |
| Pattern_SegmentedReduction_NDRange_int16 | 0.045 ms | 0.044000 ms | 97.78% | -2.22% | - |

Relative perf in group ScalarProduct (6): 99.513%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| ScalarProduct_NDRange_int32 | 0.150000 ms | 0.151 ms | 100.67% | 0.67% | . |
| ScalarProduct_NDRange_fp32 | 0.040000 ms | 0.040 ms | 100.00% | 0.00% | . |
| ScalarProduct_Hierarchical_fp32 | 0.059000 ms | 0.059 ms | 100.00% | 0.00% | . |
| ScalarProduct_Hierarchical_int32 | 0.062000 ms | 0.062 ms | 100.00% | 0.00% | . |
| ScalarProduct_Hierarchical_int64 | 0.064 ms | 0.063000 ms | 98.44% | -1.56% | - |
| ScalarProduct_NDRange_int64 | 0.100 ms | 0.098000 ms | 98.00% | -2.00% | - |

Relative perf in group SYCL2020 (2): 100.102%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| SYCL2020_Accessors_Latency_fp32_in_order__ | 68.472000 ms | 68.626 ms | 100.22% | 0.22% | . |
| SYCL2020_Accessors_Latency_fp32_out_of_order__ | 70.855 ms | 70.840000 ms | 99.98% | -0.02% | . |

Relative perf in group USM (17): 99.952%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| USM_Pinned_Overhead_fp32_DeviceHost_NonPinned_Init_1 | 0.445000 ms | 0.460 ms | 103.37% | 3.37% | ++ |
| USM_Allocation_latency_fp32_shared | 0.118000 ms | 0.120 ms | 101.69% | 1.69% | + |
| USM_Allocation_latency_fp32_device | 0.008000 ms | 0.008 ms | 100.00% | 0.00% | . |
| USM_Allocation_latency_fp32_host | 0.002000 ms | 0.002 ms | 100.00% | 0.00% | . |
| USM_Pinned_Overhead_fp32_HostDevice_NonPinned_Init_1 | 0.015000 ms | 0.015 ms | 100.00% | 0.00% | . |
| USM_Pinned_Overhead_fp32_HostDevice_Pinned_Init_1 | 0.011000 ms | 0.011 ms | 100.00% | 0.00% | . |
| USM_Pinned_Overhead_fp32_DeviceHost_Pinned_Init_1 | 0.019000 ms | 0.019 ms | 100.00% | 0.00% | . |
| USM_Latency_fp32_in_order__ | 33.724 ms | 33.718000 ms | 99.98% | -0.02% | . |
| USM_Instr_Mix_fp32_shared_1:1mix_with_init_with_prefetch | 15.341 ms | 15.307000 ms | 99.78% | -0.22% | . |
| USM_Instr_Mix_fp32_shared_1:1mix_with_init_no_prefetch | 14.177 ms | 14.144000 ms | 99.77% | -0.23% | . |
| USM_Latency_fp32_out_of_order__ | 46.829 ms | 46.684000 ms | 99.69% | -0.31% | . |
| USM_Instr_Mix_fp32_shared_1:1mix_no_init_no_prefetch | 13.774 ms | 13.720000 ms | 99.61% | -0.39% | . |
| USM_Instr_Mix_fp32_shared_1:1mix_no_init_with_prefetch | 15.494 ms | 15.415000 ms | 99.49% | -0.51% | . |
| USM_Instr_Mix_fp32_host_1:1mix_no_init_no_prefetch | 1.882 ms | 1.868000 ms | 99.26% | -0.74% | . |
| USM_Instr_Mix_fp32_host_1:1mix_with_init_no_prefetch | 1.793 ms | 1.778000 ms | 99.16% | -0.84% | . |
| USM_Instr_Mix_fp32_device_1:1mix_no_init_no_prefetch | 3.117 ms | 3.090000 ms | 99.13% | -0.87% | - |
| USM_Instr_Mix_fp32_device_1:1mix_with_init_no_prefetch | 3.279 ms | 3.225000 ms | 98.35% | -1.65% | - |

Relative perf in group VectorAddition (3): 99.115%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| VectorAddition_fp32 | 0.033000 ms | 0.033 ms | 100.00% | 0.00% | . |
| VectorAddition_int64 | 0.043000 ms | 0.043 ms | 100.00% | 0.00% | . |
| VectorAddition_int32 | 0.038 ms | 0.037000 ms | 97.37% | -2.63% | -- |

Relative perf in group Polybench (13): 99.751%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| Polybench_Gramschmidt | 285.063000 ms | 285.080 ms | 100.01% | 0.01% | . |
| Polybench_2DConvolution | 0.229000 ms | 0.229 ms | 100.00% | 0.00% | . |
| Polybench_2mm | 1.239000 ms | 1.239 ms | 100.00% | 0.00% | . |
| Polybench_Atax | 6.903000 ms | 6.903 ms | 100.00% | 0.00% | . |
| Polybench_3mm | 1.746 ms | 1.745000 ms | 99.94% | -0.06% | . |
| Polybench_Bicg | 5.129 ms | 5.126000 ms | 99.94% | -0.06% | . |
| Polybench_Gesummv | 7.317 ms | 7.303000 ms | 99.81% | -0.19% | . |
| Polybench_Covariance | 94.518 ms | 94.253000 ms | 99.72% | -0.28% | . |
| Polybench_Mvt | 3.645 ms | 3.633000 ms | 99.67% | -0.33% | . |
| Polybench_Syrk | 3.221 ms | 3.206000 ms | 99.53% | -0.47% | . |
| Polybench_Syr2k | 6.351 ms | 6.302000 ms | 99.23% | -0.77% | . |
| Polybench_Correlation | 95.115 ms | 94.324000 ms | 99.17% | -0.83% | . |
| Polybench_Gemm | - | 3.965000 ms | | | |

Relative perf in group ReductionAtomic (4): 100.619%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| ReductionAtomic_fp32 | 0.040000 ms | 0.041 ms | 102.50% | 2.50% | + |
| ReductionAtomic_int32 | 0.042000 ms | 0.042 ms | 100.00% | 0.00% | . |
| ReductionAtomic_int64 | 0.041000 ms | 0.041 ms | 100.00% | 0.00% | . |
| ReductionAtomic_fp64 | 0.043000 ms | 0.043 ms | 100.00% | 0.00% | . |

Relative perf in group Kmeans (1): 99.889%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| Kmeans_fp32 | 1.794 ms | 1.792000 ms | 99.89% | -0.11% | . |

Relative perf in group LinearRegressionCoeff (1): 89.501%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| LinearRegressionCoeff_fp32 | 1.362 ms | 1.219000 ms | 89.50% | -10.50% | ------ |

Relative perf in group LinearRegression (1): 98.619%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| LinearRegression_fp32 | 0.362 ms | 0.357000 ms | 98.62% | -1.38% | - |

Relative perf in group MatmulChain (1): 99.909%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| MatmulChain | 11.039 ms | 11.029000 ms | 99.91% | -0.09% | . |

Relative perf in group MolecularDynamics (1): 100.000%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| MolecularDynamics | 0.066000 ms | 0.066 ms | 100.00% | 0.00% | . |
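
For readers who want to check the group figures, the numbers above are consistent with "Relative perf" being baseline divided by this PR for time-like metrics (lower is better), and with the per-group value being the geometric mean of those ratios. The sketch below reproduces that arithmetic for the api group using the values from this comment. The helper names `relative_perf` and `geomean` are illustrative and are not taken from the benchmark scripts.

```python
import math

# (this PR, baseline) mean times in microseconds, copied from the api group.
api_group = {
    "ur SubmitKernel out of order": (29.755, 31.184),
    "sycl ExecImmediateCopyQueue out of order": (4.533, 4.600),
    "ur SubmitKernel in order": (28.840, 28.750),
    "sycl ExecImmediateCopyQueue in order": (3.659, 3.569),
    "sycl SubmitKernel out of order": (48.956, 47.709),
    "sycl SubmitKernel in order": (48.621, 47.033),
}


def relative_perf(pr, baseline, lower_is_better=True):
    """Ratio above 1.0 means this PR is better (assumed convention)."""
    return baseline / pr if lower_is_better else pr / baseline


def geomean(values):
    """Geometric mean computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


ratios = [relative_perf(pr, base) for pr, base in api_group.values()]
for name, ratio in zip(api_group, ratios):
    print(f"{name}: {ratio * 100:.2f}%")
print(f"group geomean: {geomean(ratios) * 100:.3f}%")
```

For throughput-style metrics such as the Hashtable row (M keys/sec), the reported ratio matches the inverted form, this PR divided by baseline, which is what `lower_is_better=False` models here.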

Details

Benchmark details - environment, command, output...
api_overhead_benchmark_sycl SubmitKernel out of order

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),48.956,48.355,7.80%,46.135,534.178,[CPU],[us]
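
The compute-benchmarks output shown in these blocks is plain CSV, with one detail worth noting: the data row carries one more field than the header, because the unit (`[us]` here) follows the `Type` field. A minimal sketch of reading such a result with Python's standard `csv` module might look like the following. The function name `parse_result` and the `Unit` key are assumptions made for illustration, not part of the benchmark scripts.

```python
import csv
import io

# Header and result row copied from the output above.
raw = (
    "TestCase,Mean,Median,StdDev,Min,Max,Type\n"
    "SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 "
    "KernelExecTime=1 MeasureCompletion=0),48.956,48.355,7.80%,46.135,"
    "534.178,[CPU],[us]\n"
)


def parse_result(text):
    """Map the first data row onto the header; keep the extra unit field."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1]
    record = dict(zip(header, data))
    # The trailing field has no header name; store it under a made-up key.
    record["Unit"] = data[len(header)].strip("[]") if len(data) > len(header) else None
    for key in ("Mean", "Median", "Min", "Max"):
        record[key] = float(record[key])
    return record


result = parse_result(raw)
print(result["Mean"], result["Unit"])
```

The `Mean` field parsed this way is the value that appears in the "This PR" column of the summary for this benchmark.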

api_overhead_benchmark_sycl SubmitKernel in order

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),48.621,48.124,7.19%,46.114,508.167,[CPU],[us]

api_overhead_benchmark_ur SubmitKernel out of order

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=ur Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),29.755,31.871,21.02%,13.869,622.785,[CPU],[us]

api_overhead_benchmark_ur SubmitKernel in order

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=ur Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),28.840,28.546,8.26%,26.930,461.689,[CPU],[us]

memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Device --destinationPlacement=Device --size=1024 --count=100

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Device destinationPlacement=Device size=1KB count=100),462.777,456.755,6.43%,446.085,927.806,[CPU],[us]

memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Host --destinationPlacement=Device --size=1024 --count=100

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Host destinationPlacement=Device size=1KB count=100),252.672,225.678,25.02%,219.218,701.075,[CPU],[us]

memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueMemcpy --csv --noHeaders --iterations=10000 --sourcePlacement=Device --destinationPlacement=Device --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueMemcpy(api=sycl sourcePlacement=Device destinationPlacement=Device size=1KB),9.864,9.697,18.65%,7.555,119.044,[CPU],[us]

memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=StreamMemory --csv --noHeaders --iterations=10000 --type=Triad --size=10240 --memoryPlacement=Device --useEvents=0 --contents=Zeros

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
StreamMemory(api=sycl type=Triad size=10KB useEvents=0 contents=Zeros memoryPlacement=Device),1.865,1.900,7.26%,0.262,2.018,[CPU],[GB/s]

api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=0 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Device --dst=Device --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Device dst=Device size=1KB ioq=0),4.533,4.467,9.33%,4.022,78.148,[CPU],[us]

api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=1 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Host --dst=Host --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Host dst=Host size=1KB ioq=1),3.659,3.515,16.94%,3.306,115.251,[CPU],[us]

miscellaneous_benchmark_sycl VectorSum

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/miscellaneous_benchmark_sycl --test=VectorSum --csv --noHeaders --iterations=1000 --numberOfElementsX=512 --numberOfElementsY=256 --numberOfElementsZ=256

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
VectorSum(api=sycl numberOfElementsX=512 numberOfElementsY=256 numberOfElementsZ=256),862.825,863.026,0.55%,825.109,886.433,[GPU],bw [GB/s]

Velocity-Bench Hashtable

Environment Variables:

Command:

/home/test-user/bench_workdir/hashtable/hashtable_sycl --no-verify

Output:

hashtable - total time for whole calculation: 0.660944 s
203.069688 million keys/second
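
Velocity-Bench programs print free-form text rather than CSV, so a result has to be picked out of the output by pattern. As a rough sketch, assuming the throughput line always has the form shown above, a regular expression is enough. The function name `parse_throughput` is hypothetical.

```python
import re

# Output copied from the Hashtable run above.
output = (
    "hashtable - total time for whole calculation: 0.660944 s\n"
    "203.069688 million keys/second\n"
)


def parse_throughput(text):
    """Return the throughput in million keys per second."""
    match = re.search(r"(\d+(?:\.\d+)?) million keys/second", text)
    if match is None:
        raise ValueError("throughput line not found")
    return float(match.group(1))


throughput = parse_throughput(output)
print(f"{throughput:.3f} M keys/sec")
```

Rounded to three decimals, this gives the 203.070 M keys/sec shown for Hashtable in the summary.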

Velocity-Bench Bitcracker

Environment Variables:

Command:

/home/test-user/bench_workdir/bitcracker/bitcracker -f /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt -d /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt -b 60000

Output:

---------> BitCracker: BitLocker password cracking tool <---------

==================================
Retrieving Info

Reading hash file "/home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt"

              Attack

================================================
Type of attack: User Password
Psw per thread: 1
max_num_pswd_per_read: 60000
Dictionary: /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt
MAC Comparison (-m): Yes

Iter: 1, num passwords read: 60000
Kernel execution:
Effective passwords: 60000
Passwords Range:
npknpByH7N2m3OnLNH1X9DJxLrzIFWk
.....
dL_7uuf3QCz-c6K3xDu0

================================================
Bitcracker attack completed
Total passwords evaluated: 60000
Password not found!

time to subtract from total: 0.0147261 s
bitcracker - total time for whole calculation: 35.4889 s

Velocity-Bench CudaSift

Environment Variables:

Command:

/home/test-user/bench_workdir/cudaSift/cudaSift

Output:

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1066 1284 28.9438% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1235 1269 33.5324% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1225 1258 33.2609% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1049 1260 28.4822% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1231 1263 33.4238% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1104 1261 29.9756% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1239 1274 33.6411% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1235 1273 33.5324% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1232 1268 33.451% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1142 1273 31.0073% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1231 1264 33.4238% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1210 1253 32.8537% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1240 1276 33.6682% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1133 1271 30.763% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1228 1262 33.3424% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1081 1261 29.3511% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1231 1271 33.4238% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1227 1263 33.3152% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1225 1262 33.2609% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1233 1270 33.4781% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1230 1268 33.3967% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1222 1256 33.1795% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1122 1256 30.4643% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1216 1261 33.0166% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1218 1253 33.0709% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1228 1267 33.3424% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1084 1259 29.4325% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1226 1263 33.2881% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1093 1261 29.6769% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1231 1265 33.4238% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1223 1257 33.2066% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1129 1254 30.6544% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1227 1262 33.3152% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1097 1262 29.7855% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1101 1259 29.8941% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1237 1272 33.5868% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1232 1265 33.451% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1225 1258 33.2609% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1134 1266 30.7901% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1232 1266 33.451% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1158 1258 31.4418% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1230 1264 33.3967% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1220 1254 33.1252% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1235 1270 33.5324% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1068 1263 28.9981% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1231 1266 33.4238% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1229 1263 33.3695% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1216 1259 33.0166% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1237 1272 33.5868% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1094 1267 29.704% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Avg workload time = 288.647 ms
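
The trailing percentage on each `Number of matching features` line above appears to be the first match count divided by the first original feature count. This is inferred from the numbers, not from the benchmark's source. A minimal sketch that checks this reading against one run copied from the log (the helper names are hypothetical):

```python
import re

# Two lines copied verbatim from one run in the log above.
ORIGINAL_LINE = "Number of original features: 3683 3933"
MATCHING_LINE = "Number of matching features: 1093 1261 29.6769% 1 2"


def computed_percentage(original_line: str, matching_line: str) -> float:
    """Return first matching count / first original count, in percent.

    Assumption: the first integer after the colon on each line refers to
    the same (first) image. The log itself does not label these fields.
    """
    first_original = int(original_line.split(":")[1].split()[0])
    first_matching = int(matching_line.split(":")[1].split()[0])
    return 100.0 * first_matching / first_original


def logged_percentage(matching_line: str) -> float:
    """Extract the percentage that the benchmark printed."""
    return float(re.search(r"([\d.]+)%", matching_line).group(1))


computed = computed_percentage(ORIGINAL_LINE, MATCHING_LINE)
logged = logged_percentage(MATCHING_LINE)
print(f"computed {computed:.4f}% vs logged {logged:.4f}%")
```

If this reading holds, the run-to-run spread in the percentage (roughly 29% to 33.6% in the runs above) comes entirely from the first match count, since the original feature counts are identical in every run.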

Velocity-Bench QuickSilver

Environment Variables:

QS_DEVICE=GPU

Command:

/home/test-user/bench_workdir/QuickSilver/qs -i /home/test-user/bench_workdir/velocity-bench-repo/QuickSilver/Examples/AllScattering/scatteringOnly.inp

Output:

Copyright (c) 2016
Lawrence Livermore National Security, LLC
All Rights Reserved
Quicksilver Version :
Quicksilver Git Hash :
MPI Version : 3.0
Number of MPI ranks : 1
Number of OpenMP Threads: 1
Number of OpenMP CPUs : 1

Loading params
Finished loading params
Simulation:
dt: 1e-08
fMax: 0.1
inputFile: /home/test-user/bench_workdir/velocity-bench-repo/QuickSilver/Examples/AllScattering/scatteringOnly.inp
energySpectrum:
boundaryCondition: octant
loadBalance: 1
cycleTimers: 0
debugThreads: 0
lx: 100
ly: 100
lz: 100
nParticles: 10000000
batchSize: 0
nBatches: 10
nSteps: 10
nx: 10
ny: 10
nz: 10
seed: 1029384756
xDom: 0
yDom: 0
zDom: 0
eMax: 20
eMin: 1e-09
nGroups: 230
lowWeightCutoff: 0.001
bTally: 1
fTally: 1
cTally: 1
coralBenchmark: 0
crossSectionsOut:

Geometry:
material: sourceMaterial
shape: brick
xMax: 100
xMin: 0
yMax: 100
yMin: 0
zMax: 100
zMin: 0

Material:
name: sourceMaterial
mass: 1000
nIsotopes: 10
nReactions: 9
sourceRate: 1e+10
totalCrossSection: 0.1
absorptionCrossSection: flat
fissionCrossSection: flat
scatteringCrossSection: flat
absorptionCrossSectionRatio: 0
fissionCrossSectionRatio: 0
scatteringCrossSectionRatio: 1

CrossSection:
name: flat
A: 0
B: 0
C: 0
D: 0
E: 1
nuBar: 2.4
setting GPU
setting parameters
Building partition 0
Building partition 1
Building partition 2
Building partition 3
Building MC_Domain 0
Building MC_Domain 1
Building MC_Domain 2
Building MC_Domain 3
Starting Consistency Check
Finished Consistency Check
Finished initMesh
Started copyMaterialDatabase_device
Finished copyMaterialDatabase_device
Finished copyNuclearData_device
Finished copyDomainDevice
cycle start source rr split absorb scatter fission produce collisn escape census num_seg scalar_flux cycleInit cycleTracking cycleFinalize
0 0 1000000 0 9000000 0 18533189 0 0 18533189 1151780 8848220 55527935 1.854923e+09 7.281130e-01 8.490000e-01 1.000000e-06
1 8848220 1000000 0 151478 0 34281997 0 0 34281997 1664159 8335539 94633679 5.047651e+09 5.519500e-01 9.875140e-01 1.000000e-06
2 8335539 1000000 0 663717 0 34354432 0 0 34354432 1366771 8632485 95010375 7.705930e+09 5.306290e-01 1.000135e+00 1.000000e-06
3 8632485 1000000 0 367978 0 34302727 0 0 34302727 1242216 8758247 94953591 9.992076e+09 5.835840e-01 1.109676e+00 1.000000e-06
4 8758247 1000000 0 242076 0 34141236 0 0 34141236 1168452 8831871 94599337 1.199834e+10 5.271140e-01 1.047877e+00 0.000000e+00
5 8831871 1000000 0 168070 0 33948724 0 0 33948724 1121156 8878785 94148236 1.377636e+10 5.257160e-01 1.001125e+00 1.000000e-06
6 8878785 1000000 0 120572 0 33760567 0 0 33760567 1089103 8910254 93689264 1.535668e+10 5.321450e-01 9.989930e-01 1.000000e-06
7 8910254 1000000 0 89810 0 33552179 0 0 33552179 1065203 8934861 93216931 1.676993e+10 5.220790e-01 1.040256e+00 1.000000e-06
8 8934861 1000000 0 65491 0 33384605 0 0 33384605 1047720 8952632 92768273 1.804559e+10 5.232830e-01 1.039071e+00 0.000000e+00
9 8952632 1000000 0 47165 0 33198494 0 0 33198494 1033968 8965829 92324678 1.920208e+10 5.240070e-01 9.957410e-01 1.000000e-06

Timer                        Cumulative   Cumulative   Cumulative   Cumulative   Cumulative   Cumulative
Name                             number    microSecs    microSecs    microSecs    microSecs   Efficiency
                               of calls          min          avg          max       stddev       Rating
main                                  1    1.562e+07    1.562e+07    1.562e+07    0.000e+00       100.00
cycleInit                            10    5.549e+06    5.549e+06    5.549e+06    0.000e+00       100.00
cycleTracking                        10    1.007e+07    1.007e+07    1.007e+07    0.000e+00       100.00
cycleTracking_Kernel                104    4.958e+06    4.958e+06    4.958e+06    0.000e+00       100.00
cycleTracking_MPI                   117    2.960e+05    2.960e+05    2.960e+05    0.000e+00       100.00
cycleTracking_Test_Done               0    0.000e+00    0.000e+00    0.000e+00    0.000e+00         0.00
cycleFinalize                        20    8.170e+02    8.170e+02    8.170e+02    0.000e+00       100.00
Figure Of Merit 89.47 [Num Mega Segments / Cycle Tracking Time]
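
The Figure Of Merit line is labelled `[Num Mega Segments / Cycle Tracking Time]`. A minimal sketch of that arithmetic, using the `num_seg` and `cycleTracking` columns of the ten cycle rows above. It assumes the per-cycle `cycleTracking` values are in seconds, which is consistent with their sum matching the 1.007e+07 microSecs reported for the `cycleTracking` timer. The function name is hypothetical:

```python
# num_seg column (13th field) of the ten cycle rows in the log above.
NUM_SEG = [
    55527935, 94633679, 95010375, 94953591, 94599337,
    94148236, 93689264, 93216931, 92768273, 92324678,
]

# cycleTracking column (16th field) of the same rows.
# Assumption: these values are seconds.
CYCLE_TRACKING_S = [
    8.490000e-01, 9.875140e-01, 1.000135e+00, 1.109676e+00, 1.047877e+00,
    1.001125e+00, 9.989930e-01, 1.040256e+00, 1.039071e+00, 9.957410e-01,
]


def figure_of_merit(num_segments, tracking_seconds):
    """Total segments, in millions, divided by total tracking time in seconds."""
    mega_segments = sum(num_segments) / 1e6
    return mega_segments / sum(tracking_seconds)


fom = figure_of_merit(NUM_SEG, CYCLE_TRACKING_S)
print(f"Figure Of Merit (recomputed): {fom:.2f}")
```

Under these assumptions the recomputed value agrees with the logged 89.47 to two decimal places, which suggests the metric is driven by the `cycleTracking` time alone and excludes `cycleInit`.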

Velocity-Bench Sobel Filter

Environment Variables:

OPENCV_IO_MAX_IMAGE_PIXELS=1677721600

Command:

/home/test-user/bench_workdir/sobel_filter/sobel_filter -i /home/test-user/bench_workdir/data/sobel_filter/sobel_filter_data/silverfalls_32Kx32K.png -n 5

Output:

SYMN: Welcome to the SYCL version of Sobel filter workload.
SYMN: Input image file: /home/test-user/bench_workdir/data/sobel_filter/sobel_filter_data/silverfalls_32Kx32K.png
SYMN: Launching SYCL kernel with # of iterations: 5
time to subtract from total: 15.0561 s
sobelfilter - total time for whole calculation: 0.993075 s

Runtime_BlockedTransform_iter_512_blocksize_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000371', '0.000370', '0.000359', '0.000359 0.000370 0.000384', '0.000012', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']
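
Each sycl-bench result above is printed as a Python list literal, so it can be read back with `ast.literal_eval`. The sketch below parses the row above and recomputes the summary statistics from the space-separated samples. The field positions (11 = mean, 12 = median, 13 = min, 14 = samples, 15 = standard deviation) are inferred from the values in the log, not taken from sycl-bench documentation, and `summarize` is a hypothetical helper:

```python
import ast
import statistics

# Row copied verbatim from the log above.
ROW = (
    "['Runtime_BlockedTransform_iter_512_blocksize_4096', 'PASS', "
    "'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', "
    "'N/A', '1024', '16384', '0.000371', '0.000370', '0.000359', "
    "'0.000359 0.000370 0.000384', '0.000012', 'N/A', 'N/A', 'N/A', 'N/A', "
    "'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', "
    "'N/A', 'N/A', 'N/A']"
)


def summarize(row_text: str) -> dict:
    """Parse one result row and recompute its statistics from the samples."""
    fields = ast.literal_eval(row_text)
    # Assumption: field 14 holds the individual run times, in seconds.
    samples = [float(s) for s in fields[14].split()]
    return {
        "name": fields[0],
        "status": fields[1],
        "logged_mean": float(fields[11]),
        "logged_median": float(fields[12]),
        "logged_min": float(fields[13]),
        "logged_stddev": float(fields[15]),
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "min": min(samples),
        # Sample (n - 1) standard deviation.
        "stddev": statistics.stdev(samples),
    }


summary = summarize(ROW)
for key, value in summary.items():
    print(f"{key}: {value}")
```

With only three samples per configuration (`--num-runs=3`), a single slow run can dominate the mean, so the median is likely the more stable figure for comparisons.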

Runtime_BlockedTransform_iter_128_blocksize_65536

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_65536', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002574', '0.002543', '0.002440', '0.002440 0.002543 0.002739', '0.000152', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_1024

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_1024', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.002831', '0.000761', '0.000528', '0.000528 0.000761 0.007205', '0.003790', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_8192

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_8192', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.000526', '0.000517', '0.000446', '0.000446 0.000517 0.000616', '0.000085', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_16384

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_16384', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002923', '0.002551', '0.002523', '0.002523 0.002551 0.003697', '0.000670', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_65536

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_65536', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002824', '0.002591', '0.002512', '0.002512 0.002591 0.003370', '0.000474', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_256

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_256', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.000207', '0.000156', '0.000122', '0.000122 0.000156 0.000345', '0.000120', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_1024

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_1024', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000526', '0.000516', '0.000507', '0.000507 0.000516 0.000555', '0.000026', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000328', '0.000308', '0.000303', '0.000303 0.000308 0.000373', '0.000039', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_16384

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_16384', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002771', '0.002686', '0.002300', '0.002300 0.002686 0.003328', '0.000519', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_32768

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_32768', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002499', '0.002421', '0.002141', '0.002141 0.002421 0.002934', '0.000402', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_262144

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_262144', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002657', '0.002519', '0.002191', '0.002191 0.002519 0.003261', '0.000548', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_131072

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_131072', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002720', '0.002469', '0.002258', '0.002258 0.002469 0.003434', '0.000627', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_65536

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_65536', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002577', '0.002608', '0.002157', '0.002157 0.002608 0.002967', '0.000406', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000324', '0.000313', '0.000307', '0.000307 0.000313 0.000353', '0.000025', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_65536

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_65536', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002559', '0.002513', '0.002472', '0.002472 0.002513 0.002691', '0.000116', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_8192

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_8192', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.000518', '0.000514', '0.000513', '0.000513 0.000514 0.000527', '0.000008', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_8192

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_8192', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.000517', '0.000496', '0.000479', '0.000479 0.000496 0.000577', '0.000052', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_256

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_256', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.000095', '0.000084', '0.000077', '0.000077 0.000084 0.000123', '0.000025', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_16384

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_16384', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002506', '0.002241', '0.002218', '0.002218 0.002241 0.003058', '0.000479', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_2048

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_2048', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000417', '0.000406', '0.000404', '0.000404 0.000406 0.000441', '0.000021', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_256

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_256', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.002352', '0.000341', '0.000186', '0.000186 0.000341 0.006528', '0.003618', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_2048

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_2048', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000374', '0.000350', '0.000347', '0.000347 0.000350 0.000425', '0.000045', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_2048

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_2048', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000403', '0.000410', '0.000360', '0.000360 0.000410 0.000440', '0.000041', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_256

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_256', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.000116', '0.000079', '0.000076', '0.000076 0.000079 0.000192', '0.000066', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_524288

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_524288', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002673', '0.002578', '0.002186', '0.002186 0.002578 0.003254', '0.000540', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_16384

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_16384', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002376', '0.002288', '0.002252', '0.002252 0.002288 0.002587', '0.000184', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_1024

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_1024', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000581', '0.000563', '0.000527', '0.000527 0.000563 0.000652', '0.000065', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_131072

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_131072', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002705', '0.002519', '0.002329', '0.002329 0.002519 0.003268', '0.000496', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_32768

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_32768', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002445', '0.002421', '0.002410', '0.002410 0.002421 0.002506', '0.000052', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_262144

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_262144', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002465', '0.002469', '0.002358', '0.002358 0.002469 0.002569', '0.000105', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_32768

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_32768', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002597', '0.002364', '0.002245', '0.002245 0.002364 0.003182', '0.000510', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_131072

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_131072', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002449', '0.002424', '0.002306', '0.002306 0.002424 0.002617', '0.000157', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_524288

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_524288', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002712', '0.002748', '0.002529', '0.002529 0.002748 0.002858', '0.000167', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_262144

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_262144', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002654', '0.002367', '0.002305', '0.002305 0.002367 0.003289', '0.000551', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_32768

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_32768', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002386', '0.002450', '0.002212', '0.002212 0.002450 0.002496', '0.000153', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_131072

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_131072', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002730', '0.002559', '0.002296', '0.002296 0.002559 0.003335', '0.000540', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_262144

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_262144', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002683', '0.002580', '0.002427', '0.002427 0.002580 0.003041', '0.000320', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_1024

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_1024', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000512', '0.000518', '0.000497', '0.000497 0.000518 0.000520', '0.000013', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_2048

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_2048', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000366', '0.000354', '0.000349', '0.000349 0.000354 0.000396', '0.000026', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000354', '0.000367', '0.000328', '0.000328 0.000367 0.000369', '0.000023', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_8192

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_8192', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '65536', '0.000659', '0.000456', '0.000446', '0.000446 0.000456 0.001076', '0.000361', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_524288

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_524288', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002515', '0.002492', '0.002465', '0.002465 0.002492 0.002589', '0.000065', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_524288

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_524288', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002655', '0.002573', '0.002562', '0.002562 0.002573 0.002831', '0.000152', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_IndependentDAGTaskThroughput_HierarchicalParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=32768

Output:

['Runtime_IndependentDAGTaskThroughput_HierarchicalParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.005593', '0.005596', '0.005556', '0.005556 0.005596 0.005626', '0.000035', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_IndependentDAGTaskThroughput_NDRangeParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=32768

Output:

['Runtime_IndependentDAGTaskThroughput_NDRangeParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.005609', '0.005607', '0.005568', '0.005568 0.005607 0.005651', '0.000042', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_IndependentDAGTaskThroughput_SingleTask

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=32768

Output:

['Runtime_IndependentDAGTaskThroughput_SingleTask', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.007225', '0.006712', '0.005970', '0.005970 0.006712 0.008994', '0.001576', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_IndependentDAGTaskThroughput_BasicParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=32768

Output:

['Runtime_IndependentDAGTaskThroughput_BasicParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.005815', '0.005814', '0.005801', '0.005801 0.005814 0.005830', '0.000014', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_DAGTaskThroughput_SingleTask

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/DAGTaskThroughput_multi.csv --size=327680

Output:

['Runtime_DAGTaskThroughput_SingleTask', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.006895', '0.006674', '0.006379', '0.006379 0.006674 0.007632', '0.000655', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_DAGTaskThroughput_NDRangeParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/DAGTaskThroughput_multi.csv --size=327680

Output:

['Runtime_DAGTaskThroughput_NDRangeParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.004916', '0.004965', '0.004780', '0.004780 0.004965 0.005005', '0.000120', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_DAGTaskThroughput_BasicParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/DAGTaskThroughput_multi.csv --size=327680

Output:

['Runtime_DAGTaskThroughput_BasicParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.006074', '0.006114', '0.005818', '0.005818 0.006114 0.006291', '0.000239', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_DAGTaskThroughput_HierarchicalParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/DAGTaskThroughput_multi.csv --size=327680

Output:

['Runtime_DAGTaskThroughput_HierarchicalParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.005329', '0.005375', '0.005121', '0.005121 0.005375 0.005491', '0.000190', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_LocalMem_fp32_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/local_mem --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/LocalMem_multi.csv --size=512

Output:

['MicroBench_LocalMem_fp32_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.000213', '0.000200', '0.000198', '0.000198 0.000200 0.000243', '0.000025', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000000']

MicroBench_LocalMem_int32_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/local_mem --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/LocalMem_multi.csv --size=512

Output:

['MicroBench_LocalMem_int32_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.000243', '0.000229', '0.000210', '0.000210 0.000229 0.000291', '0.000042', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000000']

MicroBench_L2_int32_2

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_int32_2', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000034', '0.000027', '0.000026', '0.000026 0.000027 0.000049', '0.000013', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_fp32_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_fp32_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000029', '0.000025', '0.000023', '0.000023 0.000025 0.000040', '0.000009', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_fp32_4

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_fp32_4', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000030', '0.000026', '0.000024', '0.000024 0.000026 0.000040', '0.000009', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_int32_4

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_int32_4', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000031', '0.000026', '0.000025', '0.000025 0.000026 0.000041', '0.000009', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_int32_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_int32_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000052', '0.000033', '0.000026', '0.000026 0.000033 0.000097', '0.000039', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_fp32_8

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_fp32_8', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000029', '0.000025', '0.000024', '0.000024 0.000025 0.000038', '0.000008', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_fp32_16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_fp32_16', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000033', '0.000025', '0.000025', '0.000025 0.000025 0.000049', '0.000014', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_fp32_2

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_fp32_2', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000032', '0.000026', '0.000024', '0.000024 0.000026 0.000046', '0.000012', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_int32_16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_int32_16', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000033', '0.000026', '0.000025', '0.000025 0.000026 0.000048', '0.000013', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_int32_8

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_int32_8', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000031', '0.000026', '0.000025', '0.000025 0.000026 0.000040', '0.000009', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_NDRange_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_NDRange_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000093', '0.000076', '0.000060', '0.000060 0.000076 0.000144', '0.000045', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_Hierarchical_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_Hierarchical_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000113', '0.000051', '0.000050', '0.000050 0.000051 0.000238', '0.000108', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_Hierarchical_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_Hierarchical_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000094', '0.000052', '0.000051', '0.000051 0.000052 0.000179', '0.000074', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_Hierarchical_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_Hierarchical_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000184', '0.000050', '0.000045', '0.000045 0.000050 0.000456', '0.000236', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_NDRange_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_NDRange_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000061', '0.000053', '0.000047', '0.000047 0.000053 0.000081', '0.000018', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_NDRange_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_NDRange_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000036', '0.000025', '0.000022', '0.000022 0.000025 0.000062', '0.000022', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_Hierarchical_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_Hierarchical_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000080', '0.000064', '0.000061', '0.000061 0.000064 0.000115', '0.000030', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_NDRange_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_NDRange_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000054', '0.000040', '0.000038', '0.000038 0.000040 0.000086', '0.000027', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_NDRange_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_NDRange_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000125', '0.000100', '0.000090', '0.000090 0.000100 0.000185', '0.000052', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_Hierarchical_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_Hierarchical_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000077', '0.000059', '0.000057', '0.000057 0.000059 0.000114', '0.000032', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_NDRange_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_NDRange_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000201', '0.000150', '0.000125', '0.000125 0.000150 0.000328', '0.000111', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_Hierarchical_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_Hierarchical_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000083', '0.000062', '0.000060', '0.000060 0.000062 0.000126', '0.000037', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_NDRange_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_NDRange_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000023', '0.000016', '0.000015', '0.000015 0.000016 0.000038', '0.000013', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_Hierarchical_int16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_Hierarchical_int16', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000032', '0.000030', '0.000029', '0.000029 0.000030 0.000038', '0.000005', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_NDRange_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_NDRange_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000034', '0.000027', '0.000025', '0.000025 0.000027 0.000052', '0.000015', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_Hierarchical_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_Hierarchical_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000031', '0.000029', '0.000028', '0.000028 0.000029 0.000036', '0.000004', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_NDRange_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_NDRange_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000018', '0.000014', '0.000013', '0.000013 0.000014 0.000026', '0.000007', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_Hierarchical_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_Hierarchical_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000031', '0.000028', '0.000028', '0.000028 0.000028 0.000036', '0.000004', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_NDRange_int16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_NDRange_int16', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000106', '0.000045', '0.000031', '0.000031 0.000045 0.000241', '0.000117', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_Hierarchical_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_Hierarchical_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000032', '0.000030', '0.000027', '0.000027 0.000030 0.000039', '0.000006', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

SYCL2020_Accessors_Latency_fp32_out_of_order__

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_accessors_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Latency_multi.csv

Output:

['SYCL2020_Accessors_Latency_fp32_out_of_order__', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.082756', '0.070855', '0.070644', '0.070644 0.070855 0.106770', '0.020797', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Latency_fp32_in_order__

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_accessors_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Latency_multi.csv

Output:

['USM_Latency_fp32_in_order__', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.032375', '0.033724', '0.029306', '0.029306 0.033724 0.034094', '0.002664', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Latency_fp32_out_of_order__

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_accessors_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Latency_multi.csv

Output:

['USM_Latency_fp32_out_of_order__', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.045856', '0.046829', '0.043549', '0.043549 0.046829 0.047190', '0.002006', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

SYCL2020_Accessors_Latency_fp32_in_order__

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_accessors_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Latency_multi.csv

Output:

['SYCL2020_Accessors_Latency_fp32_in_order__', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.070019', '0.068472', '0.068085', '0.068085 0.068472 0.073499', '0.003020', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Allocation_latency_fp32_device

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_allocation_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Allocation_latency_multi.csv

Output:

['USM_Allocation_latency_fp32_device', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000041', '0.000008', '0.000002', '0.000002 0.000008 0.000114', '0.000063', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Allocation_latency_fp32_shared

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_allocation_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Allocation_latency_multi.csv

Output:

['USM_Allocation_latency_fp32_shared', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000147', '0.000118', '0.000087', '0.000087 0.000118 0.000235', '0.000078', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Allocation_latency_fp32_host

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_allocation_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Allocation_latency_multi.csv

Output:

['USM_Allocation_latency_fp32_host', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000002', '0.000002', '0.000001', '0.000001 0.000002 0.000003', '0.000001', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_device_1:1mix_no_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_device_1:1mix_no_init_no_prefetch', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.003148', '0.003117', '0.003050', '0.003050 0.003117 0.003277', '0.000117', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_shared_1:1mix_with_init_with_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_shared_1:1mix_with_init_with_prefetch', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.015341', '0.015341', '0.015341', '0.015341', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_device_1:1mix_with_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_device_1:1mix_with_init_no_prefetch', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.004681', '0.003279', '0.003090', '0.003090 0.003279 0.007673', '0.002593', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_host_1:1mix_no_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_host_1:1mix_no_init_no_prefetch', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.001881', '0.001882', '0.001871', '0.001871 0.001882 0.001889', '0.000009', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_shared_1:1mix_no_init_with_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_shared_1:1mix_no_init_with_prefetch', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.015494', '0.015494', '0.015494', '0.015494', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_host_1:1mix_with_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_host_1:1mix_with_init_no_prefetch', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.001795', '0.001793', '0.001766', '0.001766 0.001793 0.001826', '0.000030', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_shared_1:1mix_with_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_shared_1:1mix_with_init_no_prefetch', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.014177', '0.014177', '0.014177', '0.014177', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_shared_1:1mix_no_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_shared_1:1mix_no_init_no_prefetch', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.013774', '0.013774', '0.013774', '0.013774', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Pinned_Overhead_fp32_HostDevice_NonPinned_Init_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_pinned_overhead --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Pinned_Overhead_multi.csv

Output:

['USM_Pinned_Overhead_fp32_HostDevice_NonPinned_Init_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000109', '0.000015', '0.000009', '0.000009 0.000015 0.000303', '0.000168', '1.273830', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000011']

USM_Pinned_Overhead_fp32_HostDevice_Pinned_Init_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_pinned_overhead --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Pinned_Overhead_multi.csv

Output:

['USM_Pinned_Overhead_fp32_HostDevice_Pinned_Init_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000040', '0.000011', '0.000009', '0.000009 0.000011 0.000099', '0.000051', '1.216292', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000011']

USM_Pinned_Overhead_fp32_DeviceHost_NonPinned_Init_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_pinned_overhead --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Pinned_Overhead_multi.csv

Output:

['USM_Pinned_Overhead_fp32_DeviceHost_NonPinned_Init_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000535', '0.000445', '0.000202', '0.000202 0.000445 0.000959', '0.000387', '0.056721', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000011']

USM_Pinned_Overhead_fp32_DeviceHost_Pinned_Init_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_pinned_overhead --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Pinned_Overhead_multi.csv

Output:

['USM_Pinned_Overhead_fp32_DeviceHost_Pinned_Init_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000027', '0.000019', '0.000018', '0.000018 0.000019 0.000044', '0.000015', '0.649900', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000011']

VectorAddition_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/vec_add --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/VectorAddition_multi.csv

Output:

['VectorAddition_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000109', '0.000038', '0.000030', '0.000030 0.000038 0.000259', '0.000130', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

VectorAddition_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/vec_add --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/VectorAddition_multi.csv

Output:

['VectorAddition_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000039', '0.000033', '0.000029', '0.000029 0.000033 0.000054', '0.000013', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

VectorAddition_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/vec_add --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/VectorAddition_multi.csv

Output:

['VectorAddition_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000051', '0.000043', '0.000040', '0.000040 0.000043 0.000069', '0.000016', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_2DConvolution

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/2DConvolution --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/2DConvolution.csv

Output:

['Polybench_2DConvolution', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000236', '0.000229', '0.000227', '0.000227 0.000229 0.000253', '0.000014', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_2mm

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/2mm --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/2mm.csv --size=512

Output:

['Polybench_2mm', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.001249', '0.001239', '0.001233', '0.001233 0.001239 0.001274', '0.000022', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_3mm

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/3mm --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/3mm.csv --size=512

Output:

['Polybench_3mm', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.001747', '0.001746', '0.001739', '0.001739 0.001746 0.001756', '0.000009', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_Arith_int32_512

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/arith --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Arith_int32_512.csv --size=16384

Output:

['MicroBench_Arith_int32_512', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '16384', '0.000147', '0.000073', '0.000058', '0.000058 0.000073 0.000309', '0.000141', '538.718797', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.031250']

MicroBench_Arith_fp32_512

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/arith --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Arith_int32_512.csv --size=16384

Output:

['MicroBench_Arith_fp32_512', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '16384', '0.000037', '0.000032', '0.000029', '0.000029 0.000032 0.000049', '0.000011', '1070.462097', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.031250']

Polybench_Atax

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atax --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Atax.csv --size=8192

Output:

['Polybench_Atax', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '8192', '0.006839', '0.006903', '0.006702', '0.006702 0.006903 0.006913', '0.000119', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ReductionAtomic_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atomic_reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ReductionAtomic_fp64.csv

Output:

['ReductionAtomic_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000076', '0.000042', '0.000036', '0.000036 0.000042 0.000149', '0.000063', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ReductionAtomic_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atomic_reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ReductionAtomic_fp64.csv

Output:

['ReductionAtomic_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000047', '0.000040', '0.000039', '0.000039 0.000040 0.000062', '0.000013', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ReductionAtomic_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atomic_reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ReductionAtomic_fp64.csv

Output:

['ReductionAtomic_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000273', '0.000041', '0.000036', '0.000036 0.000041 0.000742', '0.000406', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ReductionAtomic_fp64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atomic_reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ReductionAtomic_fp64.csv

Output:

['ReductionAtomic_fp64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000252', '0.000043', '0.000039', '0.000039 0.000043 0.000674', '0.000366', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Bicg

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/bicg --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Bicg.csv --size=20480

Output:

['Polybench_Bicg', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '8192', '0.005129', '0.005129', '0.005129', '0.005129', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Correlation

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/correlation --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Correlation.csv --size=2048

Output:

['Polybench_Correlation', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.095115', '0.095115', '0.095115', '0.095115', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Covariance

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/covariance --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Covariance.csv --size=2048

Output:

['Polybench_Covariance', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.094518', '0.094518', '0.094518', '0.094518', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Gesummv

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/gesummv --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Gesummv.csv --size=8192

Output:

['Polybench_Gesummv', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '8192', '0.007317', '0.007317', '0.007317', '0.007317', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Gramschmidt

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/gramschmidt --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Gramschmidt.csv --size=512

Output:

['Polybench_Gramschmidt', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.285063', '0.285063', '0.285063', '0.285063', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Kmeans_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/kmeans --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Kmeans.csv --size=700000000

Output:

['Kmeans_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '67108864', '0.001805', '0.001794', '0.001785', '0.001785 0.001794 0.001837', '0.000028', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

LinearRegressionCoeff_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/lin_reg_coeff --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/LinearRegressionCoeff.csv --size=1638400000

Output:

['LinearRegressionCoeff_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.001504', '0.001362', '0.001182', '0.001182 0.001362 0.001967', '0.000411', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

LinearRegression_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/lin_reg_error --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/LinearRegression.csv --size=640000

Output:

['LinearRegression_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000388', '0.000362', '0.000344', '0.000344 0.000362 0.000459', '0.000062', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MatmulChain

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/matmulchain --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/MatmulChain.csv --size=2048

Output:

['MatmulChain', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '1024', '0.011039', '0.011039', '0.011039', '0.011039', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MolecularDynamics

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/mol_dyn --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/MolecularDynamics.csv --size=8196

Output:

['MolecularDynamics', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000081', '0.000066', '0.000056', '0.000056 0.000066 0.000121', '0.000035', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Mvt

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/mvt --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Mvt.csv --size=32767

Output:

['Polybench_Mvt', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '16384', '0.003645', '0.003645', '0.003645', '0.003645', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_sf_fp32_16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/sf --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/sf_16.csv --size=--size=100000000

Output:

['MicroBench_sf_fp32_16', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '0', '0.000043', '0.000025', '0.000021', '0.000021 0.000025 0.000082', '0.000034', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000000']

Polybench_Syr2k

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/syr2k --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Syr2k.csv --size=6144

Output:

['Polybench_Syr2k', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '1024', '0.006351', '0.006351', '0.006351', '0.006351', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Syrk

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/syrk --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Syrk.csv --size=4096

Output:

['Polybench_Syrk', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '1024', '0.003221', '0.003221', '0.003221', '0.003221', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']
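
Each `Output:` line above is a Python-style list of strings, so it can be read back with `ast.literal_eval`. The sketch below parses the `VectorAddition_fp32` row from this log. The column positions (11 = mean, 12 = median, 13 = minimum, 14 = raw samples, all in seconds) are an assumption inferred from the rows shown here, not taken from the sycl-bench documentation, and `parse_row` is a hypothetical helper, not part of the PR.

```python
import ast

# Row copied verbatim from the VectorAddition_fp32 entry in this log.
ROW = (
    "['VectorAddition_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', "
    "'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', "
    "'0.000039', '0.000033', '0.000029', '0.000029 0.000033 0.000054', "
    "'0.000013', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', "
    "'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']"
)

# Assumed column positions, inferred from the data in this log.
NAME, STATUS, MEDIAN, SAMPLES = 0, 1, 12, 14


def parse_row(text):
    """Turn one logged row into a small dict (hypothetical helper)."""
    fields = ast.literal_eval(text)  # safe: evaluates literals only
    return {
        "name": fields[NAME],
        "status": fields[STATUS],
        "median_ms": float(fields[MEDIAN]) * 1000.0,  # seconds -> ms
        "samples_s": [float(s) for s in fields[SAMPLES].split()],
    }


parsed = parse_row(ROW)
print(parsed["name"], parsed["status"], f"{parsed['median_ms']:.3f} ms")
```

With this reading, the median of 0.000033 s corresponds to the 0.033 ms reported for `VectorAddition_fp32` in the summary tables. Rows marked `FAIL` in this log carry a single sample, so their mean, median and minimum are identical.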

@mateuszpn marked this pull request as ready for review September 27, 2024 11:09
@mateuszpn requested a review from a team as a code owner September 27, 2024 11:09
Contributor

@pbalcer left a comment

lgtm, just a few nits.

.github/workflows/bench_publish.yml (outdated, resolved)
scripts/benchmarks/benches/compute.py (outdated, resolved)
scripts/benchmarks/benches/syclbench.py (outdated, resolved)
scripts/benchmarks/output.py (outdated, resolved)
scripts/benchmarks/output.py (outdated, resolved)
scripts/benchmarks/output.py (outdated, resolved)

Compute Benchmarks level_zero run (with params: --epsilon 0.001):
https://github.com/oneapi-src/unified-runtime/actions/runs/11105881038

Compute Benchmarks level_zero run (--epsilon 0.001):
https://github.com/oneapi-src/unified-runtime/actions/runs/11105881038
Job status: failure. Test status: skipped.

Compute Benchmarks level_zero run (with params: --epsilon 0.001):
https://github.com/oneapi-src/unified-runtime/actions/runs/11106081894

Compute Benchmarks level_zero run (--epsilon 0.001):
https://github.com/oneapi-src/unified-runtime/actions/runs/11106081894
Job status: failure. Test status: skipped.

Compute Benchmarks level_zero run (with params: ):
https://github.com/oneapi-src/unified-runtime/actions/runs/11106240633

Compute Benchmarks level_zero run ():
https://github.com/oneapi-src/unified-runtime/actions/runs/11106240633
Job status: failure. Test status: skipped.

Compute Benchmarks level_zero run (with params: ):
https://github.com/oneapi-src/unified-runtime/actions/runs/11106252727

Compute Benchmarks level_zero run ():
https://github.com/oneapi-src/unified-runtime/actions/runs/11106252727
Job status: failure. Test status: skipped.

Compute Benchmarks level_zero run (with params: ):
https://github.com/oneapi-src/unified-runtime/actions/runs/11106484753

Compute Benchmarks level_zero run ():
https://github.com/oneapi-src/unified-runtime/actions/runs/11106484753
Job status: failure. Test status: skipped.

Compute Benchmarks level_zero run (with params: --epsilon 0.001):
https://github.com/oneapi-src/unified-runtime/actions/runs/11106764764

Compute Benchmarks level_zero run (--epsilon 0.001):
https://github.com/oneapi-src/unified-runtime/actions/runs/11106764764
Job status: success. Test status: success.

Summary

Total 106 benchmarks in mean.
Geomean 106.793%.
Improved 26 Regressed 34 (threshold 0.10%)

(result is better)
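
The figures in this summary can be approximated from the per-benchmark rows. The sketch below assumes that relative performance is `baseline / this PR` for time-like metrics (and the inverse for throughput such as `M keys/sec`), that the group figure is the geometric mean of those ratios, and that a benchmark counts as improved or regressed when its change exceeds the stated 0.10% threshold. These rules are inferred from the numbers in this comment, not read from the PR's scripts; `relative_perf` and `API_GROUP` are hypothetical names.

```python
from statistics import geometric_mean


def relative_perf(this_pr, baseline, lower_is_better=True):
    """Ratio > 1 means this PR is better than the baseline (assumed rule)."""
    return baseline / this_pr if lower_is_better else this_pr / baseline


# (this PR, baseline) in microseconds, from the "api" group rows of this comment.
API_GROUP = [
    (13.290, 28.750),
    (2.148, 4.600),
    (1.669, 3.569),
    (22.506, 47.709),
    (23.044, 47.033),
    (17.519, 31.184),
]

ratios = [relative_perf(pr, base) for pr, base in API_GROUP]
group_geomean = geometric_mean(ratios) * 100.0  # percent

THRESHOLD = 0.10  # percent, as stated in the summary
changes = [(r - 1.0) * 100.0 for r in ratios]
improved = sum(c > THRESHOLD for c in changes)
regressed = sum(c < -THRESHOLD for c in changes)

print(f"group geomean {group_geomean:.3f}%, improved {improved}, regressed {regressed}")
```

Under these assumptions the group value comes out close to the 205.944% reported for the api group. Small differences are possible because the inputs here are the rounded values shown in the table.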

Performance change in benchmark groups

Relative perf in group api (6): 205.944%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| api_overhead_benchmark_ur SubmitKernel in order | 13.290000 μs | 28.750 μs | 216.33% | 116.33% | ++++++++++ |
| api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024 | 2.148000 μs | 4.600 μs | 214.15% | 114.15% | ++++++++++ |
| api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024 | 1.669000 μs | 3.569 μs | 213.84% | 113.84% | ++++++++++ |
| api_overhead_benchmark_sycl SubmitKernel out of order | 22.506000 μs | 47.709 μs | 211.98% | 111.98% | ++++++++++ |
| api_overhead_benchmark_sycl SubmitKernel in order | 23.044000 μs | 47.033 μs | 204.10% | 104.10% | +++++++++ |
| api_overhead_benchmark_ur SubmitKernel out of order | 17.519000 μs | 31.184 μs | 178.00% | 78.00% | +++++++ |

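
The last column is the ASCII bar chart. The rows in this comment are consistent with a bar whose length is the absolute change scaled against the largest absolute change in the whole report (116.33% here) and rounded to at most ten characters, with `.` shown when the bar rounds to zero. The sketch below is a reconstruction under that assumption; `ascii_bar` is a hypothetical function and may differ from what `scripts/benchmarks/output.py` actually does.

```python
def ascii_bar(change_pct, max_abs_change_pct, width=10):
    """Render a signed change as '+' or '-' characters (assumed scaling rule)."""
    if max_abs_change_pct <= 0:
        return "."
    # Scale against the largest change so the biggest bar fills the full width.
    length = round(abs(change_pct) / max_abs_change_pct * width)
    if length == 0:
        return "."  # change too small to draw
    return ("+" if change_pct > 0 else "-") * length


MAX_CHANGE = 116.33  # largest absolute change in this report

for change in (116.33, 104.10, 78.00, -45.64, 8.18, -5.76):
    print(f"{change:8.2f}% {ascii_bar(change, MAX_CHANGE)}")
```

For the values above this produces bars of 10, 9, 7, 4 and 1 characters and a final `.`, which matches the corresponding rows in this comment.
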
Relative perf in group memory (4): 140.377%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024 | 117.904000 μs | 249.365 μs | 211.50% | 111.50% | ++++++++++ |
| memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024 | 229.461000 μs | 468.699 μs | 204.26% | 104.26% | +++++++++ |
| memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024 | 5.917000 μs | 9.783 μs | 165.34% | 65.34% | ++++++ |
| memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240 | 3.173 μs | 1.725000 μs | 54.36% | -45.64% | ---- |

Relative perf in group miscellaneous (1): 100.575%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| miscellaneous_benchmark_sycl VectorSum | 857.826000 μs | 862.755 μs | 100.57% | 0.57% | . |

Relative perf in group Velocity-Bench (6): 134.828%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| Velocity-Bench Sobel Filter | 550.363000 ms | 997.159 ms | 181.18% | 81.18% | +++++++ |
| Velocity-Bench Hashtable | 357.770058 M keys/sec | 204.658 M keys/sec | 174.81% | 74.81% | ++++++ |
| Velocity-Bench QuickSilver | 117.060000 MMS/CTT | 89.680 MMS/CTT | 130.53% | 30.53% | +++ |
| Velocity-Bench CudaSift | 221.342000 ms | 239.448 ms | 108.18% | 8.18% | + |
| Velocity-Bench Bitcracker | 35.605 s | 35.469500 s | 99.62% | -0.38% | . |
| Velocity-Bench Easywave | - | 453.000000 ms | | | |

Relative perf in group Runtime (52): 98.935%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| Runtime_BlockedTransform_iter_128_blocksize_256 | 0.156000 ms | 0.161 ms | 103.21% | 3.21% | . |
| Runtime_BlockedTransform_iter_512_blocksize_256 | 0.079000 ms | 0.081 ms | 102.53% | 2.53% | . |
| Runtime_BlockedTransform_iter_256_blocksize_256 | 0.084000 ms | 0.085 ms | 101.19% | 1.19% | . |
| Runtime_BlockedTransform_iter_64_blocksize_256 | 0.341000 ms | 0.341 ms | 100.00% | 0.00% | . |
| Runtime_IndependentDAGTaskThroughput_NDRangeParallelFor | 5.610 ms | 5.588000 ms | 99.61% | -0.39% | . |
| Runtime_IndependentDAGTaskThroughput_HierarchicalParallelFor | 5.600 ms | 5.576000 ms | 99.57% | -0.43% | . |
| Runtime_IndependentDAGTaskThroughput_BasicParallelFor | 5.818 ms | 5.792000 ms | 99.55% | -0.45% | . |
| Runtime_DAGTaskThroughput_HierarchicalParallelFor | 5.390 ms | 5.250000 ms | 97.40% | -2.60% | . |
| Runtime_DAGTaskThroughput_NDRangeParallelFor | 4.989 ms | 4.856000 ms | 97.33% | -2.67% | . |
| Runtime_DAGTaskThroughput_BasicParallelFor | 6.160 ms | 5.963000 ms | 96.80% | -3.20% | . |
| Runtime_DAGTaskThroughput_SingleTask | 6.716 ms | 6.459000 ms | 96.17% | -3.83% | . |
| Runtime_IndependentDAGTaskThroughput_SingleTask | 6.862 ms | 6.467000 ms | 94.24% | -5.76% | . |
| Runtime_BlockedTransform_iter_256_blocksize_8192 | 0.341000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_524288 | 2.748000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_524288 | 2.573000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_1024 | 0.495000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_2048 | 0.405000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_16384 | 2.686000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_65536 | 2.591000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_32768 | 2.421000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_65536 | 2.543000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_1024 | 0.513000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_524288 | 2.492000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_1024 | 0.556000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_4096 | 0.361000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_131072 | 2.424000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_4096 | 0.273000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_65536 | 2.513000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_1024 | 0.708000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_65536 | 2.608000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_8192 | 0.329000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_262144 | 2.367000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_8192 | 0.289000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_16384 | 2.288000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_262144 | 2.519000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_2048 | 0.349000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_32768 | 2.450000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_16384 | 2.551000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_4096 | 0.362000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_262144 | 2.580000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_131072 | 2.519000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_16384 | 2.241000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_262144 | 2.469000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_524288 | 2.578000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_32768 | 2.364000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_32768 | 2.421000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_2048 | 0.400000 ms | - | | | |
| Runtime_BlockedTransform_iter_64_blocksize_4096 | 0.306000 ms | - | | | |
| Runtime_BlockedTransform_iter_256_blocksize_131072 | 2.469000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_8192 | 0.347000 ms | - | | | |
| Runtime_BlockedTransform_iter_512_blocksize_131072 | 2.559000 ms | - | | | |
| Runtime_BlockedTransform_iter_128_blocksize_2048 | 0.354000 ms | - | | | |

Relative perf in group MicroBench (15): 100.262%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| MicroBench_sf_fp32_16 | 0.025000 ms | 0.026 ms | 104.00% | 4.00% | . |
| MicroBench_LocalMem_fp32_4096 | 0.200000 ms | 0.200 ms | 100.00% | 0.00% | . |
| MicroBench_LocalMem_int32_4096 | 0.229000 ms | 0.229 ms | 100.00% | 0.00% | . |
| MicroBench_L2_int32_1 | 0.033000 ms | 0.033 ms | 100.00% | 0.00% | . |
| MicroBench_L2_int32_16 | 0.026000 ms | 0.026 ms | 100.00% | 0.00% | . |
| MicroBench_L2_fp32_2 | 0.026000 ms | 0.026 ms | 100.00% | 0.00% | . |
| MicroBench_L2_fp32_4 | 0.026000 ms | 0.026 ms | 100.00% | 0.00% | . |
| MicroBench_L2_int32_4 | 0.026000 ms | 0.026 ms | 100.00% | 0.00% | . |
| MicroBench_L2_int32_2 | 0.027000 ms | 0.027 ms | 100.00% | 0.00% | . |
| MicroBench_L2_fp32_8 | 0.025000 ms | 0.025 ms | 100.00% | 0.00% | . |
| MicroBench_L2_int32_8 | 0.026000 ms | 0.026 ms | 100.00% | 0.00% | . |
| MicroBench_L2_fp32_1 | 0.025000 ms | 0.025 ms | 100.00% | 0.00% | . |
| MicroBench_L2_fp32_16 | 0.025000 ms | 0.025 ms | 100.00% | 0.00% | . |
| MicroBench_Arith_fp32_512 | 0.032000 ms | 0.032 ms | 100.00% | 0.00% | . |
| MicroBench_Arith_int32_512 | 0.073000 ms | 0.073 ms | 100.00% | 0.00% | . |

Relative perf in group Pattern (14): 100.169%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| Pattern_Reduction_NDRange_fp32 | 0.025000 ms | 0.026 ms | 104.00% | 4.00% | . |
| Pattern_Reduction_Hierarchical_int64 | 0.050000 ms | 0.051 ms | 102.00% | 2.00% | . |
| Pattern_Reduction_Hierarchical_fp32 | 0.051000 ms | 0.052 ms | 101.96% | 1.96% | . |
| Pattern_Reduction_Hierarchical_int32 | 0.052000 ms | 0.052 ms | 100.00% | 0.00% | . |
| Pattern_SegmentedReduction_NDRange_fp32 | 0.014000 ms | 0.014 ms | 100.00% | 0.00% | . |
| Pattern_SegmentedReduction_Hierarchical_int32 | 0.028000 ms | 0.028 ms | 100.00% | 0.00% | . |
| Pattern_SegmentedReduction_Hierarchical_int16 | 0.030000 ms | 0.030 ms | 100.00% | 0.00% | . |
| Pattern_SegmentedReduction_Hierarchical_int64 | 0.029000 ms | 0.029 ms | 100.00% | 0.00% | . |
| Pattern_SegmentedReduction_Hierarchical_fp32 | 0.030000 ms | 0.030 ms | 100.00% | 0.00% | . |
| Pattern_SegmentedReduction_NDRange_int32 | 0.027000 ms | 0.027 ms | 100.00% | 0.00% | . |
| Pattern_SegmentedReduction_NDRange_int64 | 0.016000 ms | 0.016 ms | 100.00% | 0.00% | . |
| Pattern_Reduction_NDRange_int32 | 0.076 ms | 0.075000 ms | 98.68% | -1.32% | . |
| Pattern_Reduction_NDRange_int64 | 0.053 ms | 0.052000 ms | 98.11% | -1.89% | . |
| Pattern_SegmentedReduction_NDRange_int16 | 0.045 ms | 0.044000 ms | 97.78% | -2.22% | . |

Relative perf in group ScalarProduct (6): 99.774%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| ScalarProduct_NDRange_int32 | 0.150000 ms | 0.151 ms | 100.67% | 0.67% | . |
| ScalarProduct_NDRange_fp32 | 0.040000 ms | 0.040 ms | 100.00% | 0.00% | . |
| ScalarProduct_Hierarchical_int64 | 0.063000 ms | 0.063 ms | 100.00% | 0.00% | . |
| ScalarProduct_Hierarchical_fp32 | 0.059000 ms | 0.059 ms | 100.00% | 0.00% | . |
| ScalarProduct_Hierarchical_int32 | 0.062000 ms | 0.062 ms | 100.00% | 0.00% | . |
| ScalarProduct_NDRange_int64 | 0.100 ms | 0.098000 ms | 98.00% | -2.00% | . |

Relative perf in group SYCL2020 (2): 100.146%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| SYCL2020_Accessors_Latency_fp32_in_order__ | 68.426000 ms | 68.626 ms | 100.29% | 0.29% | . |
| SYCL2020_Accessors_Latency_fp32_out_of_order__ | 70.840000 ms | 70.840 ms | 100.00% | 0.00% | . |

Relative perf in group USM (17): 100.050%

| Benchmark | This PR | baseline | Relative perf | Change | - |
|---|---|---|---|---|---|
| USM_Pinned_Overhead_fp32_DeviceHost_NonPinned_Init_1 | 0.438000 ms | 0.460 ms | 105.02% | 5.02% | . |
| USM_Allocation_latency_fp32_shared | 0.118000 ms | 0.120 ms | 101.69% | 1.69% | . |
| USM_Latency_fp32_in_order__ | 33.718000 ms | 33.718 ms | 100.00% | 0.00% | . |
| USM_Allocation_latency_fp32_host | 0.002000 ms | 0.002 ms | 100.00% | 0.00% | . |
| USM_Allocation_latency_fp32_device | 0.008000 ms | 0.008 ms | 100.00% | 0.00% | . |
| USM_Pinned_Overhead_fp32_HostDevice_NonPinned_Init_1 | 0.015000 ms | 0.015 ms | 100.00% | 0.00% | . |
| USM_Pinned_Overhead_fp32_DeviceHost_Pinned_Init_1 | 0.019000 ms | 0.019 ms | 100.00% | 0.00% | . |
| USM_Pinned_Overhead_fp32_HostDevice_Pinned_Init_1 | 0.011000 ms | 0.011 ms | 100.00% | 0.00% | . |
| USM_Instr_Mix_fp32_shared_1:1mix_with_init_with_prefetch | 15.341 ms | 15.307000 ms | 99.78% | -0.22% | . |
| USM_Instr_Mix_fp32_shared_1:1mix_with_init_no_prefetch | 14.177 ms | 14.144000 ms | 99.77% | -0.23% | . |
| USM_Latency_fp32_out_of_order__ | 46.807 ms | 46.684000 ms | 99.74% | -0.26% | . |
| USM_Instr_Mix_fp32_shared_1:1mix_no_init_no_prefetch | 13.774 ms | 13.720000 ms | 99.61% | -0.39% | . |
| USM_Instr_Mix_fp32_shared_1:1mix_no_init_with_prefetch | 15.494 ms | 15.415000 ms | 99.49% | -0.51% | . |
| USM_Instr_Mix_fp32_host_1:1mix_no_init_no_prefetch | 1.882 ms | 1.868000 ms | 99.26% | -0.74% | . |
| USM_Instr_Mix_fp32_host_1:1mix_with_init_no_prefetch | 1.793 ms | 1.778000 ms | 99.16% | -0.84% | . |
| USM_Instr_Mix_fp32_device_1:1mix_no_init_no_prefetch | 3.117 ms | 3.090000 ms | 99.13% | -0.87% | . |
| USM_Instr_Mix_fp32_device_1:1mix_with_init_no_prefetch | 3.279 ms | 3.225000 ms | 98.35% | -1.65% | . |

Relative perf in group VectorAddition (3): 100.000%
Benchmark This PR baseline Relative perf Change -
VectorAddition_fp32 0.033000 ms 0.033 ms 100.00% 0.00% .
VectorAddition_int64 0.043000 ms 0.043 ms 100.00% 0.00% .
VectorAddition_int32 0.037000 ms 0.037 ms 100.00% 0.00% .
Relative perf in group Polybench (13): 99.700%
Benchmark This PR baseline Relative perf Change -
Polybench_Gramschmidt 285.060000 ms 285.080 ms 100.01% 0.01% .
Polybench_2DConvolution 0.229000 ms 0.229 ms 100.00% 0.00% .
Polybench_2mm 1.239000 ms 1.239 ms 100.00% 0.00% .
Polybench_Atax 6.903000 ms 6.903 ms 100.00% 0.00% .
Polybench_3mm 1.746 ms 1.745000 ms 99.94% -0.06% .
Polybench_Bicg 5.132 ms 5.126000 ms 99.88% -0.12% .
Polybench_Gesummv 7.312 ms 7.303000 ms 99.88% -0.12% .
Polybench_Covariance 94.557 ms 94.253000 ms 99.68% -0.32% .
Polybench_Mvt 3.648 ms 3.633000 ms 99.59% -0.41% .
Polybench_Syrk 3.222 ms 3.206000 ms 99.50% -0.50% .
Polybench_Syr2k 6.356 ms 6.302000 ms 99.15% -0.85% .
Polybench_Correlation 95.493 ms 94.324000 ms 98.78% -1.22% .
Polybench_Gemm - 3.965000 ms
Relative perf in group ReductionAtomic (4): 100.619%
Benchmark This PR baseline Relative perf Change -
ReductionAtomic_fp32 0.040000 ms 0.041 ms 102.50% 2.50% .
ReductionAtomic_int32 0.042000 ms 0.042 ms 100.00% 0.00% .
ReductionAtomic_int64 0.041000 ms 0.041 ms 100.00% 0.00% .
ReductionAtomic_fp64 0.043000 ms 0.043 ms 100.00% 0.00% .
Relative perf in group Kmeans (1): 99.833%
Benchmark This PR baseline Relative perf Change -
Kmeans_fp32 1.795 ms 1.792000 ms 99.83% -0.17% .
Relative perf in group LinearRegressionCoeff (1): 88.142%
Benchmark This PR baseline Relative perf Change -
LinearRegressionCoeff_fp32 1.383 ms 1.219000 ms 88.14% -11.86% -
Relative perf in group LinearRegression (1): 98.347%
Benchmark This PR baseline Relative perf Change -
LinearRegression_fp32 0.363 ms 0.357000 ms 98.35% -1.65% .
Relative perf in group MatmulChain (1): 99.873%
Benchmark This PR baseline Relative perf Change -
MatmulChain 11.043 ms 11.029000 ms 99.87% -0.13% .
Relative perf in group MolecularDynamics (1): 100.000%
Benchmark This PR baseline Relative perf Change -
MolecularDynamics 0.066000 ms 0.066 ms 100.00% 0.00% .
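
A note on reading the tables above: the numbers are consistent with "Relative perf" being baseline time divided by this PR's time (for lower-is-better metrics such as ms), and with the per-group figure being the geometric mean of those ratios. This is inferred from the reported values, not taken from the benchmark scripts, so treat it as a sketch; the function names `relative_perf` and `group_relative_perf` are hypothetical.

```python
import math


def relative_perf(this_pr: float, baseline: float) -> float:
    """Relative performance in percent, assuming a lower-is-better metric."""
    return baseline / this_pr * 100.0


def group_relative_perf(rows) -> float:
    """Geometric mean, in percent, of baseline/this_pr over (this_pr, baseline) pairs."""
    ratios = [baseline / this_pr for this_pr, baseline in rows]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios)) * 100.0


# (this PR, baseline) pairs in ms, copied from the ScalarProduct and SYCL2020 groups.
scalar_product = [
    (0.150, 0.151), (0.040, 0.040), (0.063, 0.063),
    (0.059, 0.059), (0.062, 0.062), (0.100, 0.098),
]
sycl2020 = [(68.426, 68.626), (70.840, 70.840)]

print(f"{relative_perf(0.076, 0.075):.2f}%")         # Pattern_Reduction_NDRange_int32 row
print(f"{group_relative_perf(scalar_product):.3f}%")  # compare with the reported 99.774%
print(f"{group_relative_perf(sycl2020):.3f}%")        # compare with the reported 100.146%
```

A geometric mean keeps one very slow or very fast benchmark from dominating the group figure, which may be why it fits the reported numbers better than an arithmetic mean.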

Details

Benchmark details - environment, command, output...
api_overhead_benchmark_sycl SubmitKernel out of order

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),22.506,22.476,6.63%,21.823,487.914,[CPU],[us]
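
For anyone post-processing these logs, a minimal sketch of reading one such `--csv` result row follows. It assumes the layout seen in this log: the data row carries one more field than the header (the unit, here `[us]`), and `StdDev` is a percentage. The helper name `parse_result` and the `Unit` key are hypothetical, not part of compute-benchmarks.

```python
import csv
import io

HEADER = "TestCase,Mean,Median,StdDev,Min,Max,Type"
ROW = (
    "SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 "
    "KernelExecTime=1 MeasureCompletion=0),22.506,22.476,6.63%,21.823,487.914,[CPU],[us]"
)


def parse_result(header: str, row: str) -> dict:
    """Turn one header line and one data line into a dict with numeric fields."""
    names = next(csv.reader(io.StringIO(header)))
    values = next(csv.reader(io.StringIO(row)))
    record = dict(zip(names, values))
    # Assumption: any field beyond the header columns is the unit, kept verbatim.
    record["Unit"] = values[len(names)] if len(values) > len(names) else ""
    for key in ("Mean", "Median", "Min", "Max"):
        record[key] = float(record[key])
    # StdDev is reported as a percentage string such as "6.63%".
    record["StdDev"] = float(record["StdDev"].rstrip("%"))
    return record


result = parse_result(HEADER, ROW)
print(result["Median"], result["Unit"], result["Type"])
```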

api_overhead_benchmark_sycl SubmitKernel in order

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),23.044,23.025,3.44%,22.008,252.456,[CPU],[us]

api_overhead_benchmark_ur SubmitKernel out of order

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=ur Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),17.519,17.638,7.39%,13.715,252.070,[CPU],[us]

api_overhead_benchmark_ur SubmitKernel in order

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=ur Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),13.290,13.279,1.83%,12.651,55.781,[CPU],[us]

memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Device --destinationPlacement=Device --size=1024 --count=100

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Device destinationPlacement=Device size=1KB count=100),229.461,229.352,1.54%,225.584,464.503,[CPU],[us]

memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Host --destinationPlacement=Device --size=1024 --count=100

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Host destinationPlacement=Device size=1KB count=100),117.904,117.561,1.58%,113.487,177.407,[CPU],[us]

memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueMemcpy --csv --noHeaders --iterations=10000 --sourcePlacement=Device --destinationPlacement=Device --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueMemcpy(api=sycl sourcePlacement=Device destinationPlacement=Device size=1KB),5.917,5.713,18.19%,5.187,100.320,[CPU],[us]

memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=StreamMemory --csv --noHeaders --iterations=10000 --type=Triad --size=10240 --memoryPlacement=Device --useEvents=0 --contents=Zeros

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
StreamMemory(api=sycl type=Triad size=10KB useEvents=0 contents=Zeros memoryPlacement=Device),3.173,3.182,3.01%,0.492,3.405,[CPU],[GB/s]

api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=0 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Device --dst=Device --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Device dst=Device size=1KB ioq=0),2.148,2.142,4.56%,1.954,9.317,[CPU],[us]

api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=1 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Host --dst=Host --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Host dst=Host size=1KB ioq=1),1.669,1.663,4.31%,1.567,10.930,[CPU],[us]

miscellaneous_benchmark_sycl VectorSum

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/miscellaneous_benchmark_sycl --test=VectorSum --csv --noHeaders --iterations=1000 --numberOfElementsX=512 --numberOfElementsY=256 --numberOfElementsZ=256

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
VectorSum(api=sycl numberOfElementsX=512 numberOfElementsY=256 numberOfElementsZ=256),857.826,858.316,0.45%,813.112,871.393,[GPU],bw [GB/s]

Velocity-Bench Hashtable

Environment Variables:

Command:

/home/test-user/bench_workdir/hashtable/hashtable_sycl --no-verify

Output:

hashtable - total time for whole calculation: 0.375151 s
357.770058 million keys/second

Velocity-Bench Bitcracker

Environment Variables:

Command:

/home/test-user/bench_workdir/bitcracker/bitcracker -f /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt -d /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt -b 60000

Output:

---------> BitCracker: BitLocker password cracking tool <---------

==================================
Retrieving Info

Reading hash file "/home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt"

              Attack

================================================
Type of attack: User Password
Psw per thread: 1
max_num_pswd_per_read: 60000
Dictionary: /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt
MAC Comparison (-m): Yes

Iter: 1, num passwords read: 60000
Kernel execution:
Effective passwords: 60000
Passwords Range:
npknpByH7N2m3OnLNH1X9DJxLrzIFWk
.....
dL_7uuf3QCz-c6K3xDu0

================================================
Bitcracker attack completed
Total passwords evaluated: 60000
Password not found!

time to subtract from total: 0.00417169 s
bitcracker - total time for whole calculation: 35.605 s

Velocity-Bench CudaSift

Environment Variables:

Command:

/home/test-user/bench_workdir/cudaSift/cudaSift

Output:

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1108 1273 30.0842% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1184 1260 32.1477% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1230 1264 33.3967% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1223 1258 33.2066% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1213 1249 32.9351% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1234 1272 33.5053% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1233 1265 33.4781% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1228 1265 33.3424% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1222 1257 33.1795% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1215 1249 32.9894% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1198 1257 32.5278% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1231 1266 33.4238% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1233 1268 33.4781% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1060 1262 28.7809% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1236 1269 33.5596% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1112 1252 30.1928% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1236 1269 33.5596% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1231 1264 33.4238% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1216 1251 33.0166% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1213 1258 32.9351% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1145 1255 31.0888% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1235 1267 33.5324% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1217 1262 33.0437% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1228 1272 33.3424% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1236 1272 33.5596% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1224 1262 33.2338% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1227 1261 33.3152% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1139 1275 30.9259% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1229 1261 33.3695% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1225 1260 33.2609% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1035 1262 28.1021% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1070 1256 29.0524% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1223 1260 33.2066% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1239 1274 33.6411% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1225 1259 33.2609% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1045 1256 28.3736% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1234 1267 33.5053% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1102 1269 29.9213% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1221 1254 33.1523% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1236 1274 33.5596% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1222 1255 33.1795% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1210 1245 32.8537% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1202 1280 32.6364% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1132 1273 30.7358% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1061 1267 28.808% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1220 1258 33.1252% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1238 1273 33.6139% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1230 1266 33.3967% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1223 1256 33.2066% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1077 1258 29.2425% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Avg workload time = 221.342 ms

Velocity-Bench QuickSilver

Environment Variables:

QS_DEVICE=GPU

Command:

/home/test-user/bench_workdir/QuickSilver/qs -i /home/test-user/bench_workdir/velocity-bench-repo/QuickSilver/Examples/AllScattering/scatteringOnly.inp

Output:

Copyright (c) 2016
Lawrence Livermore National Security, LLC
All Rights Reserved
Quicksilver Version :
Quicksilver Git Hash :
MPI Version : 3.0
Number of MPI ranks : 1
Number of OpenMP Threads: 1
Number of OpenMP CPUs : 1

Loading params
Finished loading params
Simulation:
dt: 1e-08
fMax: 0.1
inputFile: /home/test-user/bench_workdir/velocity-bench-repo/QuickSilver/Examples/AllScattering/scatteringOnly.inp
energySpectrum:
boundaryCondition: octant
loadBalance: 1
cycleTimers: 0
debugThreads: 0
lx: 100
ly: 100
lz: 100
nParticles: 10000000
batchSize: 0
nBatches: 10
nSteps: 10
nx: 10
ny: 10
nz: 10
seed: 1029384756
xDom: 0
yDom: 0
zDom: 0
eMax: 20
eMin: 1e-09
nGroups: 230
lowWeightCutoff: 0.001
bTally: 1
fTally: 1
cTally: 1
coralBenchmark: 0
crossSectionsOut:

Geometry:
material: sourceMaterial
shape: brick
xMax: 100
xMin: 0
yMax: 100
yMin: 0
zMax: 100
zMin: 0

Material:
name: sourceMaterial
mass: 1000
nIsotopes: 10
nReactions: 9
sourceRate: 1e+10
totalCrossSection: 0.1
absorptionCrossSection: flat
fissionCrossSection: flat
scatteringCrossSection: flat
absorptionCrossSectionRatio: 0
fissionCrossSectionRatio: 0
scatteringCrossSectionRatio: 1

CrossSection:
name: flat
A: 0
B: 0
C: 0
D: 0
E: 1
nuBar: 2.4
setting GPU
setting parameters
Building partition 0
Building partition 1
Building partition 2
Building partition 3
Building MC_Domain 0
Building MC_Domain 1
Building MC_Domain 2
Building MC_Domain 3
Starting Consistency Check
Finished Consistency Check
Finished initMesh
Started copyMaterialDatabase_device
Finished copyMaterialDatabase_device
Finished copyNuclearData_device
Finished copyDomainDevice
cycle start source rr split absorb scatter fission produce collisn escape census num_seg scalar_flux cycleInit cycleTracking cycleFinalize
0 0 1000000 0 9000000 0 18533189 0 0 18533189 1151780 8848220 55527935 1.854923e+09 4.471940e-01 6.166440e-01 0.000000e+00
1 8848220 1000000 0 151478 0 34281997 0 0 34281997 1664159 8335539 94633679 5.047651e+09 3.834540e-01 7.480910e-01 0.000000e+00
2 8335539 1000000 0 663717 0 34354432 0 0 34354432 1366771 8632485 95010375 7.705930e+09 3.536240e-01 7.683790e-01 0.000000e+00
3 8632485 1000000 0 367978 0 34302727 0 0 34302727 1242216 8758247 94953591 9.992076e+09 3.851130e-01 8.321520e-01 0.000000e+00
4 8758247 1000000 0 242076 0 34141236 0 0 34141236 1168452 8831871 94599337 1.199834e+10 3.786370e-01 7.989550e-01 0.000000e+00
5 8831871 1000000 0 168070 0 33948724 0 0 33948724 1121156 8878785 94148236 1.377636e+10 3.527660e-01 7.666940e-01 0.000000e+00
6 8878785 1000000 0 120572 0 33760567 0 0 33760567 1089103 8910254 93689264 1.535668e+10 3.480930e-01 7.650850e-01 0.000000e+00
7 8910254 1000000 0 89810 0 33552179 0 0 33552179 1065203 8934861 93216931 1.676993e+10 3.488780e-01 8.544240e-01 0.000000e+00
8 8934861 1000000 0 65491 0 33384605 0 0 33384605 1047720 8952632 92768273 1.804559e+10 3.484750e-01 7.852570e-01 0.000000e+00
9 8952632 1000000 0 47165 0 33198494 0 0 33198494 1033968 8965829 92324678 1.920208e+10 3.481720e-01 7.604510e-01 0.000000e+00

Timer                     Cumulative  Cumulative  Cumulative  Cumulative  Cumulative  Cumulative
Name                          number   microSecs   microSecs   microSecs   microSecs  Efficiency
                            of calls         min         avg         max      stddev      Rating
main                               1   1.139e+07   1.139e+07   1.139e+07   0.000e+00      100.00
cycleInit                         10   3.694e+06   3.694e+06   3.694e+06   0.000e+00      100.00
cycleTracking                     10   7.696e+06   7.696e+06   7.696e+06   0.000e+00      100.00
cycleTracking_Kernel             104   4.941e+06   4.941e+06   4.941e+06   0.000e+00      100.00
cycleTracking_MPI                117   2.160e+05   2.160e+05   2.160e+05   0.000e+00      100.00
cycleTracking_Test_Done            0   0.000e+00   0.000e+00   0.000e+00   0.000e+00        0.00
cycleFinalize                     20   4.200e+02   4.200e+02   4.200e+02   0.000e+00      100.00
Figure Of Merit 117.06 [Num Mega Segments / Cycle Tracking Time]

Velocity-Bench Sobel Filter

Environment Variables:

OPENCV_IO_MAX_IMAGE_PIXELS=1677721600

Command:

/home/test-user/bench_workdir/sobel_filter/sobel_filter -i /home/test-user/bench_workdir/data/sobel_filter/sobel_filter_data/silverfalls_32Kx32K.png -n 5

Output:

SYMN: Welcome to the SYCL version of Sobel filter workload.
SYMN: Input image file: /home/test-user/bench_workdir/data/sobel_filter/sobel_filter_data/silverfalls_32Kx32K.png
SYMN: Launching SYCL kernel with # of iterations: 5
time to subtract from total: 7.44486 s
sobelfilter - total time for whole calculation: 0.550363 s

Runtime_BlockedTransform_iter_256_blocksize_8192

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_8192', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000339', '0.000341', '0.000329', '0.000329 0.000341 0.000349', '0.000010', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']
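
The sycl-bench rows are printed as Python list literals, so they can be read back with `ast.literal_eval`. The sketch below assumes, based only on the values in this log, that field 14 holds the space-separated per-run times and that the neighbouring fields are derived statistics (median, minimum, standard deviation). The column meanings are an inference, not something confirmed by sycl-bench documentation.

```python
import ast
import statistics

# Row copied verbatim from the output above.
RAW = (
    "['Runtime_BlockedTransform_iter_256_blocksize_8192', 'PASS', "
    "'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', "
    "'1024', '16384', '0.000339', '0.000341', '0.000329', "
    "'0.000329 0.000341 0.000349', '0.000010', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', "
    "'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']"
)

fields = ast.literal_eval(RAW)

# Assumption: index 14 is the list of per-run times (three runs, per --num-runs=3).
samples = [float(s) for s in fields[14].split()]

summary = {
    "name": fields[0],
    "status": fields[1],
    "device": fields[2],
    "median": statistics.median(samples),
    "min": min(samples),
    "stdev": statistics.stdev(samples),  # sample standard deviation
}
print(summary)
```

Recomputing the median, minimum and sample standard deviation from field 14 reproduces fields 12, 13 and 15 of this row to the printed precision, which is what motivates the column guess.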

Runtime_BlockedTransform_iter_256_blocksize_256

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_256', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.000095', '0.000084', '0.000077', '0.000077 0.000084 0.000123', '0.000025', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_524288

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_524288', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002712', '0.002748', '0.002529', '0.002529 0.002748 0.002858', '0.000167', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_524288

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_524288', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002655', '0.002573', '0.002562', '0.002562 0.002573 0.002831', '0.000152', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_1024

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_1024', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000497', '0.000495', '0.000484', '0.000484 0.000495 0.000512', '0.000014', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_2048

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_2048', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000409', '0.000405', '0.000405', '0.000405 0.000405 0.000416', '0.000007', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_16384

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_16384', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002771', '0.002686', '0.002300', '0.002300 0.002686 0.003328', '0.000519', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_65536

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_65536', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002824', '0.002591', '0.002512', '0.002512 0.002591 0.003370', '0.000474', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_32768

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_32768', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002445', '0.002421', '0.002410', '0.002410 0.002421 0.002506', '0.000052', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_65536

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_65536', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002574', '0.002543', '0.002440', '0.002440 0.002543 0.002739', '0.000152', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_1024

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_1024', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000521', '0.000513', '0.000495', '0.000495 0.000513 0.000554', '0.000031', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_524288

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_524288', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002515', '0.002492', '0.002465', '0.002465 0.002492 0.002589', '0.000065', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_256

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_256', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.000207', '0.000156', '0.000122', '0.000122 0.000156 0.000345', '0.000120', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_1024

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_1024', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000565', '0.000556', '0.000518', '0.000518 0.000556 0.000621', '0.000052', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000329', '0.000361', '0.000264', '0.000264 0.000361 0.000362', '0.000056', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_131072

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_131072', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002449', '0.002424', '0.002306', '0.002306 0.002424 0.002617', '0.000157', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000288', '0.000273', '0.000261', '0.000261 0.000273 0.000330', '0.000037', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_65536

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_65536', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002559', '0.002513', '0.002472', '0.002472 0.002513 0.002691', '0.000116', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_1024

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_1024', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.002648', '0.000708', '0.000555', '0.000555 0.000708 0.006681', '0.003493', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_65536

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_65536', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002577', '0.002608', '0.002157', '0.002157 0.002608 0.002967', '0.000406', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_8192

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_8192', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000321', '0.000329', '0.000284', '0.000284 0.000329 0.000349', '0.000033', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_256

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_256', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.000116', '0.000079', '0.000076', '0.000076 0.000079 0.000192', '0.000066', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_262144

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_262144', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002654', '0.002367', '0.002305', '0.002305 0.002367 0.003289', '0.000551', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_8192

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_8192', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000301', '0.000289', '0.000280', '0.000280 0.000289 0.000334', '0.000029', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_16384

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_16384', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002376', '0.002288', '0.002252', '0.002252 0.002288 0.002587', '0.000184', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_262144

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_262144', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002657', '0.002519', '0.002191', '0.002191 0.002519 0.003261', '0.000548', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_2048

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_2048', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000371', '0.000349', '0.000349', '0.000349 0.000349 0.000416', '0.000039', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_32768

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_32768', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002386', '0.002450', '0.002212', '0.002212 0.002450 0.002496', '0.000153', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_16384

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_16384', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002923', '0.002551', '0.002523', '0.002523 0.002551 0.003697', '0.000670', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000375', '0.000362', '0.000360', '0.000360 0.000362 0.000403', '0.000024', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_262144

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_262144', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002683', '0.002580', '0.002427', '0.002427 0.002580 0.003041', '0.000320', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_131072

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_131072', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002705', '0.002519', '0.002329', '0.002329 0.002519 0.003268', '0.000496', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_16384

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_16384', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002506', '0.002241', '0.002218', '0.002218 0.002241 0.003058', '0.000479', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_262144

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_262144', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002465', '0.002469', '0.002358', '0.002358 0.002469 0.002569', '0.000105', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_524288

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_524288', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002673', '0.002578', '0.002186', '0.002186 0.002578 0.003254', '0.000540', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_32768

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_32768', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002597', '0.002364', '0.002245', '0.002245 0.002364 0.003182', '0.000510', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_32768

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_32768', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002499', '0.002421', '0.002141', '0.002141 0.002421 0.002934', '0.000402', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_256

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_256', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.002352', '0.000341', '0.000186', '0.000186 0.000341 0.006528', '0.003618', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_2048

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_2048', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000397', '0.000400', '0.000352', '0.000352 0.000400 0.000438', '0.000043', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_64_blocksize_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_64_blocksize_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000327', '0.000306', '0.000303', '0.000303 0.000306 0.000372', '0.000039', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_256_blocksize_131072

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_256_blocksize_131072', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002720', '0.002469', '0.002258', '0.002258 0.002469 0.003434', '0.000627', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_8192

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_8192', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000356', '0.000347', '0.000334', '0.000334 0.000347 0.000386', '0.000027', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_512_blocksize_131072

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_512_blocksize_131072', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '8192', '819200', '0.002730', '0.002559', '0.002296', '0.002296 0.002559 0.003335', '0.000540', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_BlockedTransform_iter_128_blocksize_2048

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/blocked_transform --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/BlockedTransform_multi.csv --size=16384 --local=1024

Output:

['Runtime_BlockedTransform_iter_128_blocksize_2048', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '1024', '16384', '0.000366', '0.000354', '0.000349', '0.000349 0.000354 0.000396', '0.000026', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_IndependentDAGTaskThroughput_HierarchicalParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=32768

Output:

['Runtime_IndependentDAGTaskThroughput_HierarchicalParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.005600', '0.005600', '0.005592', '0.005592 0.005600 0.005607', '0.000008', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_IndependentDAGTaskThroughput_BasicParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=32768

Output:

['Runtime_IndependentDAGTaskThroughput_BasicParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.005819', '0.005818', '0.005794', '0.005794 0.005818 0.005844', '0.000025', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_IndependentDAGTaskThroughput_SingleTask

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=32768

Output:

['Runtime_IndependentDAGTaskThroughput_SingleTask', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.007725', '0.006862', '0.006202', '0.006202 0.006862 0.010112', '0.002093', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_IndependentDAGTaskThroughput_NDRangeParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=32768

Output:

['Runtime_IndependentDAGTaskThroughput_NDRangeParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.005616', '0.005610', '0.005598', '0.005598 0.005610 0.005638', '0.000021', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_DAGTaskThroughput_NDRangeParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/DAGTaskThroughput_multi.csv --size=327680

Output:

['Runtime_DAGTaskThroughput_NDRangeParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.004961', '0.004989', '0.004856', '0.004856 0.004989 0.005038', '0.000094', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_DAGTaskThroughput_SingleTask

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/DAGTaskThroughput_multi.csv --size=327680

Output:

['Runtime_DAGTaskThroughput_SingleTask', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.006956', '0.006716', '0.006420', '0.006420 0.006716 0.007731', '0.000688', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_DAGTaskThroughput_HierarchicalParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/DAGTaskThroughput_multi.csv --size=327680

Output:

['Runtime_DAGTaskThroughput_HierarchicalParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.005425', '0.005390', '0.005282', '0.005282 0.005390 0.005604', '0.000164', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_DAGTaskThroughput_BasicParallelFor

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/DAGTaskThroughput_multi.csv --size=327680

Output:

['Runtime_DAGTaskThroughput_BasicParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.006102', '0.006160', '0.005834', '0.005834 0.006160 0.006313', '0.000245', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_LocalMem_fp32_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/local_mem --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/LocalMem_multi.csv --size=512

Output:

['MicroBench_LocalMem_fp32_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.000206', '0.000200', '0.000195', '0.000195 0.000200 0.000222', '0.000014', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000000']

MicroBench_LocalMem_int32_4096

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/local_mem --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/LocalMem_multi.csv --size=512

Output:

['MicroBench_LocalMem_int32_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.000244', '0.000229', '0.000210', '0.000210 0.000229 0.000294', '0.000044', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000000']

MicroBench_L2_int32_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_int32_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000051', '0.000033', '0.000027', '0.000027 0.000033 0.000093', '0.000037', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_int32_16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_int32_16', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000034', '0.000026', '0.000025', '0.000025 0.000026 0.000050', '0.000014', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_fp32_2

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_fp32_2', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000033', '0.000026', '0.000024', '0.000024 0.000026 0.000048', '0.000013', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_fp32_4

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_fp32_4', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000030', '0.000026', '0.000025', '0.000025 0.000026 0.000039', '0.000008', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_int32_4

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_int32_4', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000031', '0.000026', '0.000025', '0.000025 0.000026 0.000043', '0.000010', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_int32_2

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_int32_2', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000037', '0.000027', '0.000026', '0.000026 0.000027 0.000059', '0.000019', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_fp32_8

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_fp32_8', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000030', '0.000025', '0.000024', '0.000024 0.000025 0.000039', '0.000008', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_int32_8

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_int32_8', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000030', '0.000026', '0.000024', '0.000024 0.000026 0.000039', '0.000008', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_fp32_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_fp32_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000030', '0.000025', '0.000025', '0.000025 0.000025 0.000040', '0.000009', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_L2_fp32_16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/pattern_L2 --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/L2_multi.csv

Output:

['MicroBench_L2_fp32_16', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000032', '0.000025', '0.000025', '0.000025 0.000025 0.000047', '0.000013', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_NDRange_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_NDRange_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000094', '0.000076', '0.000059', '0.000059 0.000076 0.000145', '0.000046', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_NDRange_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_NDRange_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000055', '0.000053', '0.000036', '0.000036 0.000053 0.000077', '0.000020', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_NDRange_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_NDRange_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000038', '0.000025', '0.000022', '0.000022 0.000025 0.000068', '0.000026', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_Hierarchical_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_Hierarchical_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000058', '0.000051', '0.000050', '0.000050 0.000051 0.000074', '0.000013', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_Hierarchical_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_Hierarchical_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000184', '0.000050', '0.000045', '0.000045 0.000050 0.000456', '0.000236', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_Hierarchical_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_Reduction_multi.csv

Output:

['Pattern_Reduction_Hierarchical_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000063', '0.000052', '0.000050', '0.000050 0.000052 0.000088', '0.000021', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_NDRange_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_NDRange_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000211', '0.000150', '0.000125', '0.000125 0.000150 0.000357', '0.000128', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_NDRange_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_NDRange_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000057', '0.000040', '0.000038', '0.000038 0.000040 0.000094', '0.000032', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_Hierarchical_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_Hierarchical_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000082', '0.000063', '0.000061', '0.000061 0.000063 0.000123', '0.000035', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_NDRange_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_NDRange_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000122', '0.000100', '0.000080', '0.000080 0.000100 0.000188', '0.000058', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_Hierarchical_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_Hierarchical_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000084', '0.000059', '0.000058', '0.000058 0.000059 0.000133', '0.000043', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_Hierarchical_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ScalarProduct_multi.csv

Output:

['ScalarProduct_Hierarchical_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000121', '0.000062', '0.000060', '0.000060 0.000062 0.000241', '0.000104', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_NDRange_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_NDRange_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000016', '0.000014', '0.000013', '0.000013 0.000014 0.000021', '0.000005', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_Hierarchical_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_Hierarchical_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000030', '0.000028', '0.000028', '0.000028 0.000028 0.000035', '0.000004', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_Hierarchical_int16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_Hierarchical_int16', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000032', '0.000030', '0.000029', '0.000029 0.000030 0.000037', '0.000005', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_Hierarchical_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_Hierarchical_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000031', '0.000029', '0.000029', '0.000029 0.000029 0.000036', '0.000004', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_Hierarchical_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_Hierarchical_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000032', '0.000030', '0.000027', '0.000027 0.000030 0.000038', '0.000006', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_NDRange_int16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_NDRange_int16', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000074', '0.000045', '0.000032', '0.000032 0.000045 0.000146', '0.000063', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_NDRange_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_NDRange_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000036', '0.000027', '0.000025', '0.000025 0.000027 0.000056', '0.000017', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_NDRange_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Pattern_SegmentedReduction_multi.csv

Output:

['Pattern_SegmentedReduction_NDRange_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000023', '0.000016', '0.000015', '0.000015 0.000016 0.000038', '0.000013', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

SYCL2020_Accessors_Latency_fp32_out_of_order__

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_accessors_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Latency_multi.csv

Output:

['SYCL2020_Accessors_Latency_fp32_out_of_order__', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.080975', '0.070840', '0.070463', '0.070463 0.070840 0.101623', '0.017882', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Latency_fp32_in_order__

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_accessors_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Latency_multi.csv

Output:

['USM_Latency_fp32_in_order__', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.032507', '0.033718', '0.029613', '0.029613 0.033718 0.034190', '0.002517', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']
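
Each output row in this log is printed as a Python-style list of strings, so it can be read back for a quick sanity check. The sketch below parses the row directly above and recomputes its timing statistics from the listed samples. The column meanings (index 1 = verification status, 11 = mean, 12 = median, 13 = min, 14 = space-separated samples, 15 = sample standard deviation) are assumptions inferred from the values in this log, not taken from the sycl-bench documentation. The helper names `parse_row` and `recompute` are hypothetical.

```python
import ast
import statistics

# Row copied verbatim from the log entry above.
ROW = (
    "['USM_Latency_fp32_in_order__', 'N/A', 'Intel(R) Data Center GPU Max 1100', "
    "'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', "
    "'0.032507', '0.033718', '0.029613', '0.029613 0.033718 0.034190', '0.002517', "
    "'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', "
    "'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']"
)


def parse_row(line):
    """Turn one printed result row into a small dict.

    Column positions are an assumption inferred from this log.
    """
    fields = ast.literal_eval(line)  # safe parse of the list-of-strings literal
    return {
        "name": fields[0],
        "status": fields[1],  # 'PASS', 'FAIL' or 'N/A' in this log
        "device": fields[2],
        "mean": float(fields[11]),
        "median": float(fields[12]),
        "min": float(fields[13]),
        "samples": [float(s) for s in fields[14].split()],
        "stddev": float(fields[15]),
    }


def recompute(samples):
    """Recompute (mean, median, min, sample stddev) from the raw samples."""
    # A single-sample row has no spread; report 0.0 as the log does.
    stddev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return (
        statistics.mean(samples),
        statistics.median(samples),
        min(samples),
        stddev,
    )


row = parse_row(ROW)
mean, median, minimum, stddev = recompute(row["samples"])

# The log prints six decimals, so compare with a matching tolerance.
TOL = 1e-6
consistent = all(
    abs(a - b) <= TOL
    for a, b in [
        (mean, row["mean"]),
        (median, row["median"]),
        (minimum, row["min"]),
        (stddev, row["stddev"]),
    ]
)
print(row["name"], "consistent with samples:", consistent)
```

Rows with very small timings print samples rounded to six decimals, so a recomputed mean can differ from the reported one by about one unit in the last digit. A slightly looser tolerance may be needed for those rows.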

USM_Latency_fp32_out_of_order__

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_accessors_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Latency_multi.csv

Output:

['USM_Latency_fp32_out_of_order__', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.045884', '0.046807', '0.043570', '0.043570 0.046807 0.047275', '0.002017', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

SYCL2020_Accessors_Latency_fp32_in_order__

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_accessors_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Latency_multi.csv

Output:

['SYCL2020_Accessors_Latency_fp32_in_order__', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.068566', '0.068426', '0.068162', '0.068162 0.068426 0.069112', '0.000491', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Allocation_latency_fp32_host

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_allocation_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Allocation_latency_multi.csv

Output:

['USM_Allocation_latency_fp32_host', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000003', '0.000002', '0.000002', '0.000002 0.000002 0.000004', '0.000001', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Allocation_latency_fp32_device

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_allocation_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Allocation_latency_multi.csv

Output:

['USM_Allocation_latency_fp32_device', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000043', '0.000008', '0.000002', '0.000002 0.000008 0.000118', '0.000065', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Allocation_latency_fp32_shared

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_allocation_latency --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Allocation_latency_multi.csv

Output:

['USM_Allocation_latency_fp32_shared', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000153', '0.000118', '0.000109', '0.000109 0.000118 0.000234', '0.000070', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_host_1:1mix_with_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_host_1:1mix_with_init_no_prefetch', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.001795', '0.001793', '0.001766', '0.001766 0.001793 0.001826', '0.000030', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_device_1:1mix_no_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_device_1:1mix_no_init_no_prefetch', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.003148', '0.003117', '0.003050', '0.003050 0.003117 0.003277', '0.000117', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_shared_1:1mix_with_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_shared_1:1mix_with_init_no_prefetch', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.014177', '0.014177', '0.014177', '0.014177', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_device_1:1mix_with_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_device_1:1mix_with_init_no_prefetch', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.004681', '0.003279', '0.003090', '0.003090 0.003279 0.007673', '0.002593', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_shared_1:1mix_with_init_with_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_shared_1:1mix_with_init_with_prefetch', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.015341', '0.015341', '0.015341', '0.015341', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_host_1:1mix_no_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_host_1:1mix_no_init_no_prefetch', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.001881', '0.001882', '0.001871', '0.001871 0.001882 0.001889', '0.000009', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_shared_1:1mix_no_init_no_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_shared_1:1mix_no_init_no_prefetch', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.013774', '0.013774', '0.013774', '0.013774', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_shared_1:1mix_no_init_with_prefetch

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Instr_Mix_multi.csv

Output:

['USM_Instr_Mix_fp32_shared_1:1mix_no_init_with_prefetch', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.015494', '0.015494', '0.015494', '0.015494', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Pinned_Overhead_fp32_HostDevice_NonPinned_Init_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_pinned_overhead --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Pinned_Overhead_multi.csv

Output:

['USM_Pinned_Overhead_fp32_HostDevice_NonPinned_Init_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000124', '0.000015', '0.000009', '0.000009 0.000015 0.000346', '0.000193', '1.217586', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000011']

USM_Pinned_Overhead_fp32_DeviceHost_NonPinned_Init_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_pinned_overhead --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Pinned_Overhead_multi.csv

Output:

['USM_Pinned_Overhead_fp32_DeviceHost_NonPinned_Init_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000486', '0.000438', '0.000162', '0.000162 0.000438 0.000858', '0.000351', '0.070616', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000011']

USM_Pinned_Overhead_fp32_DeviceHost_Pinned_Init_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_pinned_overhead --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Pinned_Overhead_multi.csv

Output:

['USM_Pinned_Overhead_fp32_DeviceHost_Pinned_Init_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000029', '0.000019', '0.000018', '0.000018 0.000019 0.000050', '0.000018', '0.634443', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000011']

USM_Pinned_Overhead_fp32_HostDevice_Pinned_Init_1

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/usm_pinned_overhead --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/USM_Pinned_Overhead_multi.csv

Output:

['USM_Pinned_Overhead_fp32_HostDevice_Pinned_Init_1', 'N/A', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000046', '0.000011', '0.000009', '0.000009 0.000011 0.000118', '0.000062', '1.222137', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000011']

VectorAddition_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/vec_add --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/VectorAddition_multi.csv

Output:

['VectorAddition_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000040', '0.000033', '0.000029', '0.000029 0.000033 0.000059', '0.000016', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

VectorAddition_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/vec_add --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/VectorAddition_multi.csv

Output:

['VectorAddition_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000049', '0.000043', '0.000039', '0.000039 0.000043 0.000067', '0.000015', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

VectorAddition_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/vec_add --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/VectorAddition_multi.csv

Output:

['VectorAddition_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000068', '0.000037', '0.000031', '0.000031 0.000037 0.000135', '0.000058', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_2DConvolution

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/2DConvolution --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/2DConvolution.csv

Output:

['Polybench_2DConvolution', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000235', '0.000229', '0.000213', '0.000213 0.000229 0.000262', '0.000025', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_2mm

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/2mm --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/2mm.csv --size=512

Output:

['Polybench_2mm', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.001250', '0.001239', '0.001235', '0.001235 0.001239 0.001276', '0.000022', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_3mm

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/3mm --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/3mm.csv --size=512

Output:

['Polybench_3mm', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.001752', '0.001746', '0.001727', '0.001727 0.001746 0.001782', '0.000028', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_Arith_fp32_512

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/arith --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Arith_int32_512.csv --size=16384

Output:

['MicroBench_Arith_fp32_512', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '16384', '0.000045', '0.000032', '0.000029', '0.000029 0.000032 0.000073', '0.000025', '1081.015636', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.031250']

MicroBench_Arith_int32_512

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/arith --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Arith_int32_512.csv --size=16384

Output:

['MicroBench_Arith_int32_512', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '16384', '0.000154', '0.000073', '0.000059', '0.000059 0.000073 0.000329', '0.000152', '527.595347', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.031250']

Polybench_Atax

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atax --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Atax.csv --size=8192

Output:

['Polybench_Atax', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '8192', '0.006899', '0.006903', '0.006883', '0.006883 0.006903 0.006912', '0.000015', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ReductionAtomic_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atomic_reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ReductionAtomic_fp64.csv

Output:

['ReductionAtomic_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000047', '0.000040', '0.000038', '0.000038 0.000040 0.000062', '0.000013', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ReductionAtomic_int32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atomic_reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ReductionAtomic_fp64.csv

Output:

['ReductionAtomic_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000076', '0.000042', '0.000036', '0.000036 0.000042 0.000149', '0.000063', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ReductionAtomic_int64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atomic_reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ReductionAtomic_fp64.csv

Output:

['ReductionAtomic_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000276', '0.000041', '0.000036', '0.000036 0.000041 0.000750', '0.000411', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ReductionAtomic_fp64

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/atomic_reduction --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/ReductionAtomic_fp64.csv

Output:

['ReductionAtomic_fp64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000262', '0.000043', '0.000039', '0.000039 0.000043 0.000704', '0.000383', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Bicg

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/bicg --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Bicg.csv --size=20480

Output:

['Polybench_Bicg', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '8192', '0.005132', '0.005132', '0.005132', '0.005132', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Correlation

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/correlation --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Correlation.csv --size=2048

Output:

['Polybench_Correlation', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.095493', '0.095493', '0.095493', '0.095493', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Covariance

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/covariance --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Covariance.csv --size=2048

Output:

['Polybench_Covariance', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.094557', '0.094557', '0.094557', '0.094557', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Gesummv

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/gesummv --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Gesummv.csv --size=8192

Output:

['Polybench_Gesummv', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '8192', '0.007312', '0.007312', '0.007312', '0.007312', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Gramschmidt

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/gramschmidt --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Gramschmidt.csv --size=512

Output:

['Polybench_Gramschmidt', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.285060', '0.285060', '0.285060', '0.285060', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Kmeans_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/kmeans --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Kmeans.csv --size=700000000

Output:

['Kmeans_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '67108864', '0.001798', '0.001795', '0.001767', '0.001767 0.001795 0.001833', '0.000033', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

LinearRegressionCoeff_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/lin_reg_coeff --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/LinearRegressionCoeff.csv --size=1638400000

Output:

['LinearRegressionCoeff_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.001656', '0.001383', '0.000716', '0.000716 0.001383 0.002868', '0.001101', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

LinearRegression_fp32

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/lin_reg_error --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/LinearRegression.csv --size=640000

Output:

['LinearRegression_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000387', '0.000363', '0.000349', '0.000349 0.000363 0.000449', '0.000054', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MatmulChain

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/matmulchain --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/MatmulChain.csv --size=2048

Output:

['MatmulChain', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '1024', '0.011043', '0.011043', '0.011043', '0.011043', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MolecularDynamics

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/mol_dyn --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/MolecularDynamics.csv --size=8196

Output:

['MolecularDynamics', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '3072', '0.000080', '0.000066', '0.000057', '0.000057 0.000066 0.000118', '0.000033', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Mvt

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/mvt --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Mvt.csv --size=32767

Output:

['Polybench_Mvt', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '16384', '0.003648', '0.003648', '0.003648', '0.003648', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_sf_fp32_16

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/sf --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/sf_16.csv --size=--size=100000000

Output:

['MicroBench_sf_fp32_16', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '0', '0.000041', '0.000025', '0.000021', '0.000021 0.000025 0.000076', '0.000031', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.000000']

Polybench_Syr2k

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/syr2k --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Syr2k.csv --size=6144

Output:

['Polybench_Syr2k', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '1024', '0.006356', '0.006356', '0.006356', '0.006356', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Syrk

Environment Variables:

Command:

/home/test-user/bench_workdir/sycl-bench-build/syrk --warmup-run --num-runs=3 --output=/home/test-user/bench_workdir/Syrk.csv --size=4096

Output:

['Polybench_Syrk', 'FAIL', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '1024', '0.003222', '0.003222', '0.003222', '0.003222', '0.000000', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']
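
The result rows above are printed as Python-style lists, so they can be summarized with a few lines of code. The following is a minimal sketch, not part of the benchmark scripts in this PR: the helper name `summarize` and the column positions (name at index 0, verification status at index 1, space-separated timing samples at index 14) are assumptions inferred only from the rows shown in this log.

```python
import ast

# Assumed column positions, inferred from the rows printed in this log;
# they are not taken from the sycl-bench documentation.
NAME, STATUS, SAMPLES = 0, 1, 14


def summarize(row_text):
    """Parse one printed result row into (name, status, samples)."""
    row = ast.literal_eval(row_text)  # the log prints rows as list literals
    samples = [float(s) for s in row[SAMPLES].split()]
    return row[NAME], row[STATUS], samples


# Row copied verbatim from the Polybench_Syrk output above.
syrk_row = (
    "['Polybench_Syrk', 'FAIL', 'Intel(R) Data Center GPU Max 1100', "
    "'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '1024', "
    "'0.003222', '0.003222', '0.003222', '0.003222', '0.000000', "
    "'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', "
    "'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']"
)

name, status, samples = summarize(syrk_row)
print(name, status, len(samples))
```

In this log, the rows marked `FAIL` carry a single timing sample, while the `PASS` rows carry three, so counting samples per row is one quick way to spot the runs that need a closer look.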

@pbalcer pbalcer merged commit d9d24ec into oneapi-src:main Sep 30, 2024
63 of 70 checks passed

github-actions bot commented Oct 7, 2024

Compute Benchmarks level_zero run (with params: ):
https://github.com/oneapi-src/unified-runtime/actions/runs/11214295994


github-actions bot commented Oct 7, 2024

Compute Benchmarks level_zero run ():
https://github.com/oneapi-src/unified-runtime/actions/runs/11214295994
Job status: failure. Test status: skipped.

Labels
ci/cd Continuous integration/delivery
2 participants