Re-implement SYCL backend parallel_for to improve bandwidth utilization
#210
| Job | Run time |
|---|---|
| | 15m 41s |
| | 11s |
| | 18m 6s |
| | 24m 44s |
| | 4m 29s |
| | 16m 43s |
| | 12m 9s |
| | 12m 17s |
| | 10m 2s |
| | 6m 11s |
| | 9m 16s |
| | 9m 12s |
| | 11m 12s |
| | 8m 53s |
| | 11m 24s |
| | 11m 53s |
| | 5m 4s |
| | 27m 10s |
| | 18m 0s |
| | 18m 19s |
| Total | 4h 10m 56s |