HIP profiling submission time query returns weird values #12904
Comments
Hi @steffenlarsen, I've been trying to reproduce this issue and experimenting with a fix for it, but I've now concluded that we're already doing what you suggested. I think the relevant line in the HIP adapter is here: https://github.com/oneapi-src/unified-runtime/blob/a504ead8b9fa3b70ca90d40e21bd417eee2f204b/source/adapters/hip/event.cpp#L64 We're calling
So far I've been trying to check whether the following two asynchronous events give the same timings, even if we change their order:
event e = q.submit([&](handler &h) {
  h.parallel_for(array_size_small, [=](id<1> i) {
    sum[i] = a[i] + b[i];
  });
});
event e2 = q.submit([&](handler &h) {
  h.parallel_for(array_size_big, [=](id<1> i) {
    sumTwo[i] = c[i] + d[i];
  });
});
Assuming the second event runs much longer than the first, if we're using the NULL stream the first event will wait until the second is finished (according to what you've found above). But that's not what I've seen: the kernel with array_size_small gives the same timing results whether it runs in the first or the second event. So the root cause might be somewhat more complicated.
@konradkusiak97 - I do not know for certain if the exact issue is as I described, but I would say the quote is a glaring issue compared to the semantics of the corresponding CUDA interfaces. The intention of Arguably For tests showing weird behavior, you could try enabling sycl/test-e2e/ProfilingTag tests for HIP.
You're right, for
Thank you, @konradkusiak97! Would you mind checking if we can reenable the profiling tag tests on HIP?
Checked it a while ago actually, they all seem to pass on
oneapi-src/unified-runtime#1634 is believed to have fixed the issues for HIP in the profiling tag extension. This commit reenables the tests for HIP. Closes intel#12904. Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Describe the bug
In #12838 it seems like the submission time on HIP is giving weird values. I did a bit of digging and it seems to me like HIP is a little different from CUDA when checking timing-differences between events. Of particular interest here is the following line for hipEventElapsedTime():
What we expect here, however, is to get an event with the current time, hence the use of an otherwise unused stream. Fixing it might be outside the scope of this PR, but a possible solution could be to lazily create a stream dedicated to recording the submission time of events, tied to the context. A similar approach could be used in the CUDA backend to avoid the assumption noted above.
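A rough sketch of the "lazy, per-context timing stream" idea described above, written in plain C++ so it can be compiled without a GPU toolchain. `MockStream` and `Context` are hypothetical stand-ins and not the adapter's actual types; a real HIP or CUDA adapter would presumably create a native stream at the point marked in the comments.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a native stream handle (assumption: a real
// adapter would hold something like a HIP or CUDA stream here).
struct MockStream {
  int Id;
};

// Hypothetical context that owns a single stream reserved for recording
// event submission timestamps. The stream is created on first use only.
class Context {
public:
  // Returns the dedicated timing stream, creating it lazily.
  MockStream &getTimingStream() {
    std::lock_guard<std::mutex> Guard(TimingStreamMutex);
    if (!TimingStream) {
      // A real adapter would create a native stream here instead.
      TimingStream = std::make_unique<MockStream>(MockStream{1});
      ++StreamsCreated;
    }
    return *TimingStream;
  }

  // Number of timing streams created so far (0 or 1 in this sketch).
  int streamsCreated() const { return StreamsCreated; }

private:
  std::mutex TimingStreamMutex;
  std::unique_ptr<MockStream> TimingStream;
  int StreamsCreated = 0;
};
```

With this shape, a context that never enables profiling never pays for the extra stream, and every later call to `getTimingStream()` on the same context returns the same stream, so submission timestamps are not recorded on a stream that also carries user work.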
Originally posted by @steffenlarsen in oneapi-src/unified-runtime#1400 (comment)
To reproduce
No response
Environment
No response
Additional context
This affects #12838, but should be reproducible on normal profiling queues. When this is fixed, sycl/test-e2e/ProfilingTag/ tests should be enabled for HIP.