-
Notifications
You must be signed in to change notification settings - Fork 730
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Throw an exception when usm_shared_allocations aspect not supported #12520
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was actually performing the aspect check however the reason it was failing was because it had some common repetition-caused bugs such as below where shared should be device as described by the comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
KernelFusion
test changes LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SYCLcompat changes LGTM
@@ -1,3 +1,4 @@ | |||
// REQUIRES: usm_shared_allocations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be aspect-usm_shared_allocations
?
Lines 665 to 668 in d3d6e78
aspect_features = set("aspect-" + a for a in aspects) | |
sg_size_features = set("sg-" + s for s in sg_sizes) | |
features = set() | |
features.update(aspect_features) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I noticed that most tests were using aspect-{ASPECT_NAME} format but it seems that this format works as well. Indeed, if i remove it the tests go back to failing. Perhaps @againull can confirm that both work, that is, using aspect-usm_shared_allocations or just using usm_shared_allocations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it works, most likely it will never be run. One can achieve the same by REQUIRES: dont-ever-run-me
.
@@ -1,3 +1,4 @@ | |||
// REQUIRES: usm_shared_allocations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes like this effectively mean that we are potentially reducing validation coverage for a feature, simply because another feature doesn't work. That doesn't look right, especially considering that using USM for kernel-device communication is not the only way and using regular buffer + accessor is actually guaranteed to always work.
I think that instead of doing REQUIRES
, we should do a rewrite so that tests are always launched and they do not depend on USM shared allocations (unless their purpose is to test that kind of allocations)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your comment about rewriting usm shared allocations with buffers convinced me that my PR is probably not the best way to move forward long term. That said though, this now means this JIRA and the PR will have to be blocked until all those tests are rewritten. I personally think since there are many such tests that it is better to just create JIRAs for these tests by grouping tests in the same directory together and we can probably split these JIRAs among several people to speed it up or if its not time sensitive I can just take them all in my backlog and start working one by one.
@@ -40,7 +40,8 @@ TEST_F(SchedulerTest, InOrderQueueHostTaskDeps) { | |||
|
|||
context Ctx{Plt}; | |||
queue InOrderQueue{Ctx, default_selector_v, property::queue::in_order()}; | |||
|
|||
if (!InOrderQueue.get_device().has(aspect::usm_shared_allocations)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need any changes in unit-tests? Those are run on mock devices and we can just make mock device support necessary APIs to simplify things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, was not aware of that, thanks!
[&]() { return MShared(d, ctx); }, | ||
[&]() { return MShared(q, property_list{}); }, | ||
[&]() { return MShared(d, ctx, property_list{}); }}); | ||
if (d.has(aspect::usm_shared_allocations) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, won't d.has(...) and q.get_device().has(...) be equivalent?
Closing this as a different plan has been laid out that consists of modifying the tests with a separate PR and then pushing only the malloc_shared change in its own PR. |
This PR aligns the behavior of malloc_shared with the spec by having the function throw an exception if the device does not support the usm_device_allocations aspect. The responsible changes are located in usm_impl.cpp file. The rest of the modified files are tests that were also not conforming to the spec and were not expecting exceptions to be thrown on failure of malloc_shared. As a result, these tests were failing under this new change with an unhandled exception error.
These tests have been changed in one of two ways.
The vast majority of these tests depended almost entirely on the result of malloc_shared so I have disabled these tests on devices which do not support the usm_shared_allocations aspect by prepending // REQUIRES: usm_device_allocations.
The few remaining tests only used malloc_shared in subtests or malloc_shared was not a criticial point in the test and so I have only disabled these sections that use malloc_shared using if statements that check whether the device supports the aspect and this allows us to keep most of the test available even in devices that do not support usm_shared_allocations.