Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Throw an exception when usm_shared_allocations aspect not supported #12520

Closed
wants to merge 24 commits into from

Conversation

lbushi25
Copy link
Contributor

@lbushi25 lbushi25 commented Jan 29, 2024

This PR aligns the behavior of malloc_shared with the spec by having the function throw an exception if the device does not support the usm_device_allocations aspect. The responsible changes are located in usm_impl.cpp file. The rest of the modified files are tests that were also not conforming to the spec and were not expecting exceptions to be thrown on failure of malloc_shared. As a result, these tests were failing under this new change with an unhandled exception error.

These tests have been changed in one of two ways.
The vast majority of these tests depended almost entirely on the result of malloc_shared so I have disabled these tests on devices which do not support the usm_shared_allocations aspect by prepending // REQUIRES: usm_device_allocations.
The few remaining tests only used malloc_shared in subtests or malloc_shared was not a criticial point in the test and so I have only disabled these sections that use malloc_shared using if statements that check whether the device supports the aspect and this allows us to keep most of the test available even in devices that do not support usm_shared_allocations.

@lbushi25 lbushi25 changed the title Throw an exception when usm_shared_allocations aspect not supported [SYCL] Throw an exception when usm_shared_allocations aspect not supported Jan 29, 2024
Copy link
Contributor

github-actions bot commented Jan 29, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@lbushi25 lbushi25 marked this pull request as ready for review February 1, 2024 18:19
@lbushi25 lbushi25 requested review from a team as code owners February 1, 2024 18:19
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was actually performing the aspect check however the reason it was failing was because it had some common repetition-caused bugs such as below where shared should be device as described by the comment above.

Copy link
Contributor

@victor-eds victor-eds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KernelFusion test changes LGTM

Copy link
Contributor

@Alcpz Alcpz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SYCLcompat changes LGTM

@@ -1,3 +1,4 @@
// REQUIRES: usm_shared_allocations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be aspect-usm_shared_allocations?

aspect_features = set("aspect-" + a for a in aspects)
sg_size_features = set("sg-" + s for s in sg_sizes)
features = set()
features.update(aspect_features)

Copy link
Contributor Author

@lbushi25 lbushi25 Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I noticed that most tests were using aspect-{ASPECT_NAME} format but it seems that this format works as well. Indeed, if i remove it the tests go back to failing. Perhaps @againull can confirm that both work, that is, using aspect-usm_shared_allocations or just using usm_shared_allocations?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it works, most likely it will never be run. One can achieve the same by REQUIRES: dont-ever-run-me.

@@ -1,3 +1,4 @@
// REQUIRES: usm_shared_allocations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes like this effectively mean that we are potentially reducing validation coverage for a feature, simply because another feature doesn't work. That doesn't look right, especially considering that using USM for kernel-device communication is not the only way and using regular buffer + accessor is actually guaranteed to always work.

I think that instead of doing REQUIRES, we should do a rewrite so that tests are always launched and they do not depend on USM shared allocations (unless their purpose is to test that kind of allocations)

Copy link
Contributor Author

@lbushi25 lbushi25 Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your comment about rewriting usm shared allocations with buffers convinced me that my PR is probably not the best way to move forward long term. That said though, this now means this JIRA and the PR will have to be blocked until all those tests are rewritten. I personally think since there are many such tests that it is better to just create JIRAs for these tests by grouping tests in the same directory together and we can probably split these JIRAs among several people to speed it up or if its not time sensitive I can just take them all in my backlog and start working one by one.

@@ -40,7 +40,8 @@ TEST_F(SchedulerTest, InOrderQueueHostTaskDeps) {

context Ctx{Plt};
queue InOrderQueue{Ctx, default_selector_v, property::queue::in_order()};

if (!InOrderQueue.get_device().has(aspect::usm_shared_allocations))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need any changes in unit-tests? Those are run on mock devices and we can just make mock device support necessary APIs to simplify things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, was not aware of that, thanks!

[&]() { return MShared(d, ctx); },
[&]() { return MShared(q, property_list{}); },
[&]() { return MShared(d, ctx, property_list{}); }});
if (d.has(aspect::usm_shared_allocations) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, won't d.has(...) and q.get_device().has(...) be equivalent?

@lbushi25
Copy link
Contributor Author

lbushi25 commented Feb 6, 2024

Closing this as a different plan has been laid out that consists of modifying the tests with a separate PR and then pushing only the malloc_shared change in its own PR.

@lbushi25 lbushi25 closed this Feb 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants