Merge changes from multi device compile extension into core spec. #1195
base: main
Conversation
Force-pushed from 6c4b652 to 3b9b0d8.
Codecov Report
Coverage Diff (main vs. #1195):
  Coverage   15.46% → 15.37%  (-0.10%)
  Files      238 → 238
  Lines      33883 → 33689  (-194)
  Branches   3747 → 3714  (-33)
  Hits       5239 → 5178  (-61)
  Misses     28593 → 28461  (-132)
  Partials   51 → 50  (-1)
Force-pushed from 3b9b0d8 to f4bbddf.
Force-pushed from f26711e to 38c3936.
Can you add a description for this PR, please? In my experience with CUDA/HIP, and setting aside MPI (which is a completely different mechanism), I am not aware of any multi-device compilation process where a single compilation produces a different binary for each device, nor of any CUDA/HIP documentation covering that use case.
Clarifying the intended use case and compilation model would help with review, allowing me to investigate any potential backend-specific issues more deeply. Otherwise, the only thing I can check is that it doesn't break the standard case where a single binary is produced for all devices.
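For context, the "standard case" referred to here might look like the following CUDA driver-API sketch: one compiled image (e.g. a fatbin produced by a single nvcc invocation) loaded unchanged on every device. This is purely an illustration and not anything from this PR; the function name and the embedded-image parameter are hypothetical, while the driver-API calls themselves are real.

```c
#include <cuda.h> /* CUDA driver API; error checking omitted for brevity */

/* Hypothetical sketch: a single compiled image is loaded unchanged on
 * every visible device; the driver selects the matching embedded
 * SASS/PTX for each architecture from the one fatbin. */
void load_on_all_devices(const void *fatbin_image) {
    int count = 0;
    cuInit(0);
    cuDeviceGetCount(&count);
    for (int i = 0; i < count; ++i) {
        CUdevice dev;
        CUcontext ctx;
        CUmodule mod;
        cuDeviceGet(&dev, i);
        cuCtxCreate(&ctx, 0, dev);
        cuModuleLoadData(&mod, fatbin_image); /* same image everywhere */
        cuModuleUnload(mod);
        cuCtxDestroy(ctx);
    }
}
```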
The extension was a last-minute fix for issues Level Zero ended up having with a redesign; this PR fully reverts that redesign by merging the extension into the core spec.
Is there an accompanying PR in intel/llvm to change this line? https://github.com/intel/llvm/blob/f4b4a84b653fb77a9db834d4f588ac347681ea30/sycl/plugins/unified_runtime/pi2ur.hpp#L2068
…not doing a good job of following my own project's process here.
Force-pushed from 38c3936 to 46722a5.
Looks good to me. Also makes things cleaner in OpenCL 👍
LGTM for Level Zero; this cleans things up greatly, as long as we are not breaking customers that were expecting the previous behavior.
Native CPU LGTM, thank you
LGTM
Force-pushed from 46722a5 to b5c3e7c.
Force-pushed from b5c3e7c to d7a1984.
Force-pushed from d7a1984 to 89abf48.
Force-pushed from e635cac to 1af8237.
We originally changed this interface a bit from PI: currently these entry points take a context, which represents your list of devices. That turned out to cause issues for the Level Zero adapter, so the extension was created to effectively revert back to the PI/CL-style interface. This PR reverts the core interface back to the PI style: everything takes an explicit device list instead of a context.
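For illustration, here is a minimal sketch of the signature change being described. It uses Unified Runtime naming conventions (urProgramBuild and the ur_*_handle_t types are real UR names), but the exact parameter lists shown are assumptions inferred from the description above, not copied from this PR's spec diff.

```c
#include <ur_api.h> /* Unified Runtime API header for the types below */

#if 0
/* Before: the device set is implicit in the context handle. */
ur_result_t urProgramBuild(
    ur_context_handle_t hContext, /* devices come from the context */
    ur_program_handle_t hProgram,
    const char *pOptions);
#else
/* After (PI/CL style, merged back from the extension): the caller
 * passes the device list explicitly, mirroring clBuildProgram. */
ur_result_t urProgramBuild(
    ur_program_handle_t hProgram,
    uint32_t numDevices,
    ur_device_handle_t *phDevices, /* explicit device list */
    const char *pOptions);
#endif
```

The explicit list means adapters such as Level Zero no longer have to recover per-device state from the context, which is the issue the extension was originally working around.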
LLVM testing: intel/llvm#12536