-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial support for sycl_ext_oneapi_composite_device #1192
Initial support for sycl_ext_oneapi_composite_device #1192
Conversation
Signed-off-by: Maronas, Marcos <marcos.maronas@intel.com>
Signed-off-by: Maronas, Marcos <marcos.maronas@intel.com>
Codecov ReportAttention:
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #1192 +/- ##
==========================================
- Coverage 15.80% 15.79% -0.02%
==========================================
Files 223 223
Lines 31483 31513 +30
Branches 3558 3561 +3
==========================================
+ Hits 4975 4976 +1
- Misses 26457 26486 +29
Partials 51 51 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HIP and CUDA adapter changes look good to me.
Signed-off-by: Maronas, Marcos <marcos.maronas@intel.com>
Signed-off-by: Maronas, Marcos <marcos.maronas@intel.com>
Signed-off-by: Maronas, Marcos <marcos.maronas@intel.com>
@pbalcer @aarongreig I'm unable to reproduce the CI errors in my local machine. Any tips on how to reproduce? |
if (!isCombinedMode) { | ||
ze_device_handle_t RootDev = nullptr; | ||
// Query Root Device | ||
ZE2UR_CALL(zeDeviceGetRootDevice, (D->ZeDevice, &RootDev)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a new entry point? I'm wondering if the GitHub Actions failures could be due to an older driver being installed or something like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kbenzie It is a new entry point, but as far as I see, the CI uses level zero v1.15.1 which includes this entry. What I'm seeing is that zeDeviceGetRootDevice
returns ZE_RESULT_ERROR_UNSUPPORTED_FEATURE
under certain circumstances (https://github.com/oneapi-src/level-zero/blob/1685d01497428ca4d8b99200972b64685424d5c9/source/lib/ze_libapi.cpp#L463). The CI is running into this, and that makes ZE2UR_CALL
fail in UR_CHECK
. I'm not sure how to deal with this, honestly. As of now, I think only PVC should care about this, but I don't think we should have code specific to PVC in UR. Otherwise, we can use ZE_CALL_NOCHECK
rather than ZE2UR_CALL
and check manually that the error code is either success or unsupported feature. What do you think? Any other suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, I just checked locally using ZE_CALL_NOCHECK
and all the tests pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checking the result of ZE_CALL_NOCHECK
manually sounds acceptable if there's a safe fallback, I think the adapter does that in other places where stuff might not be supported. @oneapi-src/unified-runtime-level-zero-write should be consulted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 15d2953
I believe we test L0 on a machine with an A750. |
Thanks @kbenzie. @pbalcer gave me access to a machine with such hardware. I'm trying to reproduce the issue in that machine now. |
@oneapi-src/unified-runtime-level-zero-write @oneapi-src/unified-runtime-native-cpu-write can I have a review please? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Native CPU lgtm, sorry for the late review
Signed-off-by: Maronas, Marcos <marcos.maronas@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, just one minor nit.
Signed-off-by: Maronas, Marcos <marcos.maronas@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for level zero
I think this is ready to merge @oneapi-src/unified-runtime-maintain. |
could you update https://github.com/intel/llvm/pull/12178/files#diff-1bf2ed05c0065126d4e2150f0eda4dbd561e5c3625452e96356aadafa1312e6fR64 to point at https://github.com/maarquitos14/unified-runtime/tree/maronas/ext_composite_device so the changes here get tested @maarquitos14 (assuming that PR is going to need to pull the changes in this PR in) |
Done in intel/llvm@c559036. |
thanks, you'll need to make sure the CI on intel/llvm side is green as well before we can merge this |
CI on intel/llvm is finally all green. How should we proceed now? |
The ready to merge label represents a queue, when we get to this PR we'll just merge it (assuming no conflicts or whatever come up between now and then) and then ping you on the llvm PR to update the UR tag. You should try to get approvals on the LLVM PR so it doesn't hold up subsequent merges when this PR goes in |
Initial implementation to support `sycl_ext_oneapi_composite_device` specified in #11846. Depends on oneapi-src/unified-runtime#1192. --------- Signed-off-by: Maronas, Marcos <marcos.maronas@intel.com> Signed-off-by: Marcos Maronas <marcos.maronas@intel.com>
Initial implementation to support
sycl_ext_oneapi_composite_device
specified in intel/llvm#11846.