[SYCL][CUDA] Improve kernel launch error handling for out-of-registers #12604
Merged: sommerlukas merged 16 commits into intel:sycl from GeorgeWeb:georgi/sycl-cuda-out-of-resources-registers-error on Jun 3, 2024
Commits
c9d329b [SYCL][CUDA] Improve kernel launch error handling for out-of-registers (GeorgeWeb)
5b7cbec Add default fallback error msg for out-of-resources (GeorgeWeb)
cd4a8f8 Cleanup test (GeorgeWeb)
7606868 Override UR tag to fetch for testing (GeorgeWeb)
95d8ced Merge remote-tracking branch 'upstream/sycl' into georgi/sycl-cuda-ou… (GeorgeWeb)
4d6eb99 Update UR tag (GeorgeWeb)
7e258e9 Merge remote-tracking branch 'upstream/sycl' into georgi/sycl-cuda-ou… (GeorgeWeb)
90b4ab3 Address review comments (GeorgeWeb)
6fcadf4 Merge remote-tracking branch 'upstream/sycl' into georgi/sycl-cuda-ou… (GeorgeWeb)
1837e8f Update use of sycl1.2.1 runtime_error to sycl2020 exception (GeorgeWeb)
6f63a73 Merge remote-tracking branch 'upstream/sycl' into georgi/sycl-cuda-ou… (GeorgeWeb)
179f750 Merge remote-tracking branch 'upstream/sycl' into georgi/sycl-cuda-ou… (GeorgeWeb)
d7453dd Update UR commit tag (GeorgeWeb)
9bcbe59 Merge remote-tracking branch 'upstream/sycl' into georgi/sycl-cuda-ou… (GeorgeWeb)
fbae8ea Update UR tag (GeorgeWeb)
a868e0f Revert overriding of FetchContent for UR (GeorgeWeb)
@GeorgeWeb, I was looking at the PR and got confused by it. Could you please clarify what the key change is here? The PR description says that we used to display a confusing error to users, but the check for the exception message wasn't changed by the PR.
What was the confusing part, then? I see that in #12363 we had a bug report which contains an invalid message, but this test was introduced in #9106, a year before that issue was submitted. What am I missing here?
In summary, the wrong part of the message, as per the PR's description, was just the plugin error code description: PI_ERROR_INVALID_WORK_GROUP_SIZE should have been PI_ERROR_OUT_OF_LAUNCH_RESOURCES. I added the errc::nd_range error code check only for more verbosity; it wasn't of importance here.
The real issue about "reporting a completely wrong message" in the OP's report (#12363) was due to a mistake on this line https://github.com/intel/llvm/pull/9106/files#diff-7525901710934f7bdb2ad36238c4b67163f112d3bd233db7af0b0078b5b01e80R3263, which was fixed by this UR CUDA change: oneapi-src/unified-runtime#1299
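For readers following the thread, here is a minimal, self-contained sketch of the idea discussed above: a launch failure caused by exhausted resources is reported under the out-of-launch-resources code name, while the exception still carries an nd_range error code that a test can check. It is not the actual SYCL, PI, or Unified Runtime code. Every type and function in it (DriverResult, PluginError, Errc, LaunchException, toPluginError, checkLaunch, reportsOutOfResources) is a hypothetical stand-in written in plain standard C++, and the message wording is assumed.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <system_error>

// Hypothetical stand-ins only; none of these are real SYCL, PI, or UR types.
enum class DriverResult { Success, LaunchOutOfResources, Other };
enum class PluginError { Success, InvalidWorkGroupSize, OutOfLaunchResources, Unknown };
enum class Errc { nd_range = 1, runtime = 2 };

// Error category so the sketch can carry a std::error_code, similar in shape
// to how sycl::exception exposes code().
class SketchCategory : public std::error_category {
public:
  const char *name() const noexcept override { return "sketch"; }
  std::string message(int Value) const override {
    return Value == static_cast<int>(Errc::nd_range) ? "nd_range" : "runtime";
  }
};

inline const std::error_category &sketchCategory() {
  static SketchCategory Instance;
  return Instance;
}

inline std::error_code makeCode(Errc E) {
  return std::error_code(static_cast<int>(E), sketchCategory());
}

// Exception carrying both a message and an error code.
class LaunchException : public std::runtime_error {
public:
  LaunchException(std::error_code EC, const std::string &Msg)
      : std::runtime_error(Msg), Code(EC) {}
  const std::error_code &code() const noexcept { return Code; }

private:
  std::error_code Code;
};

// Adapter layer (assumed): translate the driver result into a plugin error.
// An out-of-resources launch maps to OutOfLaunchResources rather than
// InvalidWorkGroupSize.
inline PluginError toPluginError(DriverResult R) {
  switch (R) {
  case DriverResult::Success:
    return PluginError::Success;
  case DriverResult::LaunchOutOfResources:
    return PluginError::OutOfLaunchResources;
  default:
    return PluginError::Unknown;
  }
}

inline const char *pluginErrorName(PluginError E) {
  switch (E) {
  case PluginError::Success:
    return "PI_SUCCESS";
  case PluginError::InvalidWorkGroupSize:
    return "PI_ERROR_INVALID_WORK_GROUP_SIZE";
  case PluginError::OutOfLaunchResources:
    return "PI_ERROR_OUT_OF_LAUNCH_RESOURCES";
  default:
    return "PI_ERROR_UNKNOWN";
  }
}

// Runtime layer (assumed): throw with an nd_range code for the
// out-of-launch-resources case and a generic runtime code otherwise.
inline void checkLaunch(DriverResult R) {
  const PluginError E = toPluginError(R);
  if (E == PluginError::Success)
    return;
  const Errc C =
      E == PluginError::OutOfLaunchResources ? Errc::nd_range : Errc::runtime;
  throw LaunchException(makeCode(C),
                        std::string("kernel launch failed: ") + pluginErrorName(E));
}

// Test-style helper: true only if the failure carries the nd_range code and
// names the out-of-launch-resources plugin error in its message.
inline bool reportsOutOfResources(DriverResult R) {
  try {
    checkLaunch(R);
  } catch (const LaunchException &Ex) {
    return Ex.code() == makeCode(Errc::nd_range) &&
           std::string(Ex.what()).find("PI_ERROR_OUT_OF_LAUNCH_RESOURCES") !=
               std::string::npos;
  }
  return false;
}
```

Calling reportsOutOfResources(DriverResult::LaunchOutOfResources) exercises both checks the thread mentions: the error code and the plugin error name embedded in the message. Keeping the two checks separate reflects the point made in the reply, that the error code check adds verbosity while the message text is what users actually see.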