forked from intel/llvm
-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][Graph] Fix access modes not being respected #359
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Collaborator
Bensuo
commented
Mar 7, 2024
- Fix access modes not being respected and creating unnecessary edges
- Update printing E2E tests since output has changed
- Add unit tests for access modes
…2951) Fix for intel#12177 Now even if the waitlist consists of only empty events we still invoke the scheduler to submit barrier command. Scheduler creates sycl event with associated Barrier Command but doesn't actually enqueue a barrier because waitlist is empty (consists of only empty events). As a result, the event returned by such barrier doesn't have PI handle but it supposed to have. If such event is then passed to some another command later as a dependency then it leads to nullptr dereference. So, filter out such events from waitlist to avoid going to sheduler in such scenarios. Test provided by Andrey Alekseenko <andrey.alekseenko@scilifelab.se> Co-authored-by: Andrey Alekseenko <andrey.alekseenko@scilifelab.se>
…l#12949) Currently if ext_oneapi_barrier without waitlist is submitted to the in-order queue that doesn't have the last command (empty queue) then we return default constructed event which doesn't have profiling info because it is not associated with any queue. Associate such event with the queue and record submission time which is equal to the start time and the end time for such event because it basically corresponds to NOP.
…tel#12947) If it's no RDC, the extension will definitely be BC. We need to switch the condition order. Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
… implementation (intel#12723) Test changes to the UR CUDA/HIP adapters in oneapi-src/unified-runtime#1340 that fix Coverity issues in the implementation of kernel command update in a command-buffer. --------- Co-authored-by: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
This PR adds the value_type alias for the data type of a sycl::vec object as described in the spec. --------- Co-authored-by: Alexey Bader <alexey.bader@intel.com>
…ntel#12940) This PR changes the git tag for the oneAPI Construction Kit, pointing to a commit in the `main` branch. It also adds CMake variables to either use a local checkout of the OCK, or overwrite the git repo/tag used by `FetchContent`.
julianmi
reviewed
Mar 11, 2024
The graph extension tests are currently skipped during execution for devices which don't support the graphs extension. However, this early return causes the tests to be reported as passed and makes it hard from looking at the results to know if the tests actually stressed the graphs code or not. Improved this situation by changing the SYCL-Graph device info query to an aspect such that `sycl-ls --verbose` will output `ext_oneapi_graph` for supported devices. This can then be used to inform the LIT config and set a requirement for tests, enabling the tests to be obviously skipped for devices that don't support graphs. To enable setting this requirement in `lit.local.cfg` files some extra directories have been created, in particular `UnsupportedDevice` which contains tests that don't have a requirement as the tests verify expected errors are thrown when using the graphs API with unsupported devices. The removal of the device info query means that we can no longer report if a device emulates support for SYCL-Graph, however we currently have no such implementations as they haven't yet deemed to provide enough value. This is technically an ABI breaking change however due to the removal of symbols, but SYCL-Graph is currently an experimental extension so such changes may be permitted.
) Add a test for demangle kernel name for UR PR oneapi-src/unified-runtime#1355
This commit splits the vec_binary_scalar_order test to improve coverage and reduce the compile-time overhead. --------- Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
This PR prepares SYCLcompat to receive multiple PRs containing updates to the helper headers. It will substitute the previous opened PR as it grew in scope too much, as in case of issues it would be difficult to track and resolve effectively the problem. Edit: Previous PR was intel#11267
This PR cleans up the API of the sub-group class by either removing functions that do not appear in the spec or marking them deprecated. The only exception is the set of load/store functions, which after some discussion with the spec team, it was suggested to leave them as is for now.
…tel#12915) Current SYCL device sanitizer can be ported to other offloading solution such as openmp but sanitizer libdevice uses some SYCL specific macro or code which is only valid in SYCL scenario, this PR splits these code/macros. --------- Signed-off-by: jinge90 <ge.jin@intel.com>
E2E are ignored since we added "tests_selector: cts"
EwanC
approved these changes
Mar 11, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, would be worth referencing the issue URL intel#12473 in the PR description when opening this upstream.
- Removes printing to cout from check for ext_oneapi_graph - Caused garbled output when running sycl-ls and LIT
Adds a templated wrapper to sycl::length in syclcompat, that wraps over `sycl::vec<ValueT,N>` up to N == 4, and calculates the length with N > 4. --------- Signed-off-by: Alberto Cabrera <alberto.cabrera@codeplay.com>
…fpga_cluster kernel property (intel#12453) Implement task_sequence header, task_sequence properties, and fpga_cluster kernel property according to the spec updates intel#6348 (almost ready to be merged) Header - Refactor invocation and response capacity to be on the Create intrinsic, so Async and Get do not take in respective argument - Remove max_outstanding argument of Create intrinsic - Add pipelined and fpga_cluster arguments to Create intrinsic - Async, Get, and Release intrinsic now accept the TargetExtType target("spirv.TaskSequenceINTEL") returned from the Create intrinsic, and do not take in object pointer or task function pointer arguments - Task Sequence accepts properties Task Sequence Properties - Convert invocation and response capacity from template parameters to properties - Add balanced property to remove Get intrinsic loop in destructor FPGA Kernel properties - Add fpga_cluster property - Support fpga_cluster and pipelined property for task sequence
…#12989) Test UR PR oneapi-src/unified-runtime#1426 that default initializes CUDA_KERNEL_NODE_PARAMS to prevents errors from trying to set data members which are not present before version 12.0 --------- Co-authored-by: Kenneth Benzie (Benie) <k.benzie83@gmail.com>
julianmi
approved these changes
Mar 12, 2024
This PR adds multiple binary and unary ops through callable structs: - abs - abs_diff - add_sat - rhadd - hadd - maximum - minimum - sub_sat It also adds `vectorized_sum_abs_diff`. To facilitate testing, it also incorporates a set of fixtures/helper classes to simplify future additions to math.hpp
…ntel#12977) Adds helpers to calculate the occupancy of the GPU Testing currently only addresses API.
Enable more tests. The rest takes too much time. The current set takes ~50min to build and run. See: https://github.com/intel/llvm/actions/runs/8247563907/job/22555980586
Depends on intel#12965 merging first. --------- Signed-off-by: Raiyan Latif <raiyan.latif@intel.com> Co-authored-by: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
Treat warnings as errors, and fix the last remaining warnings in the fusion library. --------- Signed-off-by: Julian Oppermann <julian.oppermann@codeplay.com>
- Fix access modes not being respected and creating unnecessary edges - Update printing E2E tests since output has changed - Add unit tests for access modes
Bensuo
force-pushed
the
ben/accessor-modes
branch
from
March 13, 2024 15:15
5474b6a
to
d89a3ea
Compare
Closed in favour of upstream PR: intel#13011 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.