Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][Graph] Fix access modes not being respected #359

Closed
wants to merge 29 commits into from

Conversation

Bensuo
Copy link
Collaborator

@Bensuo Bensuo commented Mar 7, 2024

  • Fix access modes not being respected and creating unnecessary edges
  • Update printing E2E tests since output has changed
  • Add unit tests for access modes

@Bensuo Bensuo added the Graph Implementation Related to DPC++ implementation and testing label Mar 7, 2024
againull and others added 6 commits March 8, 2024 12:29
…2951)

Fix for intel#12177
Now even if the waitlist consists of only empty events we still invoke
the scheduler to submit barrier command. Scheduler creates sycl event
with associated Barrier Command but doesn't actually enqueue a barrier
because waitlist is empty (consists of only empty events). As a result,
the event returned by such barrier doesn't have PI handle but it
supposed to have. If such event is then passed to some another command
later as a dependency then it leads to nullptr dereference.

So, filter out such events from waitlist to avoid going to sheduler in
such scenarios.

Test provided by Andrey Alekseenko <andrey.alekseenko@scilifelab.se>
Co-authored-by:  Andrey Alekseenko <andrey.alekseenko@scilifelab.se>
…l#12949)

Currently if ext_oneapi_barrier without waitlist is submitted to the
in-order queue that doesn't have the last command (empty queue) then we
return default constructed event which doesn't have profiling info
because it is not associated with any queue. Associate such event with
the queue and record submission time which is equal to the start time
and the end time for such event because it basically corresponds to NOP.
…tel#12947)

If it's no RDC, the extension will definitely be BC. We need to switch
the condition order.

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
… implementation (intel#12723)

Test changes to the UR CUDA/HIP adapters in
oneapi-src/unified-runtime#1340 that fix
Coverity issues in the implementation of kernel command update in a
command-buffer.

---------

Co-authored-by: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
This PR adds the value_type alias for the data type of a sycl::vec
object as described in the spec.

---------

Co-authored-by: Alexey Bader <alexey.bader@intel.com>
…ntel#12940)

This PR changes the git tag for the oneAPI Construction Kit, pointing to
a commit in the `main` branch. It also adds CMake variables to either
use a local checkout of the OCK, or overwrite the git repo/tag used by
`FetchContent`.
EwanC and others added 8 commits March 11, 2024 11:52
The graph extension tests are currently skipped during execution for
devices which don't support the graphs extension. However, this early
return causes the tests to be reported as passed and makes it hard from
looking at the results to know if the tests actually stressed the graphs
code or not.

Improved this situation by changing the SYCL-Graph device info query to
an aspect such that `sycl-ls --verbose` will output `ext_oneapi_graph`
for supported devices. This can then be used to inform the LIT config
and set a requirement for tests, enabling the tests to be obviously
skipped for devices that don't support graphs.

To enable setting this requirement in `lit.local.cfg` files some extra
directories have been created, in particular `UnsupportedDevice` which
contains tests that don't have a requirement as the tests verify
expected errors are thrown when using the graphs API with unsupported
devices.

The removal of the device info query means that we can no longer report
if a device emulates support for SYCL-Graph, however we currently have
no such implementations as they haven't yet deemed to provide enough
value. This is technically an ABI breaking change however due to the
removal of symbols, but SYCL-Graph is currently an experimental
extension so such changes may be permitted.
This commit splits the vec_binary_scalar_order test to improve coverage
and reduce the compile-time overhead.

---------

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
This PR prepares SYCLcompat to receive multiple PRs containing updates
to the helper headers.

It will substitute the previous opened PR as it grew in scope too much,
as in case of issues it would be difficult to track and resolve
effectively the problem.

Edit: Previous PR was intel#11267
This PR cleans up the API of the sub-group class by either removing
functions that do not appear in the spec or marking them deprecated. The
only exception is the set of load/store functions, which after some
discussion with the spec team, it was suggested to leave them as is for
now.
…tel#12915)

Current SYCL device sanitizer can be ported to other offloading solution
such as openmp but sanitizer libdevice uses some SYCL specific macro or
code which is only valid in SYCL scenario, this PR splits these
code/macros.

---------

Signed-off-by: jinge90 <ge.jin@intel.com>
E2E are ignored since we added "tests_selector: cts"
Copy link
Collaborator

@EwanC EwanC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, would be worth referencing the issue URL intel#12473 in the PR description when opening this upstream.

bso-intel and others added 6 commits March 11, 2024 16:30
)

The get_info<> API returns a std::string which should be handled through
detail::string, ABI-neutral way.

---------

Signed-off-by: Byoungro So <byoungro.so@intel.com>
- Removes printing to cout from check for ext_oneapi_graph
- Caused garbled output when running sycl-ls and LIT
Adds a templated wrapper to sycl::length in syclcompat, that wraps over
`sycl::vec<ValueT,N>` up to N == 4, and calculates the length with N >
4.

---------

Signed-off-by: Alberto Cabrera <alberto.cabrera@codeplay.com>
…fpga_cluster kernel property (intel#12453)

Implement task_sequence header, task_sequence properties, and
fpga_cluster kernel property according to the spec updates
intel#6348 (almost ready to be merged)
Header
- Refactor invocation and response capacity to be on the Create
intrinsic, so Async and Get do not take in respective argument
- Remove max_outstanding argument of Create intrinsic
- Add pipelined and fpga_cluster arguments to Create intrinsic
- Async, Get, and Release intrinsic now accept the TargetExtType
target("spirv.TaskSequenceINTEL") returned from the Create intrinsic,
and do not take in object pointer or task function pointer arguments
- Task Sequence accepts properties

Task Sequence Properties
- Convert invocation and response capacity from template parameters to
properties
- Add balanced property to remove Get intrinsic loop in destructor

FPGA Kernel properties
- Add fpga_cluster property
- Support fpga_cluster and pipelined property for task sequence
…#12989)

Test UR PR oneapi-src/unified-runtime#1426 that
default initializes CUDA_KERNEL_NODE_PARAMS to prevents errors from
trying to set data members which are not present before version 12.0

---------

Co-authored-by: Kenneth Benzie (Benie) <k.benzie83@gmail.com>
This PR adds multiple binary and unary ops through callable structs:

- abs 
- abs_diff 
- add_sat
- rhadd 
- hadd 
- maximum
- minimum
- sub_sat

It also adds `vectorized_sum_abs_diff`.

To facilitate testing, it also incorporates a set of fixtures/helper
classes to simplify future additions to math.hpp
…ntel#12977)

Adds helpers to calculate the occupancy of the GPU

Testing currently only addresses API.
kbenzie and others added 7 commits March 13, 2024 09:10
Enable more tests. The rest takes too much time. The current set takes
~50min to build and run.
See:
https://github.com/intel/llvm/actions/runs/8247563907/job/22555980586
Depends on intel#12965 merging first.

---------

Signed-off-by: Raiyan Latif <raiyan.latif@intel.com>
Co-authored-by: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
Treat warnings as errors, and fix the last remaining warnings in the
fusion library.

---------

Signed-off-by: Julian Oppermann <julian.oppermann@codeplay.com>
- Fix access modes not being respected and creating unnecessary edges
- Update printing E2E tests since output has changed
- Add unit tests for access modes
@Bensuo
Copy link
Collaborator Author

Bensuo commented Mar 13, 2024

Closed in favour of upstream PR: intel#13011

@Bensuo Bensuo closed this Mar 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Graph Implementation Related to DPC++ implementation and testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.