[SYCL][Graph] Address L0 leaks in command buffer extension (#191) #229

Closed
wants to merge 46 commits into from

mfrancepillois
Collaborator

Release all the events and objects allocated by the command-buffer creation and enqueuing processes.
Add a second run to each test to check that no memory leaks occur.

Fixes Issue #191

Bensuo and others added 30 commits May 2, 2023 18:09
commit 2348227
Author: Ben Tracy <ben.tracy@codeplay.com>
Date:   Wed Apr 19 14:48:17 2023 +0100

    [SYCL] Update graph constructor/finalize to current spec (#140)

    - Add device and context params to graph constructor
    - Remove context from finalize
    - Minor changes to graph_impl to support this
    - Update all examples to use updated API
    - Tidied up ordering of graph_impl declarations a little

commit 7e580c5
Author: Ben Tracy <ben.tracy@codeplay.com>
Date:   Wed Apr 19 13:46:52 2023 +0100

    [SYCL] Fix subgraphs, move sync points to exec graph (#134)

    * [SYCL] Fix subgraphs, move sync points to exec graph

    - Fixes subgraph support for command buffer graphs
    - Move sync points to executable graph instead of node
    - Removed unused graph impl from nodes
    - Kernel dims are now correctly reversed before submission with dims > 1
    - Remove unnecessary call to piEventCreate

commit 2f75c88
Author: Ewan Crawford <ewan@codeplay.com>
Date:   Thu Apr 13 12:48:40 2023 +0100

    [SYCL] Replace lazy queue property with PI command-buffers. (#100)

    - Remove lazy queue property
    - Use command buffers inside graphs for execution
    - Separate executable graph impl from modifiable graph impl
    - Implement handler::depends_on for record and replay nodes
    - New test for finalizing different graphs from the same modifiable one
    - graph-record-dotp now uses handler::depends_on
    - Implement arg filtering before setting args
    - Make applyFuncOnFilteredArgs accessible from commands.hpp
    - Track dependencies through empty nodes in graphs
    - Guard reduction use in device mem example
    - Fix issues with empty node example
    - Guard command buffer behind SYCL_EXT_ONEAPI_GRAPH
    - Recreate simple submission in emulation mode

    ---------

    Co-authored-by: Ben Tracy <ben.tracy@codeplay.com>

commit 33d64f9
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Fri Mar 31 12:46:04 2023 -0500

    [SYCL] Add empty node implementation (#112)

    Co-authored-by: Ben Tracy <ben.tracy@codeplay.com>

commit 187c9d0
Merge: ec71841 7d4e315
Author: Julian Miller <julian.miller@intel.com>
Date:   Thu Mar 30 18:21:03 2023 +0200

    Merge pull request #115 from reble/julianmi/graph-testing-waits

    Graph Testing: Add missing waits and USM device tests

commit ec71841
Merge: 1efde99 9b95a70
Author: Julian Miller <julian.miller@intel.com>
Date:   Thu Mar 30 18:20:44 2023 +0200

    Merge pull request #71 from reble/julianmi/graph-emulation-macro

    Guard SYCL Graph implementation and fallback emulation

commit 7d4e315
Author: Julian Miller <julian.miller@intel.com>
Date:   Wed Mar 29 12:05:55 2023 +0200

    Add USM device graph test

commit 6b89b23
Author: Julian Miller <julian.miller@intel.com>
Date:   Wed Mar 29 12:04:50 2023 +0200

    Add missing waits in graph tests

commit 9b95a70
Author: Julian Miller <julian.miller@intel.com>
Date:   Tue Mar 28 19:15:55 2023 +0200

    Remove unneeded includes

commit e285e0a
Author: Julian Miller <julian.miller@intel.com>
Date:   Wed Mar 22 17:43:00 2023 +0100

    Add compiler configuration instructions for SYCL Graph

commit 5f31bfa
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Wed Mar 1 14:59:14 2023 -0600

    Update README.md

commit 7370c0b
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Wed Mar 1 08:48:16 2023 -0600

    Update README.md

    add first draft of landing page

commit e5f4da8
Author: Julian Miller <julian.miller@intel.com>
Date:   Tue Mar 21 17:27:01 2023 +0100

    Remove guarded members

commit 26b24a9
Author: Julian Miller <julian.miller@intel.com>
Date:   Mon Mar 13 17:25:43 2023 +0100

    Add feature test macro

commit 152ccea
Author: Julian Miller <julian.miller@intel.com>
Date:   Fri Jan 20 18:06:55 2023 +0100

    Guard SYCL Graph implementation and fallback emulation

commit 1efde99
Author: Ben Tracy <ben.tracy@codeplay.com>
Date:   Thu Mar 23 12:26:50 2023 +0000

    [SYCL] Remove CGF reuse in graph nodes

    - Note reductions are broken by this commit due to missing accessor support
    - Handler info is extracted and copied into nodes
    - Adding nodes in record and replay moved to finalize.
    - Workarounds for reduction wg sizes added.
    - Introduce `graph-record-temp-scope.cpp` test case which fails before this commit and passes afterwards.

    Instead of USM arguments, it is buffer accessors that should be used for
    edge detection. Fixes `graph-explicit-node-ordering.cpp` test ordering which is currently
    creating incorrect extra edges

    Also added `graph-explicit-dotp-buffer.cpp` test for explicit API with accessor edges, we can use to see if this
    logic works once accessors are better supported.

    This change adds a new handler constructor which takes
    a graph, rather than creating a default temporary queue object
    to pass to the existing constructor.

    Co-authored-by: Ewan Crawford <ewan@codeplay.com>

commit b7f17c8
Author: Ewan Crawford <ewan@codeplay.com>
Date:   Tue Mar 21 08:15:57 2023 +0000

    [SYCL] Update record & replay tests

    Update the record & replay tests to match changes from
    #72 which were missed after
    merging the record and replay branch:

    * Remove unused headers
    * Uses asserts instead of printing to std out

commit d2ff468
Author: Julian Miller <julian.miller@intel.com>
Date:   Thu Mar 16 10:08:27 2023 +0100

    [SYCL] Improve Graphs testing

    * Extend testing

    * Fix reduction test

    * Add test to verify node ordering

    * Update sycl include

    * Switch to assertions in graph tests

    * Formatting

commit 068dd95
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Mon Mar 13 11:14:29 2023 -0500

    Resolving naming style mismatch (#86)

commit 66d1b6b
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Thu Mar 2 23:54:48 2023 -0600

    Improve code location and replace shared ptr aliases (#82)

commit 62d6b15
Author: Ben Tracy <ben.tracy@codeplay.com>
Date:   Tue Feb 28 10:53:46 2023 +0000

    [SYCL][PI] Prototype command_buffer API in level zero

    - Adds a prototype of an explicit command buffer
    - Implemented only for level zero backend
    - Unit tests added which test new entry points.

commit d4c1ed3
Author: Ewan Crawford <ewan@codeplay.com>
Date:   Mon Feb 27 08:48:23 2023 +0000

    [SYCL] Record & Replay Implementation

    Implementation of Record & Replay API with tests

    Co-authored-by: Ben Tracy <ben.tracy@codeplay.com>

commit 06c588f
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Thu Feb 9 10:53:47 2023 -0600

    Apply suggestions from code review

    Co-authored-by: Steffen Larsen <steffen.larsen@intel.com>

commit 0ac7a7e
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Thu Jan 19 10:29:46 2023 -0600

    Adding new example using make edge function (#63)

    Co-authored-by: Ben Tracy <ben.tracy@codeplay.com>

commit 1249fbc
Author: Ewan Crawford <ewan.cr@gmail.com>
Date:   Thu Jan 19 10:03:56 2023 +0000

    [SYCL] Pass property_list to APIs

    Adds the `sycl::property_list` to the constructor of
    `command_graph<modifiable>()` and `finalize()` to
    match spec change #67

commit 4a306ed
Author: Ben Tracy <ben.tracy@codeplay.com>
Date:   Wed Jan 11 10:53:16 2023 +0000

    [SYCL] Add unit tests for command graph POC

    - Add some unit tests for the command graph POC
    - Add missing specializations for lazy queue property

commit fb28d59
Author: Ben Tracy <ben.tracy@codeplay.com>
Date:   Mon Jan 9 11:10:26 2023 +0000

    [SYCL] Rename exec_graph to ext_oneapi_graph

    [SYCL] handler::ext_oneapi_graph

    Update to reflect changes from #65

    - In line with recent spec changes, rename handler and queue shortcut functions from exec_graph to ext_oneapi_graph
    - Also updated usage in the examples

    Co-authored-by: Ewan Crawford <ewan@codeplay.com>

commit 1448cb5
Author: Ben Tracy <ben.tracy@codeplay.com>
Date:   Wed Dec 21 09:10:40 2022 +0000

    [SYCL] Enable submitting sub-graphs

    - Enable submitting a sub-graph as part of a larger command_graph
    - Flag added to queue_impl to enable graph to be aware it is a sub-graph and delay flush
    - Adds an example which uses a subgraph in the middle of a command_graph

commit c99bdca
Author: Ben Tracy <ben.tracy@codeplay.com>
Date:   Tue Dec 13 10:57:15 2022 +0000

    [SYCL] Fix reductions not working inside graph

    * Graph submission now properly creates a host visible event on the command list allowing auxiliary resources to be cleaned up

    * executeCommandList slightly modified to block execution only for command lists not allowed to be batched.

commit 3073cfc
Author: Ewan Crawford <ewan@codeplay.com>
Date:   Fri Dec 2 10:47:32 2022 +0000

    [SYCL] Clean-up lazy queue PI changes

    * PI Minor version bump for new flag
    * Document new PI property as comments
    * Make value next consecutive bit `1 << 5`, rather
      than `1 << 11`.

commit 7bb11ce
Author: Ewan Crawford <ewan@codeplay.com>
Date:   Wed Nov 30 13:14:50 2022 +0000

    [SYCL] Use handler to execute graph

    Update API to match the spec change from #26
    to execute a graph via the handler rather than queue submit.

    This spec update includes queue shortcut functions, which I've added
    a new test for.

commit 578692f
Author: Ewan Crawford <ewan@codeplay.com>
Date:   Thu Nov 24 09:26:27 2022 +0000

    [SYCL] PIMPL refactor

    Refactor the command_graph and node classes so that
    we interface with the impl types rather than
    user exposed types, and just the interface lives in the
    public facing headers.

    This change also means we can use a `.cpp` file for implementation
    code rather than being header only.

    The motivation for these changes was trying to get graph submission
    through a handler, at which point only the `sycl::detail::queue_impl` class
    is available rather than `sycl::queue`

commit 9f127d7
Author: Ewan Crawford <ewan@codeplay.com>
Date:   Fri Nov 18 16:27:54 2022 +0000

    [SYCL] Repro for reduction fail

    * Add RUN lines to tests so that tests are run by LIT
    * clang-format existing tests, and other minor cleanups
    * Add `graph-explicit-reduction.cpp` which shows the failure from #24 by using the `sycl::ext::oneapi::property::queue::lazy_execution` property on a queue which uses a reduction outside the graph building API

commit 2cf9d0f
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Tue Nov 29 21:26:28 2022 -0600

    Cosmetic changes

commit df971e5
Author: Ben Tracy <benatracy@gmail.com>
Date:   Thu Nov 24 08:46:12 2022 +0000

    [SYCL] Minor graph classes refactor (#36)

    - getSyclObjImpl and createSyclObjFromImpl support added
    - Minor renaming to enable this.
    - Adds basic results validation to dotp test
    - Minor fixes to address warnings etc.

commit f71ea49
Author: Ewan Crawford <ewan.cr@gmail.com>
Date:   Mon Nov 21 12:25:44 2022 +0000

    Common changes from record & replay API (#32)

    Changes to common code from #6
    which has already been reviewed and merged into the
    `sycl-graph-record-replay` branch.

    This patch should not contain anything specific to the record and
    replay API.

commit 383459c
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Tue Nov 1 13:35:42 2022 -0500

    Renaming variables

commit 4478390
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Tue Nov 1 12:45:31 2022 -0500

    clang-format

commit fa58aa3
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Wed Oct 19 20:16:21 2022 -0700

    renaming macro and bugfix

commit 38da3c6
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Tue Oct 18 07:49:47 2022 -0700

    add basic tests

commit 7581915
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Tue Oct 18 07:40:15 2022 -0700

    bugfix

commit fa7494d
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Tue Oct 18 07:39:19 2022 -0700

    starting to rework lazy execution logic

commit 446ac53
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Tue Oct 18 07:37:41 2022 -0700

    revert changes to level-zero plugin

commit 8850b18
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Wed Oct 12 11:33:57 2022 -0700

    fix rebase issue

commit a3164de
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Wed Oct 12 08:03:55 2022 -0700

    update API to recent proposal

commit 7917086
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Tue May 10 11:25:51 2022 -0500

    fix formatting

commit 7d81618
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Fri May 6 11:54:58 2022 -0500

    fix issue introd. by recent merge

commit 9b46c4b
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Fri May 6 10:30:29 2022 -0500

    fix formatting issues

commit 50d49a1
Author: Julian Miller <julian.miller@intel.com>
Date:   Tue May 3 11:29:34 2022 -0500

    Propagate lazy queue property

commit 0d8a5f4
Author: Pablo Reble <pablo@reble.org>
Date:   Mon Mar 14 14:08:02 2022 +0100

    Apply suggestions from code review

    Co-authored-by: Ronan Keryell <ronan@keryell.fr>

commit f957996
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Mon May 2 21:06:42 2022 -0500

    fix typos and syntax issues

commit 047839b
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Fri Mar 11 20:47:16 2022 +0100

    typo

commit 2b50af4
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Fri Mar 11 16:42:43 2022 +0100

    update extension proposal started to incorporate feedback

commit a8b5b32
Author: Pablo Reble <pablo@reble.org>
Date:   Tue Feb 22 10:46:54 2022 -0600

    Update pi_level_zero.cpp

    Fix merge conflict

commit 0bad787
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Mon Feb 21 22:25:38 2022 -0600

    fix merge

commit 656f5c3
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Tue Feb 15 17:18:32 2022 -0600

    Adding lazy execution property to queue

commit d286c71
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Fri Feb 18 15:15:10 2022 -0600

    Adding initial sycl graph doc

commit 1acf57e
Author: Pablo Reble <pablo.reble@intel.com>
Date:   Fri Feb 18 15:16:27 2022 -0600

    Initial version of sycl graph prototype
* bringing all tests to new API for dependencies

* add trivial test and note on imp. status

---------

Co-authored-by: Ben Tracy <ben.tracy@codeplay.com>
* Consistently declare `class graph_impl` due to issue
```
In file included from /home/ewan/Development/dpc++/build_release/bin/../include/sycl/queue.hpp:21:
/home/ewan/Development/dpc++/build_release/bin/../include/sycl/handler.hpp:84:1: error: class 'graph_impl' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Werror,-Wmismatched-tags]
class graph_impl;
^
/home/ewan/Development/dpc++/build_release/bin/../include/sycl/ext/oneapi/experimental/graph.hpp:29:8: note: previous use is here
struct graph_impl;
```

* Use updated constructor in Mock testing
* Declare `mock_*` functions which lead to compilation failures when
  missing
* Forward declare `sycl::device` in graph.hpp header, otherwise there
  are issues if we include it before <sycl/sycl.hpp>
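
The `-Wmismatched-tags` error quoted above arises when one header forward declares a type with `class` and another with `struct`. A minimal sketch of the consistent form follows; `graph_impl_sketch` and `make_graph_sketch` are hypothetical names used only for illustration and are not the runtime's real types.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the implementation type. Every declaration uses
// the same class-key (`class`), so -Wmismatched-tags has nothing to report.
class graph_impl_sketch;

// A second forward declaration, as might appear in another header. Writing
// `struct graph_impl_sketch;` here is still valid C++, but it is the pattern
// that produced the diagnostic quoted above.
class graph_impl_sketch;

class graph_impl_sketch {
public:
  explicit graph_impl_sketch(int NodeCount) : MNodeCount(NodeCount) {}
  int node_count() const { return MNodeCount; }

private:
  int MNodeCount;
};

// The type only has to be complete at the point where it is constructed.
inline std::shared_ptr<graph_impl_sketch> make_graph_sketch(int NodeCount) {
  return std::make_shared<graph_impl_sketch>(NodeCount);
}
```

Calling `make_graph_sketch(3)->node_count()` yields `3`; the point of the sketch is only that both forward declarations agree on the class-key.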
Option to enable graphs starts with a double `-` rather than a single one.
Using a `handler::fill()` operation in nodes
works without any further modifications because
the DPC++ runtime implements it as a `parallel_for` which
we already support.

See
[handler.hpp](https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/handler.hpp#L2453-L2467)
```cpp
  /// Fills the specified memory with the specified pattern.
  ///
  /// \param Ptr is the pointer to the memory to fill
  /// \param Pattern is the pattern to fill into the memory.  T should be
  /// trivially copyable.
  /// \param Count is the number of times to fill Pattern into Ptr.
  template <typename T> void fill(void *Ptr, const T &Pattern, size_t Count) {
    throwIfActionIsCreated();
    static_assert(std::is_trivially_copyable<T>::value,
                  "Pattern must be trivially copyable");
    parallel_for<class __usmfill<T>>(range<1>(Count), [=](id<1> Index) {
      T *CastedPtr = static_cast<T *>(Ptr);
      CastedPtr[Index] = Pattern;
    });
  }
```

Note that there is a different fill implementation for buffer accessors,
which this PR doesn't cover, but is tracked as work in
[issue 146](#146)
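
To illustrate why no extra graph support is needed, here is a host-only sketch of the same shape: a fill expressed purely as a one-dimensional parallel loop. `parallel_for_sketch` and `fill_sketch` are hypothetical stand-ins that run sequentially on the host; they are not the DPC++ entry points and say nothing about how the runtime schedules work on a device.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical host-side stand-in for a 1-D parallel_for: invokes Kernel once
// per index. A real device launch gives no ordering guarantee between indices.
template <typename KernelT>
void parallel_for_sketch(std::size_t Count, KernelT Kernel) {
  for (std::size_t Index = 0; Index < Count; ++Index)
    Kernel(Index);
}

// Mirrors the structure of the USM fill quoted above: the whole operation is
// one kernel that writes Pattern to each element.
template <typename T>
void fill_sketch(void *Ptr, const T &Pattern, std::size_t Count) {
  static_assert(std::is_trivially_copyable<T>::value,
                "Pattern must be trivially copyable");
  parallel_for_sketch(Count, [=](std::size_t Index) {
    T *CastedPtr = static_cast<T *>(Ptr);
    CastedPtr[Index] = Pattern;
  });
}
```

Because the fill reduces to a single kernel launch, anything that can already capture a kernel as a node can capture the fill in the same way.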

Closes #149

Co-authored-by: Pablo Reble <pablo.reble@intel.com>
We can go directly from the CMake option to `feature_test.hpp.in`.
Kernel-fusion also does this.
* [SYCL] Graphs accessor support

- Enqueue to command buffer through scheduler when required
- New command type for enqueueing cmd buffer
- Sync point stored in event impl
- Removed capturing requirements from handler
- Add RT:: aliases for command buffer types
- Update usage in graph implementation
- Add test mixing buffers and USM
- Refactor command creation and enqueue from handler
- Emulation mode uses refactored function to enqueue to scheduler correctly
- execCGCommand and Command class are now command buffer aware

---------

Co-authored-by: Ewan Crawford <ewan@codeplay.com>
As well as the doxygen comment, there are some additional changes I found along the way:
* Pass queue by reference to `begin_recording()` & `end_recording()`
* Pass node by reference to `make_edge()`
* Add stubbed out `update()` entry-point
* Remove unused `MParent` variable

Co-authored-by: Julian Miller <julian.miller@intel.com>
* Adding deduction guide for command_graph

* fix mismatch struct vs. class impl declarations.

---------

Co-authored-by: Ewan Crawford <ewan@codeplay.com>
* using properties for passing dependencies
Moved from https://github.com/Bensuo/llvm-test-suite/tree/intel/SYCL/CommandGraph with updates:

* Create explicit API equivalents of record & replay tests
* Use asserts rather than return an error code, which is more in-keeping with the other e2e tests.
* Prefer USM to buffers when it's not the focus of the test
* Avoid buffer write-back behaviour
* Disable tests with non-predictable results, which can't be marked XFAIL as they sometimes pass.
* Set ordering dependency on graph submissions

To enable the tests I built with `cmake -DSYCL_TEST_E2E_TARGETS="level_zero:gpu;opencl:gpu"`
and ran the tests with `./bin/llvm-lit --param sycl_be=level_zero --param target_devices=gpu -sv
sycl/test-e2e/Graph/`
* Update handler layout with new alignment in cpp files

* Update available symbol references in dump files
An event returned from a queue submission captured by Record & Replay should be able to create an edge to a node created by the Explicit API.

This edge is defined by passing the event to `handler::depends_on` inside the command-group added explicitly to the graph.

Closes #89
We're treating these tests as runtime tests, so move them to test-e2e with the following changes:

* Prefer device USM to shared USM. Device USM support is mandatory while shared is optional

* Change `TestQueue` variable name to `Queue`

* Comment new tests and remove buffer copy back behaviour.

* Fix test name mismatch between recording & explicit.

* Rename `Explicit/whole_graph_update_ordering.cpp` -> `Explicit/executable_graph_update_ordering.cpp`

* Introduce a record & replay saxpy test.

* Add more host, shared, and system USM tests.

* Move dotp reductions to their own tests
Implements query for queue state from #162

Adds symbol for queue query entry-point and update test to use test-e2e.

Closes #93
- Node class now wraps a command group object
- Some changes to handler::finalize to support this
- Remove refactored handler CG creation
- Minor changes to CG classes to support copying
- Simplify enqueueImpCommandBufferKernel parameters
- Move graph execution to handler::finalize
- Add asserts in getCGCopy for host tasks
- Empty nodes have no args so prevent calling has_arg() for them.
- Fixes failure in RecordReplay/sub_graph_reduction
Rather than the user deciding whether to use the emulation mode
or backend command-buffer mode, it is less error prone to do it
programmatically as a user can't select an unsupported config.

This also fixes #90 where currently
the `SYCL_EXT_ONEAPI_GRAPH` macro isn't defined when in emulation mode.

Done by using a PI device info query in finalization. This query
also corresponds to a new UR device info query, which is a stopgap until
the extension mechanism is decided in oneapi-src/unified-runtime#458
and will be superseded by whatever extension reporting mechanism is
chosen.

The only PI backend which reports support for command-buffer
implementation is Level Zero, the other backends/adapters report no
support.
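
A rough sketch of that selection logic is shown below, assuming a simplified model in which each backend answers a single yes/no support query. All names here (`backend_sketch`, `exec_mode_sketch`, `reports_command_buffer_support`, `select_exec_mode`) are hypothetical and do not correspond to real PI or UR entry points.

```cpp
#include <cassert>

// Hypothetical, simplified model of the backends mentioned in this PR.
enum class backend_sketch { level_zero, opencl, cuda };

// The two execution paths described above.
enum class exec_mode_sketch { command_buffer, emulation };

// Stand-in for the device info query: per the description above, only
// Level Zero reports command-buffer support.
inline bool reports_command_buffer_support(backend_sketch Backend) {
  return Backend == backend_sketch::level_zero;
}

// The mode is chosen from the query result at finalization time, so the user
// has no way to request an unsupported configuration.
inline exec_mode_sketch select_exec_mode(backend_sketch Backend) {
  return reports_command_buffer_support(Backend)
             ? exec_mode_sketch::command_buffer
             : exec_mode_sketch::emulation;
}
```

Under this model a Level Zero device takes the command-buffer path and every other backend falls back to emulation.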

The vendor test macro test-e2e test is re-enabled with this change.
- Update usage of plugin to use PluginPtr
- Update e2e tests to use ${build} and ${run}
Implement the device info `graph_support` query defined
by spec PR #178

This only reports that graphs are supported on Level Zero
devices.
…it tests

The check-sycl target currently fails to compile our PI unit tests for Level Zero that test PI command-buffers. This shows up as a failure on Windows CI, but is not a Windows-specific issue, and can be recreated on Linux.

The initial set of build failures is due to our test being unable to find required headers. This is because Unified Runtime and Level Zero headers are transitively included without pointing to the location of the headers. I managed to work around this by modifying the CMakeLists.txt file of our test to add these include directories (see the first comment of the branch).

After working around this header include issue, I encountered other issues from the UR code that maps Level Zero to UR (sycl/plugins/unified_runtime/ur/adapters/level_zero/ur_level_zero.hpp). Rather than start modifying this code, which we don't own and which will soon disappear, I'm proposing that we remove these tests for now since PI is being removed (second commit of this branch). We can then re-add the unit tests once there is a UR unit test suite, as these tests are still in our git history to revive and port.
We'll eventually need to add a table of current implementation status to our
extension specification.

Create a table in our development branch readme first,
allowing us to update it when new features are added, and making it
easy to port over to the specification when the time arises.

Also includes a table of backend support.

Co-authored-by: Pablo Reble <pablo.reble@intel.com>
- Add support for USM memcpy and copy in graphs
- New MemoryManager method for usm copies
- New PI extension methods for enqueueing USM copies
- Re-enabled and fixed issues in E2E tests for copies
- Minor changes to queue_impl to capture copies correctly
[SYCL][Graph] Add in-order queue test
This stub is necessary to compile the unittests for the
`check-sycl` target, fixes build error:
```
error C2065: 'mock_piextCommandBufferMemcpyUSM': undeclared identifier
```
Update test for adding new symbols to dll using python script, and
symbol alignment.
- Formatting fixes
- ABI symbol check updates
- Small compilation issues fixed
EwanC and others added 16 commits May 31, 2023 08:04
Remove flag from getting started guide as well

These were accidentally omitted from #161
Fix for `basic_tests/stdcpp_compat.cpp` in the `check-sycl`
target on Windows, which fails because building with different
command-line flags triggers a compilation error.

This issue was arising from the `depends_on` class using the
`node` class in the `MDeps` vector, when `node` was only declared
and not defined. This led to issues when the implicit destructor
for `depends_on` was created before the `node` class was defined,
triggering the error picked up by the test.

Fixed by moving the definition of the `node` class before `depends_on`
so that we don't need the declaration.

Also removes unused graph member from node class
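
The ordering constraint can be sketched as follows, assuming a simplified pair of classes. `node_sketch` and `depends_on_sketch` are hypothetical illustrations, not the real `node` and `depends_on` definitions from the graph header.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// The node type is fully defined before any class stores it by value in a
// container, so no forward declaration is needed.
class node_sketch {
public:
  explicit node_sketch(int Id) : MId(Id) {}
  int id() const { return MId; }

private:
  int MId;
};

// The implicitly generated destructor of this class destroys a
// std::vector<node_sketch>, which requires node_sketch to be a complete type
// wherever that destructor is instantiated.
class depends_on_sketch {
public:
  explicit depends_on_sketch(std::initializer_list<node_sketch> Nodes)
      : MDeps(Nodes) {}

  const std::vector<node_sketch> &get_dependencies() const { return MDeps; }

private:
  std::vector<node_sketch> MDeps;
};
```

If `node_sketch` were only forward declared above `depends_on_sketch`, any translation unit that destroyed a `depends_on_sketch` before seeing the full definition could fail to compile, which is the kind of error described above.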
The current approach uses the same L0 PI event for all submissions
of the same graph, and also doesn't use the wait events to
enforce dependencies on the command-list submission. By doing these
in the L0 adapter, we can remove the blocking queue wait from
our graphs submission code in the runtime.

Closes issue #139
- Rework arg checking for nodes to use requirements instead
- New PI functions for cmd buffer mem copies
- New Memory::Manager methods for buffer copies
- Update ExecCGCommand to handle buffer copies
- New e2e tests for buffer copies
When a user adds a subgraph we currently link all the nodes
in a subgraph to the parent graph. This causes issues in the
following cases:

* User submits sub-graph object to execute standalone, as its
  exit nodes now have successors in a parent graph
* Sub-graph object is added to another graph as a node (or the same
  graph again), as its original root nodes now have predecessors.

Documented these limitations in the README and created ticket
#201 to resolve this.

Actions #142 as this also
has a test for nested sub-graphs, which do work.
Co-authored-by: Ewan Crawford <ewan@codeplay.com>
Co-authored-by: Ben Tracy <ben.tracy@codeplay.com>
PR #177 updated the public
handler class to add new members, without updating the
`layout_handler.cpp` ABI test.

This patch updates that test to reflect this change.
Fixes a scheduler unit test failure caught by CI. The test assumes that
the second element in the binary blob is a pointer to a
`std::unique_ptr`, but it is actually just a raw pointer
after our changes
[here](https://github.com/reble/llvm/blob/sycl-graph-develop/sycl/source/detail/scheduler/commands.cpp#L2655)

I've also commented on our upstream PR about another place
that could be affected, although I haven't seen it regress in a test yet:
https://github.com/intel/llvm/pull/9728/files#r1222909672
* [SYCL][Graph] Fix empty node issue

Add empty node to graph for record & replay

Fixes issue #208

* [SYCL][Graph] Fix empty node issue

Addresses reviewer comments on PR: #213

* Update sycl/test-e2e/Graph/RecordReplay/empty.cpp

Co-authored-by: Ben Tracy <ben.tracy@codeplay.com>

---------
Implements using Record & Replay graph construction
to record an in-order queue and keep linear dependencies
between nodes.

Done by storing in the graph the last node added to the
graph from each in-order queue, and using that to create
a dependency edge on any new nodes added from that in-order
queue.
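
The bookkeeping described above can be sketched roughly as follows, assuming queues are identified by an integer and nodes by their index in the graph. `recorded_node`, `graph_sketch`, and `record_in_order` are hypothetical names for illustration and are not the `graph_impl` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical node: only stores the indices of the nodes it depends on.
struct recorded_node {
  std::vector<std::size_t> Predecessors;
};

class graph_sketch {
public:
  // Records one command submitted to the in-order queue QueueId and returns
  // the index of the new node. If that queue already recorded a node, the new
  // node gets an edge from it, which keeps the queue's linear ordering.
  std::size_t record_in_order(int QueueId) {
    recorded_node NewNode;
    auto Last = MLastNodePerQueue.find(QueueId);
    if (Last != MLastNodePerQueue.end())
      NewNode.Predecessors.push_back(Last->second);

    MNodes.push_back(NewNode);
    const std::size_t Index = MNodes.size() - 1;
    MLastNodePerQueue[QueueId] = Index; // remember the tail of this queue
    return Index;
  }

  const std::vector<std::size_t> &predecessors(std::size_t Node) const {
    return MNodes.at(Node).Predecessors;
  }

private:
  std::vector<recorded_node> MNodes;
  std::map<int, std::size_t> MLastNodePerQueue;
};
```

In this model, two commands recorded from the same queue end up chained by one edge, while commands from different in-order queues stay independent unless some other dependency links them.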

Actions #188
* [SYCL][Graphs] Implement UR command-buffers

- Move implementation of PI command buffers to UR
- Command-buffer support is now done through the PI/UR extension query
- Only level zero backend reports extension
- Fix handler not checking whether we're in explicit mode when processing empty CG
- Using a temporary UR commit because upstream is out of date with recent changes

5 participants