From 71895868b34f40e54fcd77c81f78c08016a2ffed Mon Sep 17 00:00:00 2001 From: Martin Morrison-Grant Date: Mon, 23 Oct 2023 16:25:07 +0100 Subject: [PATCH 1/6] [SYCL][OpenCL] Update E2E Graph tests to run using OpenCL, and new helper function to return early if Graphs are not supported by the device. Added OpenCL section to CommandGraph docs. Co-authored-by: Pablo Reble --- sycl/doc/design/CommandGraph.md | 61 ++++++++++++++++++- sycl/plugins/unified_runtime/CMakeLists.txt | 14 ++--- sycl/plugins/unified_runtime/pi2ur.hpp | 5 +- sycl/test-e2e/Graph/Explicit/cycle_error.cpp | 8 +++ .../executable_graph_update_ordering.cpp | 2 +- .../test-e2e/Graph/Inputs/buffer_ordering.cpp | 1 - .../RecordReplay/dotp_multiple_queues.cpp | 1 - .../executable_graph_update_ordering.cpp | 2 +- sycl/test-e2e/Graph/device_query.cpp | 4 +- 9 files changed, 83 insertions(+), 15 deletions(-) diff --git a/sycl/doc/design/CommandGraph.md b/sycl/doc/design/CommandGraph.md index 9b57dd51c0a9a..755af9939a33e 100644 --- a/sycl/doc/design/CommandGraph.md +++ b/sycl/doc/design/CommandGraph.md @@ -149,7 +149,7 @@ yet been implemented. Implementation of UR command-buffers for each of the supported SYCL 2020 backends. -Currently Level Zero and CUDA backends are implemented. +Backends which are implemented currently are: Level Zero, CUDA and OpenCL. More sub-sections will be added here as other backends are supported. ### Level Zero @@ -246,3 +246,62 @@ the executable CUDA Graph that represent this series of operations. An executable CUDA Graph, which contains all commands and synchronization information, is saved in the UR command-buffer to allow for efficient graph resubmission. + +### OpenCL + +Command Buffers are defined in the OpenCL spec in the [cl_khr_command_buffer](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_command_buffer) extension. + +There are some gaps in both the OpenCL and UR specifications for Command +Buffers shown in the list below. There are implementations in the UR OpenCL +adapter where there is matching support for each function in the list. + +| UR | OpenCL | Supported | +| --- | --- | --- | +| urCommandBufferCreateExp | clCreateCommandBufferKHR | Yes | +| urCommandBufferRetainExp | clRetainCommandBufferKHR | Yes | +| urCommandBufferReleaseExp | clReleaseCommandBufferKHR | Yes | +| urCommandBufferFinalizeExp | clFinalizeCommandBufferKHR | Yes | +| urCommandBufferAppendKernelLaunchExp | clCommandNDRangeKernelKHR | Yes | +| urCommandBufferAppendUSMMemcpyExp | | No | +| urCommandBufferAppendUSMFillExp | | No | +| urCommandBufferAppendMembufferCopyExp | clCommandCopyBufferKHR | Yes | +| urCommandBufferAppendMemBufferWriteExp | | No | +| urCommandBufferAppendMemBufferReadExp | | No | +| urCommandBufferAppendMembufferCopyRectExp | clCommandCopyBufferRectKHR | Yes | +| urCommandBufferAppendMemBufferWriteRectExp | | No | +| urCommandBufferAppendMemBufferReadRectExp | | No | +| urCommandBufferAppendMemBufferFillExp | clCommandFillBufferKHR | Yes | +| urCommandBufferEnqueueExp | clEnqueueCommandBufferKHR | Yes | +| | clCommandBarrierWithWaitListKHR | No | +| | clCommandCopyImageKHR | No | +| | clCommandCopyImageToBufferKHR | No | +| | clCommandFillImageKHR | No | +| | clGetCommandBufferInfoKHR | No | +| | clCommandSVMMemcpyKHR | No | +| | clCommandSVMMemFillKHR | No | + +Many of the OpenCL functions take a `cl_command_queue` parameter which is not +present in most of the UR functions. Instead, when a new command buffer is +created in `urCommandBufferCreateExp` we also create and maintain a new +internal `ur_queue_handle_t` with a reference stored inside of the +`ur_exp_command_buffer_handle_t_` struct. This internal queue is then used with +the various append functions. The internal queue is retained and released +whenever the owning command buffer is retained or released. + +With command buffers being an OpenCL extension, each function is accessed by +loading a function pointer to its implementation. These are defined in a common +header file in the UR OpenCL adapter. The symbols for the functions are however +defined in [OpenCL-Headers](https://github.com/KhronosGroup/OpenCL-Headers/blob/main/CL/cl_ext.h) but it is not known at this time what version of the headers will be used in the UR GitHub CI configuration, so loading the function +pointers will be used until this can be verified. A future piece of work would +be replacing the custom defined symbols with the ones from OpenCL-Headers. + +The `UR_DEVICE_INFO_EXTENSIONS` enum can be used with `urDeviceGetInfo` to +query if a specified device supports OpenCL command buffers. This will append +`ur_exp_command_buffer` to a string pointer passed to the function if the +extension is supported. + +Known implementations of cl_khr_command_buffer: +- [OneAPI-Construction-Kit](https://github.com/codeplaysoftware/oneapi-construction-kit) (must enable `OCL_EXTENSION_cl_khr_command_buffer` when building) +- [PoCL](http://portablecl.org/) +- [Command-Buffer Emulation Layer](https://github.com/bashbaug/SimpleOpenCLSamples/tree/main/layers/10_cmdbufemu) + diff --git a/sycl/plugins/unified_runtime/CMakeLists.txt b/sycl/plugins/unified_runtime/CMakeLists.txt index 4f9a37c526d63..f58bababc20a2 100644 --- a/sycl/plugins/unified_runtime/CMakeLists.txt +++ b/sycl/plugins/unified_runtime/CMakeLists.txt @@ -56,14 +56,14 @@ endif() if(SYCL_PI_UR_USE_FETCH_CONTENT) include(FetchContent) - set(UNIFIED_RUNTIME_REPO "https://github.com/oneapi-src/unified-runtime.git") - # commit 31b654f981f6098936e7f04c65803395a2ea343a - # Merge: 71957e84 3da21336 + set(UNIFIED_RUNTIME_REPO "https://github.com/martygrant/unified-runtime.git") + # commit ec7982bac6cb3a6b9ed610cd6b7cb41fcbc780dc + # Merge: 62e6d2f9 5fb82924 # Author: Kenneth Benzie (Benie) - # Date: Wed Nov 22 11:27:33 2023 +0000 - # Merge pull request #1053 from jandres742/url0leakkey - # [UR][L0] Add UR_L0_LEAKS_DEBUG key - set(UNIFIED_RUNTIME_TAG 31b654f981f6098936e7f04c65803395a2ea343a) + # Date: Wed Nov 8 13:32:46 2023 +0000 + # Merge pull request #1022 from 0x12CC/l0_usm_error_checking_2 + # [UR][L0] Propagate OOM errors from `USMAllocationMakeResident` + set(UNIFIED_RUNTIME_TAG martin/openclCommandBuffers) if(SYCL_PI_UR_OVERRIDE_FETCH_CONTENT_REPO) set(UNIFIED_RUNTIME_REPO "${SYCL_PI_UR_OVERRIDE_FETCH_CONTENT_REPO}") diff --git a/sycl/plugins/unified_runtime/pi2ur.hpp b/sycl/plugins/unified_runtime/pi2ur.hpp index 2ae10d0ab419e..e12184677a00b 100644 --- a/sycl/plugins/unified_runtime/pi2ur.hpp +++ b/sycl/plugins/unified_runtime/pi2ur.hpp @@ -127,7 +127,6 @@ static pi_result ur2piResult(ur_result_t urResult) { return PI_ERROR_INVALID_WORK_DIMENSION; case UR_RESULT_ERROR_INVALID_GLOBAL_WIDTH_DIMENSION: return PI_ERROR_INVALID_VALUE; - case UR_RESULT_ERROR_PROGRAM_UNLINKED: return PI_ERROR_INVALID_PROGRAM_EXECUTABLE; case UR_RESULT_ERROR_OVERLAPPING_REGIONS: @@ -140,6 +139,10 @@ static pi_result ur2piResult(ur_result_t urResult) { return PI_ERROR_OUT_OF_RESOURCES; case UR_RESULT_ERROR_ADAPTER_SPECIFIC: return PI_ERROR_PLUGIN_SPECIFIC_ERROR; + case UR_RESULT_ERROR_INVALID_COMMAND_BUFFER_EXP: + return PI_ERROR_INVALID_COMMAND_BUFFER_KHR; + case UR_RESULT_ERROR_INVALID_COMMAND_BUFFER_SYNC_POINT_WAIT_LIST_EXP: + return PI_ERROR_INVALID_SYNC_POINT_WAIT_LIST_KHR; case UR_RESULT_ERROR_UNKNOWN: default: return PI_ERROR_UNKNOWN; diff --git a/sycl/test-e2e/Graph/Explicit/cycle_error.cpp b/sycl/test-e2e/Graph/Explicit/cycle_error.cpp index 762cf4fdcec63..19117bcf7a1bd 100644 --- a/sycl/test-e2e/Graph/Explicit/cycle_error.cpp +++ b/sycl/test-e2e/Graph/Explicit/cycle_error.cpp @@ -80,6 +80,14 @@ void CreateGraphWithCyclesTest(bool DisableCycleChecks) { } int main() { + { + queue Queue; + + if (!are_graphs_supported(Queue)) { + return 0; + } + } + // Test with cycle checks CreateGraphWithCyclesTest(false); // Test without cycle checks diff --git a/sycl/test-e2e/Graph/Explicit/executable_graph_update_ordering.cpp b/sycl/test-e2e/Graph/Explicit/executable_graph_update_ordering.cpp index 18d3b2ddf65d5..1fa175718e54d 100644 --- a/sycl/test-e2e/Graph/Explicit/executable_graph_update_ordering.cpp +++ b/sycl/test-e2e/Graph/Explicit/executable_graph_update_ordering.cpp @@ -11,4 +11,4 @@ #define GRAPH_E2E_EXPLICIT -#include "../Inputs/executable_graph_update_ordering" +#include "../Inputs/executable_graph_update_ordering.cpp" diff --git a/sycl/test-e2e/Graph/Inputs/buffer_ordering.cpp b/sycl/test-e2e/Graph/Inputs/buffer_ordering.cpp index 2cfe245c9e423..fa35722cc4a79 100644 --- a/sycl/test-e2e/Graph/Inputs/buffer_ordering.cpp +++ b/sycl/test-e2e/Graph/Inputs/buffer_ordering.cpp @@ -12,7 +12,6 @@ #include "../graph_common.hpp" int main() { - queue Queue{{sycl::ext::intel::property::queue::no_immediate_command_list{}}}; if (!are_graphs_supported(Queue)) { diff --git a/sycl/test-e2e/Graph/RecordReplay/dotp_multiple_queues.cpp b/sycl/test-e2e/Graph/RecordReplay/dotp_multiple_queues.cpp index c7695ce22dde0..977b008e8626b 100644 --- a/sycl/test-e2e/Graph/RecordReplay/dotp_multiple_queues.cpp +++ b/sycl/test-e2e/Graph/RecordReplay/dotp_multiple_queues.cpp @@ -10,7 +10,6 @@ #include "../graph_common.hpp" int main() { - property_list Properties{ property::queue::in_order{}, sycl::ext::intel::property::queue::no_immediate_command_list{}}; diff --git a/sycl/test-e2e/Graph/RecordReplay/executable_graph_update_ordering.cpp b/sycl/test-e2e/Graph/RecordReplay/executable_graph_update_ordering.cpp index 1d1244bb4add4..0f00572fb19ea 100644 --- a/sycl/test-e2e/Graph/RecordReplay/executable_graph_update_ordering.cpp +++ b/sycl/test-e2e/Graph/RecordReplay/executable_graph_update_ordering.cpp @@ -11,4 +11,4 @@ #define GRAPH_E2E_RECORD_REPLAY -#include "../Inputs/executable_graph_update_ordering" +#include "../Inputs/executable_graph_update_ordering.cpp" diff --git a/sycl/test-e2e/Graph/device_query.cpp b/sycl/test-e2e/Graph/device_query.cpp index 51746ae629885..ba492e1435b91 100644 --- a/sycl/test-e2e/Graph/device_query.cpp +++ b/sycl/test-e2e/Graph/device_query.cpp @@ -1,4 +1,4 @@ -// REQUIRES: cuda || level_zero, gpu +// REQUIRES: opencl || cuda || level_zero // RUN: %{build} -o %t.out // RUN: %{run} %t.out @@ -21,7 +21,7 @@ int main() { auto Backend = Device.get_backend(); if ((Backend == backend::ext_oneapi_level_zero) || - (Backend == backend::ext_oneapi_cuda)) { + (Backend == backend::ext_oneapi_cuda) || (Backend == backend::opencl)) { assert(SupportsGraphs == exp_ext::graph_support_level::native); } else { assert(SupportsGraphs == exp_ext::graph_support_level::unsupported); From 3244eb2f5c4518fcf37867e7ac2f38cb7192da3c Mon Sep 17 00:00:00 2001 From: Ewan Crawford Date: Tue, 14 Nov 2023 10:32:39 +0000 Subject: [PATCH 2/6] Update Documentation with OpenCL limiations Describe user facing limitations of OpenCL graphs backend and document failing tests this directly affects. --- sycl/doc/design/CommandGraph.md | 70 ++++++++++++++----- .../design/images/SYCL-Graph-Architecture.svg | 2 +- .../Explicit/buffer_copy_host2target.cpp | 3 + .../Explicit/buffer_copy_host2target_2d.cpp | 3 + .../buffer_copy_host2target_offset.cpp | 3 + .../Explicit/buffer_copy_target2host.cpp | 3 + .../Explicit/buffer_copy_target2host_2d.cpp | 3 + .../buffer_copy_target2host_offset.cpp | 3 + sycl/test-e2e/Graph/Explicit/usm_copy.cpp | 3 + .../RecordReplay/buffer_copy_host2target.cpp | 3 + .../buffer_copy_host2target_2d.cpp | 3 + .../buffer_copy_host2target_offset.cpp | 3 + .../RecordReplay/buffer_copy_target2host.cpp | 3 + .../buffer_copy_target2host_2d.cpp | 3 + .../buffer_copy_target2host_offset.cpp | 3 + sycl/test-e2e/Graph/RecordReplay/usm_copy.cpp | 3 + .../Graph/RecordReplay/usm_copy_in_order.cpp | 3 + sycl/test-e2e/Graph/device_query.cpp | 7 +- 18 files changed, 104 insertions(+), 20 deletions(-) diff --git a/sycl/doc/design/CommandGraph.md b/sycl/doc/design/CommandGraph.md index 755af9939a33e..c84fe57d4f68a 100644 --- a/sycl/doc/design/CommandGraph.md +++ b/sycl/doc/design/CommandGraph.md @@ -149,8 +149,8 @@ yet been implemented. Implementation of UR command-buffers for each of the supported SYCL 2020 backends. -Backends which are implemented currently are: Level Zero, CUDA and OpenCL. -More sub-sections will be added here as other backends are supported. +Backends which are implemented currently are: [Level Zero](#level-zero), +[CUDA](#cuda), and partial support for [OpenCL](#opencl). ### Level Zero @@ -249,9 +249,41 @@ graph resubmission. ### OpenCL -Command Buffers are defined in the OpenCL spec in the [cl_khr_command_buffer](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_command_buffer) extension. - -There are some gaps in both the OpenCL and UR specifications for Command +SYCL-Graphs is only enabled for an OpenCL backend when the +[cl_khr_command_buffer](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_command_buffer) +extension is available, however this information isn't available until runtime +due to OpenCL implementations being loaded through an ICD. + +The `ur_exp_command_buffer` string is conditionally returned from the OpenCL +command-buffer UR backend at runtime based on `cl_khr_command_buffer` support +to indicate that the graphs extension should be enabled. This is information +is propagated to the SYCL user via the +`device.get_info()` query for graphs extension +support. + +#### Limitations + +Due to the API mapping gaps documented in the following section, OpenCL as a +SYCL backend cannot fully support the graphs API. Instead there are +limitations in the types on nodes which a user can add to a graph, using +an unsupported node type will cause an abort in graph finalization with the +message +`ur_die: Experimental Command-buffer entry point is not implemented for OpenCL adapter.`. + +The types of commands which are unsupported, and lead to this exception are: +* `handler::copy(src, dest)` - Where `src` is an accessor and `dest` is a pointer. + This corresponds to a memory buffer read command. +* `handler::copy(src, dest)` - Where `src` is an pointer and `dest` is an accessor. + This corresponds to a memory buffer write command. +* `handler::copy(src, dest)` or `handler::memcpy(dest, src)` - Where both `src` and + `dest` are USM pointers. This corresponds to a USM copy command. + +Note that `handler::copy(src, dest)` where both `src` and `dest` are an accessor +is supported, as a memory buffer copy command exists in the OpenCL extension. + +#### UR API Mapping + +There are some gaps in both the OpenCL and UR specifications for Command Buffers shown in the list below. There are implementations in the UR OpenCL adapter where there is matching support for each function in the list. @@ -280,27 +312,29 @@ adapter where there is matching support for each function in the list. | | clCommandSVMMemcpyKHR | No | | | clCommandSVMMemFillKHR | No | +#### UR Command-Buffer Implementation + Many of the OpenCL functions take a `cl_command_queue` parameter which is not present in most of the UR functions. Instead, when a new command buffer is -created in `urCommandBufferCreateExp` we also create and maintain a new -internal `ur_queue_handle_t` with a reference stored inside of the -`ur_exp_command_buffer_handle_t_` struct. This internal queue is then used with -the various append functions. The internal queue is retained and released -whenever the owning command buffer is retained or released. +created in `urCommandBufferCreateExp` we also create and maintain a new +internal `ur_queue_handle_t` with a reference stored inside of the +`ur_exp_command_buffer_handle_t_` struct. The internal queue is retained and +released whenever the owning command buffer is retained or released. With command buffers being an OpenCL extension, each function is accessed by loading a function pointer to its implementation. These are defined in a common header file in the UR OpenCL adapter. The symbols for the functions are however -defined in [OpenCL-Headers](https://github.com/KhronosGroup/OpenCL-Headers/blob/main/CL/cl_ext.h) but it is not known at this time what version of the headers will be used in the UR GitHub CI configuration, so loading the function -pointers will be used until this can be verified. A future piece of work would -be replacing the custom defined symbols with the ones from OpenCL-Headers. +defined in [OpenCL-Headers](https://github.com/KhronosGroup/OpenCL-Headers/blob/main/CL/cl_ext.h) +but it is not known at this time what version of the headers will be used in +the UR GitHub CI configuration, so loading the function pointers will be used +until this can be verified. A future piece of work would be replacing the +custom defined symbols with the ones from OpenCL-Headers. + +#### Available OpenCL Command-Buffer Implementations -The `UR_DEVICE_INFO_EXTENSIONS` enum can be used with `urDeviceGetInfo` to -query if a specified device supports OpenCL command buffers. This will append -`ur_exp_command_buffer` to a string pointer passed to the function if the -extension is supported. +Publicly available implementations of `cl_khr_command_buffer` that can be used +to enable the graphs extension in OpenCL: -Known implementations of cl_khr_command_buffer: - [OneAPI-Construction-Kit](https://github.com/codeplaysoftware/oneapi-construction-kit) (must enable `OCL_EXTENSION_cl_khr_command_buffer` when building) - [PoCL](http://portablecl.org/) - [Command-Buffer Emulation Layer](https://github.com/bashbaug/SimpleOpenCLSamples/tree/main/layers/10_cmdbufemu) diff --git a/sycl/doc/design/images/SYCL-Graph-Architecture.svg b/sycl/doc/design/images/SYCL-Graph-Architecture.svg index f67db81aadbbb..c554391f6544d 100644 --- a/sycl/doc/design/images/SYCL-Graph-Architecture.svg +++ b/sycl/doc/design/images/SYCL-Graph-Architecture.svg @@ -1,4 +1,4 @@ -
Application
Application
SYCL-Graph Extension API
SYCL-Graph Extension API
SYCL Runtime
SYCL Runtime
Unified Runtime + Command Buffer Extension
Unified Runtime + Command Buffer Extension
CUDA
CUDA
NVIDIA GPU
NVIDIA GPU
Level Zero
Level Zero
Intel CPU, GPU, FPGA, ...
Intel CPU, GPU,...
HIP
HIP
OpenCL cl_khr_command_buffer
OpenCL cl_khr_command_b...
AMD GPU
AMD GPU
CPU, GPU, FPGA, ...
CPU, GPU, FPG...
SYCL-Graph Architecture
SYCL-Graph Architecture


Application Layer
Application Layer
Implemented Backend
Implemented Backend
SYCL Runtime
SYCL Runtime
Future Backend Support
Future Backend Support
Legend
Legend
Text is not SVG - cannot display
\ No newline at end of file +
Application
Application
SYCL-Graph Extension API
SYCL-Graph Extension API
SYCL Runtime
SYCL Runtime
Unified Runtime + Command Buffer Extension
Unified Runtime + Command Buffer Extension
CUDA
CUDA
NVIDIA GPU
NVIDIA GPU
Level Zero
Level Zero
Intel CPU, GPU, FPGA, ...
Intel CPU, GPU,...
HIP
HIP
OpenCL cl_khr_command_buffer
OpenCL cl_khr_command_b...
AMD GPU
AMD GPU
CPU, GPU, FPGA, ...
CPU, GPU, FPG...
SYCL-Graph Architecture
SYCL-Graph Architecture


Application Layer
Application Layer
Implemented Backend
Implemented Backend
SYCL Runtime
SYCL Runtime
Future Backend Support
Future Backend Support
Legend
Legend
Partial Backend Support
Partial Backend Support
Text is not SVG - cannot display
\ No newline at end of file diff --git a/sycl/test-e2e/Graph/Explicit/buffer_copy_host2target.cpp b/sycl/test-e2e/Graph/Explicit/buffer_copy_host2target.cpp index 436a3f2a19ccc..a0988262ed6e0 100644 --- a/sycl/test-e2e/Graph/Explicit/buffer_copy_host2target.cpp +++ b/sycl/test-e2e/Graph/Explicit/buffer_copy_host2target.cpp @@ -7,6 +7,9 @@ // // TODO enable cuda once buffer issue investigated and fixed // UNSUPPORTED: cuda +// +// Host to device copy command not supported for OpenCL +// UNSUPPORTED: opencl #define GRAPH_E2E_EXPLICIT diff --git a/sycl/test-e2e/Graph/Explicit/buffer_copy_host2target_2d.cpp b/sycl/test-e2e/Graph/Explicit/buffer_copy_host2target_2d.cpp index f7d335cdac801..523e0f840d884 100644 --- a/sycl/test-e2e/Graph/Explicit/buffer_copy_host2target_2d.cpp +++ b/sycl/test-e2e/Graph/Explicit/buffer_copy_host2target_2d.cpp @@ -7,6 +7,9 @@ // // TODO enable cuda once buffer issue investigated and fixed // UNSUPPORTED: cuda +// +// Host to device copy command not supported for OpenCL +// UNSUPPORTED: opencl #define GRAPH_E2E_EXPLICIT diff --git a/sycl/test-e2e/Graph/Explicit/buffer_copy_host2target_offset.cpp b/sycl/test-e2e/Graph/Explicit/buffer_copy_host2target_offset.cpp index 35d0c5f993622..59787ad84b26d 100644 --- a/sycl/test-e2e/Graph/Explicit/buffer_copy_host2target_offset.cpp +++ b/sycl/test-e2e/Graph/Explicit/buffer_copy_host2target_offset.cpp @@ -7,6 +7,9 @@ // // TODO enable cuda once buffer issue investigated and fixed // UNSUPPORTED: cuda +// +// Host to device copy command not supported for OpenCL +// UNSUPPORTED: opencl #define GRAPH_E2E_EXPLICIT diff --git a/sycl/test-e2e/Graph/Explicit/buffer_copy_target2host.cpp b/sycl/test-e2e/Graph/Explicit/buffer_copy_target2host.cpp index a0a0eddbdad60..8f202b34a376e 100644 --- a/sycl/test-e2e/Graph/Explicit/buffer_copy_target2host.cpp +++ b/sycl/test-e2e/Graph/Explicit/buffer_copy_target2host.cpp @@ -7,6 +7,9 @@ // // TODO enable cuda once buffer issue investigated and fixed // UNSUPPORTED: cuda +// +// Device to host copy command not supported for OpenCL +// UNSUPPORTED: opencl #define GRAPH_E2E_EXPLICIT diff --git a/sycl/test-e2e/Graph/Explicit/buffer_copy_target2host_2d.cpp b/sycl/test-e2e/Graph/Explicit/buffer_copy_target2host_2d.cpp index e8a53c16d6c9a..97d2bbdbaaf8f 100644 --- a/sycl/test-e2e/Graph/Explicit/buffer_copy_target2host_2d.cpp +++ b/sycl/test-e2e/Graph/Explicit/buffer_copy_target2host_2d.cpp @@ -7,6 +7,9 @@ // // TODO enable cuda once buffer issue investigated and fixed // UNSUPPORTED: cuda +// +// Device to host copy command not supported for OpenCL +// UNSUPPORTED: opencl #define GRAPH_E2E_EXPLICIT diff --git a/sycl/test-e2e/Graph/Explicit/buffer_copy_target2host_offset.cpp b/sycl/test-e2e/Graph/Explicit/buffer_copy_target2host_offset.cpp index 46eddac11c147..f964718020728 100644 --- a/sycl/test-e2e/Graph/Explicit/buffer_copy_target2host_offset.cpp +++ b/sycl/test-e2e/Graph/Explicit/buffer_copy_target2host_offset.cpp @@ -7,6 +7,9 @@ // // TODO enable cuda once buffer issue investigated and fixed // UNSUPPORTED: cuda +// +// Device to host copy command not supported for OpenCL +// UNSUPPORTED: opencl #define GRAPH_E2E_EXPLICIT diff --git a/sycl/test-e2e/Graph/Explicit/usm_copy.cpp b/sycl/test-e2e/Graph/Explicit/usm_copy.cpp index 15583b6871329..733974ed79b02 100644 --- a/sycl/test-e2e/Graph/Explicit/usm_copy.cpp +++ b/sycl/test-e2e/Graph/Explicit/usm_copy.cpp @@ -4,6 +4,9 @@ // RUN: %if ext_oneapi_level_zero %{env UR_L0_LEAKS_DEBUG=1 %{run} %t.out 2>&1 | FileCheck %s %} // // CHECK-NOT: LEAK +// +// USM copy command not supported for OpenCL +// UNSUPPORTED: opencl #define GRAPH_E2E_EXPLICIT diff --git a/sycl/test-e2e/Graph/RecordReplay/buffer_copy_host2target.cpp b/sycl/test-e2e/Graph/RecordReplay/buffer_copy_host2target.cpp index 2fb7c6890759d..6b8b45ee7b496 100644 --- a/sycl/test-e2e/Graph/RecordReplay/buffer_copy_host2target.cpp +++ b/sycl/test-e2e/Graph/RecordReplay/buffer_copy_host2target.cpp @@ -7,6 +7,9 @@ // // TODO enable cuda once buffer issue investigated and fixed // UNSUPPORTED: cuda +// +// Host to device copy command not supported for OpenCL +// UNSUPPORTED: opencl #define GRAPH_E2E_RECORD_REPLAY diff --git a/sycl/test-e2e/Graph/RecordReplay/buffer_copy_host2target_2d.cpp b/sycl/test-e2e/Graph/RecordReplay/buffer_copy_host2target_2d.cpp index 41cf94d4098e1..54ad5900a9f2a 100644 --- a/sycl/test-e2e/Graph/RecordReplay/buffer_copy_host2target_2d.cpp +++ b/sycl/test-e2e/Graph/RecordReplay/buffer_copy_host2target_2d.cpp @@ -7,6 +7,9 @@ // // TODO enable cuda once buffer issue investigated and fixed // UNSUPPORTED: cuda +// +// Host to device copy command not supported for OpenCL +// UNSUPPORTED: opencl #define GRAPH_E2E_RECORD_REPLAY diff --git a/sycl/test-e2e/Graph/RecordReplay/buffer_copy_host2target_offset.cpp b/sycl/test-e2e/Graph/RecordReplay/buffer_copy_host2target_offset.cpp index 0bc5dc610b85f..104751c8971d5 100644 --- a/sycl/test-e2e/Graph/RecordReplay/buffer_copy_host2target_offset.cpp +++ b/sycl/test-e2e/Graph/RecordReplay/buffer_copy_host2target_offset.cpp @@ -7,6 +7,9 @@ // // TODO enable cuda once buffer issue investigated and fixed // UNSUPPORTED: cuda +// +// Host to device copy command not supported for OpenCL +// UNSUPPORTED: opencl #define GRAPH_E2E_RECORD_REPLAY diff --git a/sycl/test-e2e/Graph/RecordReplay/buffer_copy_target2host.cpp b/sycl/test-e2e/Graph/RecordReplay/buffer_copy_target2host.cpp index 29cf2dc70a62a..5562bc6009f8a 100644 --- a/sycl/test-e2e/Graph/RecordReplay/buffer_copy_target2host.cpp +++ b/sycl/test-e2e/Graph/RecordReplay/buffer_copy_target2host.cpp @@ -7,6 +7,9 @@ // // TODO enable cuda once buffer issue investigated and fixed // UNSUPPORTED: cuda +// +// Device to host copy command not supported for OpenCL +// UNSUPPORTED: opencl #define GRAPH_E2E_RECORD_REPLAY diff --git a/sycl/test-e2e/Graph/RecordReplay/buffer_copy_target2host_2d.cpp b/sycl/test-e2e/Graph/RecordReplay/buffer_copy_target2host_2d.cpp index e558c77804866..197b2687e1da4 100644 --- a/sycl/test-e2e/Graph/RecordReplay/buffer_copy_target2host_2d.cpp +++ b/sycl/test-e2e/Graph/RecordReplay/buffer_copy_target2host_2d.cpp @@ -7,6 +7,9 @@ // // TODO enable cuda once buffer issue investigated and fixed // UNSUPPORTED: cuda +// +// Device to host copy command not supported for OpenCL +// UNSUPPORTED: opencl #define GRAPH_E2E_RECORD_REPLAY diff --git a/sycl/test-e2e/Graph/RecordReplay/buffer_copy_target2host_offset.cpp b/sycl/test-e2e/Graph/RecordReplay/buffer_copy_target2host_offset.cpp index d8a64b86c3dfb..3de72477f925d 100644 --- a/sycl/test-e2e/Graph/RecordReplay/buffer_copy_target2host_offset.cpp +++ b/sycl/test-e2e/Graph/RecordReplay/buffer_copy_target2host_offset.cpp @@ -7,6 +7,9 @@ // // TODO enable cuda once buffer issue investigated and fixed // UNSUPPORTED: cuda +// +// Device to host copy command not supported for OpenCL +// UNSUPPORTED: opencl #define GRAPH_E2E_RECORD_REPLAY diff --git a/sycl/test-e2e/Graph/RecordReplay/usm_copy.cpp b/sycl/test-e2e/Graph/RecordReplay/usm_copy.cpp index 25b287456fc78..4f8147c3b2f29 100644 --- a/sycl/test-e2e/Graph/RecordReplay/usm_copy.cpp +++ b/sycl/test-e2e/Graph/RecordReplay/usm_copy.cpp @@ -4,6 +4,9 @@ // RUN: %if ext_oneapi_level_zero %{env UR_L0_LEAKS_DEBUG=1 %{run} %t.out 2>&1 | FileCheck %s %} // // CHECK-NOT: LEAK +// +// USM copy command not supported for OpenCL +// UNSUPPORTED: opencl #define GRAPH_E2E_RECORD_REPLAY diff --git a/sycl/test-e2e/Graph/RecordReplay/usm_copy_in_order.cpp b/sycl/test-e2e/Graph/RecordReplay/usm_copy_in_order.cpp index 30b2f312ebb36..f3043e0bd5341 100644 --- a/sycl/test-e2e/Graph/RecordReplay/usm_copy_in_order.cpp +++ b/sycl/test-e2e/Graph/RecordReplay/usm_copy_in_order.cpp @@ -4,6 +4,9 @@ // RUN: %if ext_oneapi_level_zero %{env UR_L0_LEAKS_DEBUG=1 %{run} %t.out 2>&1 | FileCheck %s %} // // CHECK-NOT: LEAK +// +// USM copy command not supported for OpenCL +// UNSUPPORTED: opencl // Tests memcpy operation using device USM and an in-order queue. diff --git a/sycl/test-e2e/Graph/device_query.cpp b/sycl/test-e2e/Graph/device_query.cpp index ba492e1435b91..0b8bb11a3921c 100644 --- a/sycl/test-e2e/Graph/device_query.cpp +++ b/sycl/test-e2e/Graph/device_query.cpp @@ -21,8 +21,13 @@ int main() { auto Backend = Device.get_backend(); if ((Backend == backend::ext_oneapi_level_zero) || - (Backend == backend::ext_oneapi_cuda) || (Backend == backend::opencl)) { + (Backend == backend::ext_oneapi_cuda)) { assert(SupportsGraphs == exp_ext::graph_support_level::native); + } else if (Backend == backend::opencl) { + // OpenCL backend support is conditional on the cl_khr_command_buffer + // extension being available + assert(SupportsGraphs == exp_ext::graph_support_level::native || + SupportsGraphs == exp_ext::graph_support_level::unsupported); } else { assert(SupportsGraphs == exp_ext::graph_support_level::unsupported); } From 3756725a42ba5ea929c65b6a7338ba1627725779 Mon Sep 17 00:00:00 2001 From: Ewan Crawford Date: Tue, 14 Nov 2023 15:38:38 +0000 Subject: [PATCH 3/6] Throw exception on unsupported entry-point Describe in exception message the name of the unsupported entry-point. --- sycl/doc/design/CommandGraph.md | 14 +++- sycl/plugins/unified_runtime/pi2ur.hpp | 1 + sycl/source/detail/memory_manager.cpp | 89 ++++++++++++++++++++------ 3 files changed, 80 insertions(+), 24 deletions(-) diff --git a/sycl/doc/design/CommandGraph.md b/sycl/doc/design/CommandGraph.md index c84fe57d4f68a..d994edc02fb43 100644 --- a/sycl/doc/design/CommandGraph.md +++ b/sycl/doc/design/CommandGraph.md @@ -266,9 +266,14 @@ support. Due to the API mapping gaps documented in the following section, OpenCL as a SYCL backend cannot fully support the graphs API. Instead there are limitations in the types on nodes which a user can add to a graph, using -an unsupported node type will cause an abort in graph finalization with the -message -`ur_die: Experimental Command-buffer entry point is not implemented for OpenCL adapter.`. +an unsupported node type will cause a sycl exception to be throw in graph +finalization with error code `sycl::errc::feature_not_support` and a message +mentioning the unsupported command. For example, + +``` +terminate called after throwing an instance of 'sycl::_V1::exception' +what(): USM copy command not supported by graph backend +``` The types of commands which are unsupported, and lead to this exception are: * `handler::copy(src, dest)` - Where `src` is an accessor and `dest` is a pointer. @@ -312,6 +317,9 @@ adapter where there is matching support for each function in the list. | | clCommandSVMMemcpyKHR | No | | | clCommandSVMMemFillKHR | No | +We are looking to address these gaps in the future so that SYCL-Graphs can be +fully supported on a `cl_khr_command_buffer` backend. + #### UR Command-Buffer Implementation Many of the OpenCL functions take a `cl_command_queue` parameter which is not diff --git a/sycl/plugins/unified_runtime/pi2ur.hpp b/sycl/plugins/unified_runtime/pi2ur.hpp index e12184677a00b..cda2d80aa7ce5 100644 --- a/sycl/plugins/unified_runtime/pi2ur.hpp +++ b/sycl/plugins/unified_runtime/pi2ur.hpp @@ -101,6 +101,7 @@ static pi_result ur2piResult(ur_result_t urResult) { return PI_ERROR_LINK_PROGRAM_FAILURE; case UR_RESULT_ERROR_UNSUPPORTED_VERSION: case UR_RESULT_ERROR_UNSUPPORTED_FEATURE: + return PI_ERROR_INVALID_OPERATION; case UR_RESULT_ERROR_INVALID_ARGUMENT: case UR_RESULT_ERROR_INVALID_NULL_HANDLE: case UR_RESULT_ERROR_HANDLE_OBJECT_IN_USE: diff --git a/sycl/source/detail/memory_manager.cpp b/sycl/source/detail/memory_manager.cpp index bf8483a47ce09..ae357a8f4fe5b 100644 --- a/sycl/source/detail/memory_manager.cpp +++ b/sycl/source/detail/memory_manager.cpp @@ -1518,10 +1518,20 @@ void MemoryManager::ext_oneapi_copyD2H_cmd_buffer( } if (1 == DimDst && 1 == DimSrc) { - Plugin->call( - CommandBuffer, sycl::detail::pi::cast(SrcMem), - SrcXOffBytes, SrcAccessRangeWidthBytes, DstMem + DstXOffBytes, - Deps.size(), Deps.data(), OutSyncPoint); + pi_result Result = + Plugin->call_nocheck( + CommandBuffer, + sycl::detail::pi::cast(SrcMem), + SrcXOffBytes, SrcAccessRangeWidthBytes, DstMem + DstXOffBytes, + Deps.size(), Deps.data(), OutSyncPoint); + + if (Result == PI_ERROR_INVALID_OPERATION) { + throw sycl::exception( + sycl::make_error_code(sycl::errc::feature_not_supported), + "Device-to-host buffer copy command not supported by graph backend"); + } else { + Plugin->checkPiResult(Result); + } } else { size_t BufferRowPitch = (1 == DimSrc) ? 0 : SrcSzWidthBytes; size_t BufferSlicePitch = @@ -1538,11 +1548,20 @@ void MemoryManager::ext_oneapi_copyD2H_cmd_buffer( SrcAccessRange[SrcPos.YTerm], SrcAccessRange[SrcPos.ZTerm]}; - Plugin->call( - CommandBuffer, sycl::detail::pi::cast(SrcMem), - &BufferOffset, &HostOffset, &RectRegion, BufferRowPitch, - BufferSlicePitch, HostRowPitch, HostSlicePitch, DstMem, Deps.size(), - Deps.data(), OutSyncPoint); + pi_result Result = + Plugin->call_nocheck( + CommandBuffer, + sycl::detail::pi::cast(SrcMem), + &BufferOffset, &HostOffset, &RectRegion, BufferRowPitch, + BufferSlicePitch, HostRowPitch, HostSlicePitch, DstMem, Deps.size(), + Deps.data(), OutSyncPoint); + if (Result == PI_ERROR_INVALID_OPERATION) { + throw sycl::exception( + sycl::make_error_code(sycl::errc::feature_not_supported), + "Device-to-host buffer copy command not supported by graph backend"); + } else { + Plugin->checkPiResult(Result); + } } } @@ -1576,10 +1595,20 @@ void MemoryManager::ext_oneapi_copyH2D_cmd_buffer( } if (1 == DimDst && 1 == DimSrc) { - Plugin->call( - CommandBuffer, sycl::detail::pi::cast(DstMem), - DstXOffBytes, DstAccessRangeWidthBytes, SrcMem + SrcXOffBytes, - Deps.size(), Deps.data(), OutSyncPoint); + pi_result Result = + Plugin->call_nocheck( + CommandBuffer, + sycl::detail::pi::cast(DstMem), + DstXOffBytes, DstAccessRangeWidthBytes, SrcMem + SrcXOffBytes, + Deps.size(), Deps.data(), OutSyncPoint); + + if (Result == PI_ERROR_INVALID_OPERATION) { + throw sycl::exception( + sycl::make_error_code(sycl::errc::feature_not_supported), + "Host-to-device buffer copy command not supported by graph backend"); + } else { + Plugin->checkPiResult(Result); + } } else { size_t BufferRowPitch = (1 == DimDst) ? 0 : DstSzWidthBytes; size_t BufferSlicePitch = @@ -1596,11 +1625,21 @@ void MemoryManager::ext_oneapi_copyH2D_cmd_buffer( DstAccessRange[DstPos.YTerm], DstAccessRange[DstPos.ZTerm]}; - Plugin->call( - CommandBuffer, sycl::detail::pi::cast(DstMem), - &BufferOffset, &HostOffset, &RectRegion, BufferRowPitch, - BufferSlicePitch, HostRowPitch, HostSlicePitch, SrcMem, Deps.size(), - Deps.data(), OutSyncPoint); + pi_result Result = + Plugin->call_nocheck( + CommandBuffer, + sycl::detail::pi::cast(DstMem), + &BufferOffset, &HostOffset, &RectRegion, BufferRowPitch, + BufferSlicePitch, HostRowPitch, HostSlicePitch, SrcMem, Deps.size(), + Deps.data(), OutSyncPoint); + + if (Result == PI_ERROR_INVALID_OPERATION) { + throw sycl::exception( + sycl::make_error_code(sycl::errc::feature_not_supported), + "Host-to-device buffer copy command not supported by graph backend"); + } else { + Plugin->checkPiResult(Result); + } } } @@ -1614,9 +1653,17 @@ void MemoryManager::ext_oneapi_copy_usm_cmd_buffer( PI_ERROR_INVALID_VALUE); const PluginPtr &Plugin = Context->getPlugin(); - Plugin->call( - CommandBuffer, DstMem, SrcMem, Len, Deps.size(), Deps.data(), - OutSyncPoint); + pi_result Result = + Plugin->call_nocheck( + CommandBuffer, DstMem, SrcMem, Len, Deps.size(), Deps.data(), + OutSyncPoint); + if (Result == PI_ERROR_INVALID_OPERATION) { + throw sycl::exception( + sycl::make_error_code(sycl::errc::feature_not_supported), + "USM copy command not supported by graph backend"); + } else { + Plugin->checkPiResult(Result); + } } void MemoryManager::copy_image_bindless( From 0c84fa5ecb1a7ef17af9d1d280d4d2cb2de42edf Mon Sep 17 00:00:00 2001 From: Martin Morrison-Grant Date: Tue, 21 Nov 2023 10:04:13 +0000 Subject: [PATCH 4/6] Update docs wording and removing REQUIRES from devicve_query.cpp --- sycl/doc/design/CommandGraph.md | 8 ++++---- sycl/test-e2e/Graph/device_query.cpp | 1 - 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/sycl/doc/design/CommandGraph.md b/sycl/doc/design/CommandGraph.md index d994edc02fb43..35fcd7d0523e7 100644 --- a/sycl/doc/design/CommandGraph.md +++ b/sycl/doc/design/CommandGraph.md @@ -265,9 +265,9 @@ support. Due to the API mapping gaps documented in the following section, OpenCL as a SYCL backend cannot fully support the graphs API. Instead there are -limitations in the types on nodes which a user can add to a graph, using -an unsupported node type will cause a sycl exception to be throw in graph -finalization with error code `sycl::errc::feature_not_support` and a message +limitations in the types of nodes which a user can add to a graph, using +an unsupported node type will cause a sycl exception to be thrown in graph +finalization with error code `sycl::errc::feature_not_supported` and a message mentioning the unsupported command. For example, ``` @@ -343,7 +343,7 @@ custom defined symbols with the ones from OpenCL-Headers. Publicly available implementations of `cl_khr_command_buffer` that can be used to enable the graphs extension in OpenCL: -- [OneAPI-Construction-Kit](https://github.com/codeplaysoftware/oneapi-construction-kit) (must enable `OCL_EXTENSION_cl_khr_command_buffer` when building) +- [OneAPI Construction Kit](https://github.com/codeplaysoftware/oneapi-construction-kit) (must enable `OCL_EXTENSION_cl_khr_command_buffer` when building) - [PoCL](http://portablecl.org/) - [Command-Buffer Emulation Layer](https://github.com/bashbaug/SimpleOpenCLSamples/tree/main/layers/10_cmdbufemu) diff --git a/sycl/test-e2e/Graph/device_query.cpp b/sycl/test-e2e/Graph/device_query.cpp index 0b8bb11a3921c..f1f2e680c323a 100644 --- a/sycl/test-e2e/Graph/device_query.cpp +++ b/sycl/test-e2e/Graph/device_query.cpp @@ -1,4 +1,3 @@ -// REQUIRES: opencl || cuda || level_zero // RUN: %{build} -o %t.out // RUN: %{run} %t.out From 7edd4e335c5fa1f61c54c4781b3d843feec175ba Mon Sep 17 00:00:00 2001 From: "Kenneth Benzie (Benie)" Date: Mon, 27 Nov 2023 10:56:14 +0000 Subject: [PATCH 5/6] [UR] Bump tag to 109ed46e --- sycl/plugins/unified_runtime/CMakeLists.txt | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/sycl/plugins/unified_runtime/CMakeLists.txt b/sycl/plugins/unified_runtime/CMakeLists.txt index f58bababc20a2..a0d717a6ded45 100644 --- a/sycl/plugins/unified_runtime/CMakeLists.txt +++ b/sycl/plugins/unified_runtime/CMakeLists.txt @@ -56,14 +56,14 @@ endif() if(SYCL_PI_UR_USE_FETCH_CONTENT) include(FetchContent) - set(UNIFIED_RUNTIME_REPO "https://github.com/martygrant/unified-runtime.git") - # commit ec7982bac6cb3a6b9ed610cd6b7cb41fcbc780dc - # Merge: 62e6d2f9 5fb82924 + set(UNIFIED_RUNTIME_REPO "https://github.com/oneapi-src/unified-runtime.git") + # commit 109ed46ee55f41317d35b2a9a20fa7a2029e9e64 + # Merge: 31b654f9 23005313 # Author: Kenneth Benzie (Benie) - # Date: Wed Nov 8 13:32:46 2023 +0000 - # Merge pull request #1022 from 0x12CC/l0_usm_error_checking_2 - # [UR][L0] Propagate OOM errors from `USMAllocationMakeResident` - set(UNIFIED_RUNTIME_TAG martin/openclCommandBuffers) + # Date: Wed Nov 22 16:04:52 2023 +0000 + # Merge pull request #966 from martygrant/martin/openclCommandBuffers + # [OpenCL] Add Command Buffer extension to OpenCL adapter. + set(UNIFIED_RUNTIME_TAG 109ed46ee55f41317d35b2a9a20fa7a2029e9e64) if(SYCL_PI_UR_OVERRIDE_FETCH_CONTENT_REPO) set(UNIFIED_RUNTIME_REPO "${SYCL_PI_UR_OVERRIDE_FETCH_CONTENT_REPO}") From 5d3869988407939857aa83087559a8fb35627fea Mon Sep 17 00:00:00 2001 From: Martin Morrison-Grant Date: Tue, 28 Nov 2023 10:44:46 +0000 Subject: [PATCH 6/6] Update OpenCL wording in SYCL-Graph documentation. --- sycl/doc/design/CommandGraph.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/sycl/doc/design/CommandGraph.md b/sycl/doc/design/CommandGraph.md index 35fcd7d0523e7..47c835d664330 100644 --- a/sycl/doc/design/CommandGraph.md +++ b/sycl/doc/design/CommandGraph.md @@ -249,22 +249,22 @@ graph resubmission. ### OpenCL -SYCL-Graphs is only enabled for an OpenCL backend when the +SYCL-Graph is only enabled for an OpenCL backend when the [cl_khr_command_buffer](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_command_buffer) extension is available, however this information isn't available until runtime due to OpenCL implementations being loaded through an ICD. The `ur_exp_command_buffer` string is conditionally returned from the OpenCL command-buffer UR backend at runtime based on `cl_khr_command_buffer` support -to indicate that the graphs extension should be enabled. This is information +to indicate that the graph extension should be enabled. This is information is propagated to the SYCL user via the -`device.get_info()` query for graphs extension +`device.get_info()` query for graph extension support. #### Limitations Due to the API mapping gaps documented in the following section, OpenCL as a -SYCL backend cannot fully support the graphs API. Instead there are +SYCL backend cannot fully support the graph API. Instead, there are limitations in the types of nodes which a user can add to a graph, using an unsupported node type will cause a sycl exception to be thrown in graph finalization with error code `sycl::errc::feature_not_supported` and a message @@ -317,7 +317,7 @@ adapter where there is matching support for each function in the list. | | clCommandSVMMemcpyKHR | No | | | clCommandSVMMemFillKHR | No | -We are looking to address these gaps in the future so that SYCL-Graphs can be +We are looking to address these gaps in the future so that SYCL-Graph can be fully supported on a `cl_khr_command_buffer` backend. #### UR Command-Buffer Implementation @@ -341,9 +341,9 @@ custom defined symbols with the ones from OpenCL-Headers. #### Available OpenCL Command-Buffer Implementations Publicly available implementations of `cl_khr_command_buffer` that can be used -to enable the graphs extension in OpenCL: +to enable the graph extension in OpenCL: - [OneAPI Construction Kit](https://github.com/codeplaysoftware/oneapi-construction-kit) (must enable `OCL_EXTENSION_cl_khr_command_buffer` when building) - [PoCL](http://portablecl.org/) -- [Command-Buffer Emulation Layer](https://github.com/bashbaug/SimpleOpenCLSamples/tree/main/layers/10_cmdbufemu) +- [Command-Buffer Emulation Layer](https://github.com/bashbaug/SimpleOpenCLSamples/tree/efeae73139ddf064fafce565cc39640af10d900f/layers/10_cmdbufemu)