[NATIVECPU] Support reqd_work_group_size on Native CPU #1477

PietroGhg · 2024-03-27T10:36:03Z

Adds support for the reqd_work_group_size kernel attribute. The metadata handling mechanism is similar to the one used in the CUDA adapter.
DPC++ PR: intel/llvm#13175

uwedolinsky · 2024-04-05T08:39:56Z

source/adapters/native_cpu/kernel.hpp

@@ -67,6 +78,10 @@ struct ur_kernel_handle_t_ : RefCounted {
    }
  }

+  bool hasReqdWGSize() { return HasReqdWGSize; }


These two new methods, hasReqdWGSize and getReqdWGSize, could be made const, but that could be done in a subsequent PR so as not to hold up this one.

Done, thank you
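
For readers following the thread, a minimal sketch of what const-qualified accessors could look like. `KernelSketch` and the `ReqdWGSize_t` alias below are hypothetical stand-ins, not the adapter's actual definitions; only the method names `hasReqdWGSize` and `getReqdWGSize` come from the review comment.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumption: a stand-in for native_cpu::ReqdWGSize_t; the real type may differ.
using ReqdWGSize_t = std::array<std::uint32_t, 3>;

// Hypothetical, reduced stand-in for ur_kernel_handle_t_ holding only the
// two fields discussed in this thread.
struct KernelSketch {
  KernelSketch() = default;
  explicit KernelSketch(const ReqdWGSize_t &Size)
      : HasReqdWGSize(true), ReqdWGSize(Size) {}

  // const-qualified, so both can be called through a const reference.
  bool hasReqdWGSize() const { return HasReqdWGSize; }
  const ReqdWGSize_t &getReqdWGSize() const { return ReqdWGSize; }

private:
  bool HasReqdWGSize = false;
  ReqdWGSize_t ReqdWGSize{};
};

// A read-only helper: it compiles only because the accessors are const.
inline bool matchesReqdWGSize(const KernelSketch &Kernel,
                              const ReqdWGSize_t &Local) {
  return !Kernel.hasReqdWGSize() || Kernel.getReqdWGSize() == Local;
}
```

Without the `const` qualifiers, `matchesReqdWGSize` would have to take the kernel by non-const reference even though it never modifies it.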

uwedolinsky · 2024-04-05T09:19:44Z

source/adapters/native_cpu/program.cpp

+  size_t MDElemsSize = MetadataElement.size - sizeof(std::uint64_t);
+
+  // Expect between 1 and 3 32-bit integer values.
+  UR_ASSERT(MDElemsSize >= sizeof(std::uint32_t) &&


It might be better to check directly for the three expected multiples of sizeof(std::uint32_t), in case the metadata format has changed or been corrupted. This would also catch a possible underflow in MetadataElement.size - sizeof(std::uint64_t), or a case where MDElemsSize is not a multiple of sizeof(std::uint32_t) for whatever reason.

Yeah you are right, I've done that
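
The suggestion above can be sketched roughly as follows. This is not the code that landed in the PR: `decodeReqdWGSize` is a hypothetical helper, the layout (an 8-byte header followed by one to three host-order 32-bit values) is inferred from the quoted diff, and padding missing dimensions with 1 is an assumption made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Assumption: a stand-in for native_cpu::ReqdWGSize_t.
using ReqdWGSize_t = std::array<std::uint32_t, 3>;

// Hypothetical helper: decodes a blob made of an 8-byte header followed by
// 1, 2 or 3 uint32_t values. Any other total size is rejected.
inline std::optional<ReqdWGSize_t> decodeReqdWGSize(const void *Data,
                                                    std::size_t Size) {
  constexpr std::size_t Header = sizeof(std::uint64_t);
  constexpr std::size_t Elem = sizeof(std::uint32_t);

  // Compare against the three accepted totals before subtracting anything,
  // so a too-small Size cannot wrap around and a size that is not a whole
  // number of elements is rejected as well.
  const bool SizeOk = Size == Header + 1 * Elem ||
                      Size == Header + 2 * Elem ||
                      Size == Header + 3 * Elem;
  if (Data == nullptr || !SizeOk)
    return std::nullopt;

  const std::size_t Count = (Size - Header) / Elem;
  ReqdWGSize_t Result{1, 1, 1}; // assumption: unspecified dimensions are 1
  std::memcpy(Result.data(),
              static_cast<const unsigned char *>(Data) + Header,
              Count * Elem);
  return Result;
}
```

Checking the total size first means the subtraction only happens once the size is known to be valid, which addresses both the underflow and the non-multiple cases in one condition.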

uwedolinsky · 2024-04-05T09:22:53Z

source/adapters/native_cpu/program.cpp

+      if (Tag == __SYCL_UR_PROGRAM_METADATA_TAG_REQD_WORK_GROUP_SIZE) {
+        native_cpu::ReqdWGSize_t reqdWGSize;
+        getReqdWGSize(mdNode, reqdWGSize);
+        hProgram->KernelReqdWorkGroupSizeMD[Prefix] = reqdWGSize;


Minor: The right-hand side could perhaps be std::move(reqdWGSize) which may not make a difference now, but just in case native_cpu::ReqdWGSize_t becomes a moveable type in the future.

Done, thank you

uwedolinsky · 2024-04-18T12:03:41Z

source/adapters/native_cpu/program.cpp

+      auto [Prefix, Tag] = splitMetadataName(mdName);
+      if (Tag == __SYCL_UR_PROGRAM_METADATA_TAG_REQD_WORK_GROUP_SIZE) {
+        native_cpu::ReqdWGSize_t reqdWGSize;
+        getReqdWGSize(mdNode, reqdWGSize);


Checking the value returned by getReqdWGSize seems to be missing (unless UR_ASSERT is implemented as throwing an exception)

Thanks for spotting it, I've added a check
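
A minimal sketch of the pattern being asked for: propagate the status returned by the parsing helper before storing anything. The `Result` enum, `getReqdWGSizeSketch`, and `recordReqdWGSize` are hypothetical stand-ins for `ur_result_t`, the PR's `getReqdWGSize`, and the surrounding metadata loop, and `std::map` is assumed for the container.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Assumptions: stand-ins for ur_result_t and native_cpu::ReqdWGSize_t.
enum class Result { Success, InvalidMetadata };
using ReqdWGSize_t = std::array<std::uint32_t, 3>;

// Hypothetical parser: accepts 1 to 3 values and pads the rest with 1.
inline Result getReqdWGSizeSketch(const std::vector<std::uint32_t> &Values,
                                  ReqdWGSize_t &Out) {
  if (Values.empty() || Values.size() > 3)
    return Result::InvalidMetadata;
  Out = {1, 1, 1};
  for (std::size_t I = 0; I < Values.size(); ++I)
    Out[I] = Values[I];
  return Result::Success;
}

// Caller: checks the status first, so a failed parse never reaches the map.
inline Result recordReqdWGSize(std::map<std::string, ReqdWGSize_t> &MD,
                               const std::string &Prefix,
                               const std::vector<std::uint32_t> &Values) {
  ReqdWGSize_t Size{};
  const Result Res = getReqdWGSizeSketch(Values, Size);
  if (Res != Result::Success)
    return Res;
  MD[Prefix] = std::move(Size);
  return Result::Success;
}
```

If the helper reports errors by return value rather than by throwing, skipping the check would store a default-initialized size under the kernel's prefix.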

uwedolinsky · 2024-04-18T14:23:20Z

source/adapters/native_cpu/kernel.hpp


  ur_kernel_handle_t_(const ur_kernel_handle_t_ &other)
      : _name(other._name), _subhandler(other._subhandler), _args(other._args),
        _localArgInfo(other._localArgInfo), _localMemPool(other._localMemPool),
-        _localMemPoolSize(other._localMemPoolSize) {
+        _localMemPoolSize(other._localMemPoolSize),
+        HasReqdWGSize(other.HasReqdWGSize), ReqdWGSize(other.ReqdWGSize) {


Should there also be a hProgram(other.hProgram) in the member initialiser list?
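
To illustrate why the question matters, here is a small sketch with hypothetical types (`ProgramSketch`, `KernelCopySketch`), not the adapter's real classes. A hand-written copy constructor copies only the members it names, so a member left out of the initializer list falls back to its default initializer instead of the source object's value.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the program a kernel belongs to.
struct ProgramSketch {
  int Id = 0;
};

// Hypothetical, reduced stand-in for a kernel handle.
struct KernelCopySketch {
  KernelCopySketch(ProgramSketch *Program, std::string KernelName)
      : hProgram(Program), Name(std::move(KernelName)) {}

  // Every member is listed explicitly. Dropping hProgram(Other.hProgram)
  // here would leave the copy with hProgram == nullptr.
  KernelCopySketch(const KernelCopySketch &Other)
      : hProgram(Other.hProgram), Name(Other.Name),
        HasReqdWGSize(Other.HasReqdWGSize) {}

  ProgramSketch *hProgram = nullptr;
  std::string Name;
  bool HasReqdWGSize = false;
};
```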

uwedolinsky · 2024-04-23T11:01:42Z

source/adapters/native_cpu/program.cpp

+        if (res != UR_RESULT_SUCCESS) {
+          return res;
+        }
+        hProgram->KernelReqdWorkGroupSizeMD[Prefix] = std::move(reqdWGSize);


Minor (for a later PR): It might be worth diagnosing when an existing prefix gets overwritten with a different reqdWGSize
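
One possible shape for such a diagnostic, sketched under assumptions: `insertReqdWGSize` and `InsertOutcome` are hypothetical names, and `std::map` stands in for whatever container `KernelReqdWorkGroupSizeMD` actually uses.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Assumption: a stand-in for native_cpu::ReqdWGSize_t.
using ReqdWGSize_t = std::array<std::uint32_t, 3>;

enum class InsertOutcome { Inserted, AlreadyPresent, Conflict };

// Hypothetical helper: inserts only when the prefix is new and reports
// whether an existing entry agrees with the incoming size.
inline InsertOutcome insertReqdWGSize(std::map<std::string, ReqdWGSize_t> &MD,
                                      const std::string &Prefix,
                                      const ReqdWGSize_t &Size) {
  // try_emplace leaves an existing entry untouched and says if it inserted.
  const auto [It, Inserted] = MD.try_emplace(Prefix, Size);
  if (Inserted)
    return InsertOutcome::Inserted;
  return It->second == Size ? InsertOutcome::AlreadyPresent
                            : InsertOutcome::Conflict;
}
```

A caller could map `Conflict` to an error code or a log message; which of the two is appropriate is left open here, as it was in the review comment.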

Adds support for the `reqd_work_group_size` attribute on Native CPU. The change in the driver is required in order to have the metadata node available in the runtime. UR PR: oneapi-src/unified-runtime#1477

PietroGhg requested a review from a team as a code owner March 27, 2024 10:36

PietroGhg mentioned this pull request Mar 27, 2024

[SYCL][NATIVECPU] Support reqd_work_group_size on Native CPU intel/llvm#13175

Merged

uwedolinsky reviewed Apr 5, 2024

kbenzie added the native-cpu Native CPU adapter specific issues label Apr 10, 2024

PietroGhg force-pushed the pietro/reqd_work_group_size branch from 5a05291 to b7ce8d2 Compare April 17, 2024 14:45

Support reqd_work_group_size on native cpu

5e5b7ac

PietroGhg force-pushed the pietro/reqd_work_group_size branch from b7ce8d2 to 5e5b7ac Compare April 17, 2024 15:27

uwedolinsky reviewed Apr 18, 2024

add check for return value

d6ae2a9

uwedolinsky reviewed Apr 18, 2024

Add hProgram in kernel copy constructor

fa084d0

uwedolinsky approved these changes Apr 23, 2024

PietroGhg added the ready to merge Added to PR's which are ready to merge label May 3, 2024

kbenzie merged commit 50452b7 into oneapi-src:main May 15, 2024
51 checks passed


