-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NATIVECPU] Support reqd_work_group_size on Native CPU #1477
[NATIVECPU] Support reqd_work_group_size on Native CPU #1477
Conversation
@@ -67,6 +78,10 @@ struct ur_kernel_handle_t_ : RefCounted { | |||
} | |||
} | |||
|
|||
bool hasReqdWGSize() { return HasReqdWGSize; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These 2 new methods hasReqdWGSize
and getReqdWGSize
could be made const
but that could probably be done in a subsequent PR to not hold up this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thank you
size_t MDElemsSize = MetadataElement.size - sizeof(std::uint64_t); | ||
|
||
// Expect between 1 and 3 32-bit integer values. | ||
UR_ASSERT(MDElemsSize >= sizeof(std::uint32_t) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to directly check for the three expected multiples of sizeof(std::uint32_t)
in case the meta data format has changed or got corrupted. This would also catch a possible underflow in MetadataElement.size - sizeof(std::uint64_t)
, or a case when MDElemsSize
is not a multiple of sizeof(std::uint32_t)
for whatever reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah you are right, I've done that
if (Tag == __SYCL_UR_PROGRAM_METADATA_TAG_REQD_WORK_GROUP_SIZE) { | ||
native_cpu::ReqdWGSize_t reqdWGSize; | ||
getReqdWGSize(mdNode, reqdWGSize); | ||
hProgram->KernelReqdWorkGroupSizeMD[Prefix] = reqdWGSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: The right-hand side could perhaps be std::move(reqdWGSize)
which may not make a difference now, but just in case native_cpu::ReqdWGSize_t
becomes a moveable type in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thank you
5a05291
to
b7ce8d2
Compare
b7ce8d2
to
5e5b7ac
Compare
auto [Prefix, Tag] = splitMetadataName(mdName); | ||
if (Tag == __SYCL_UR_PROGRAM_METADATA_TAG_REQD_WORK_GROUP_SIZE) { | ||
native_cpu::ReqdWGSize_t reqdWGSize; | ||
getReqdWGSize(mdNode, reqdWGSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking the value returned by getReqdWGSize
seems to be missing (unless UR_ASSERT is implemented as throwing an exception)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for spotting it, I've added a check
|
||
ur_kernel_handle_t_(const ur_kernel_handle_t_ &other) | ||
: _name(other._name), _subhandler(other._subhandler), _args(other._args), | ||
_localArgInfo(other._localArgInfo), _localMemPool(other._localMemPool), | ||
_localMemPoolSize(other._localMemPoolSize) { | ||
_localMemPoolSize(other._localMemPoolSize), | ||
HasReqdWGSize(other.HasReqdWGSize), ReqdWGSize(other.ReqdWGSize) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there also be a hProgram(other.hProgram)
in the mem initiliser list?
if (res != UR_RESULT_SUCCESS) { | ||
return res; | ||
} | ||
hProgram->KernelReqdWorkGroupSizeMD[Prefix] = std::move(reqdWGSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor (for a later PR): It might be worth diagnosing when an existing prefix gets overwritten with a different reqdWGSize
Adds support for the `reqd_work_group_size` attribute on Native CPU. The change in the driver is required in order to have the metadata node available in the runtime. UR PR: oneapi-src/unified-runtime#1477
Adds support to the
reqd_work_group_size
kernel attribute. The metadata handling mechanism is similar to what is done in the CUDA adapter.DPC++ PR: intel/llvm#13175