[SYCL][ESIMD] Make 64 bit data use lsc version of slm_gather implementation #12595

Closed
wants to merge 2 commits into from
2 changes: 1 addition & 1 deletion sycl/include/sycl/ext/intel/esimd/memory.hpp
@@ -3921,7 +3921,7 @@ slm_gather(simd<uint32_t, N / VS> byte_offsets, simd_mask<N / VS> mask,
static_assert(Alignment >= sizeof(T),
"slm_gather() requires at least element-size alignment");

- if constexpr (VS > 1 || (!detail::isPowerOf2(N, 32) &&
+ if constexpr (VS > 1 || (!(detail::isPowerOf2(N, 32) && sizeof(T) <= 4) &&
Contributor

@sarnex sarnex Feb 2, 2024

Maybe I'm misreading this, but today it looks like we will go into this if statement for sizeof(T) == 8 data and end up calling __esimd_lsc_load_merge_slm.

With this change we will go into the final else and call __esimd_gather_masked_scaled2, right?

If so, can you explain why we want to call this intrinsic instead of the other? The current one seems to be the LSC one which I would expect to be required for 64-bit data.

Contributor Author

Without the change, if you pass 64-bit data as T and MaskedGatherScatter is not available, you go to the final else, where gather_impl is called and eventually fails with an assertion (the old gather does not support 64-bit data out of the box). If MaskedGatherScatter is available, it is called no matter what data type is passed. With this change, if MaskedGatherScatter is not available and 64-bit data is passed, the lsc version is called, which does support 64-bit data.
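
To make the branch selection described above easier to follow, here is a minimal host-side sketch of the dispatch. It is not the code in memory.hpp: `Path`, `is_pow2_up_to`, `select_before`, and `select_after` are hypothetical names, the helper assumes that `detail::isPowerOf2(N, 32)` means "a power of two no greater than 32", and the order of the two fallback branches is taken from the explanation in this thread.

```cpp
#include <cstddef>

// Hypothetical model of the three-way dispatch; none of these names come
// from the ESIMD headers.
enum class Path { Lsc, MaskedGatherIR, LegacyGather };

// Assumed meaning of detail::isPowerOf2(N, Limit): power of two and <= Limit.
constexpr bool is_pow2_up_to(unsigned n, unsigned limit) {
  return n != 0 && (n & (n - 1)) == 0 && n <= limit;
}

// Condition before the change: the element size plays no role.
constexpr Path select_before(unsigned n, unsigned vs, std::size_t /*elem_size*/,
                             bool masked_ir_available) {
  if (vs > 1 || (!is_pow2_up_to(n, 32) && !masked_ir_available))
    return Path::Lsc;
  return masked_ir_available ? Path::MaskedGatherIR : Path::LegacyGather;
}

// Condition after the change: elements wider than 4 bytes also take the LSC
// path when the masked gather IR is not available.
constexpr Path select_after(unsigned n, unsigned vs, std::size_t elem_size,
                            bool masked_ir_available) {
  if (vs > 1 ||
      (!(is_pow2_up_to(n, 32) && elem_size <= 4) && !masked_ir_available))
    return Path::Lsc;
  return masked_ir_available ? Path::MaskedGatherIR : Path::LegacyGather;
}

// 8-byte elements, N = 32, VS = 1, no masked gather IR:
static_assert(select_before(32, 1, 8, false) == Path::LegacyGather,
              "old condition falls through to the legacy gather");
static_assert(select_after(32, 1, 8, false) == Path::Lsc,
              "new condition selects the LSC path");
// With the masked gather IR available, the element size does not matter.
static_assert(select_after(32, 1, 8, true) == Path::MaskedGatherIR,
              "masked gather IR is used when available");
```

In this model, 4-byte elements keep their previous behavior: `select_after(32, 1, 4, false)` still yields `Path::LegacyGather`.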

Contributor

@v-klochkov v-klochkov Feb 2, 2024

The change here looks good. Please add a test case for 64-bit types to slm_gather.cpp (the one that is compiled with new LLVM IR available).

Contributor

Ah I see, thanks

Contributor Author

I can't add a new test that uses 64-bit types, at least for now, due to test issues: I can't use slm_block_store to initialize the SLM memory due to driver issues, and slm_scatter doesn't support 64-bit data. Once one of these problems is solved, I can add 64-bit data tests.

Contributor

A bit-cast to 32-bit elements plus a 32-bit slm_scatter can be used to initialize SLM.
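
The idea can be illustrated with a host-side model. This is only a sketch and not ESIMD code: the byte vector `slm` stands in for shared local memory, and `scatter_u32`, `init_slm_with_doubles`, and `load_double` are hypothetical helpers. It shows that writing each 64-bit value as two consecutive 32-bit words at byte offsets `base + 4*i` leaves the same bytes in memory as a direct 64-bit store would.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a 32-bit scatter: writes values[i] at byte_offsets[i].
void scatter_u32(std::vector<unsigned char> &slm,
                 const std::vector<std::uint32_t> &byte_offsets,
                 const std::vector<std::uint32_t> &values) {
  for (std::size_t i = 0; i < values.size(); ++i)
    std::memcpy(slm.data() + byte_offsets[i], &values[i], sizeof(std::uint32_t));
}

// Reinterpret the doubles as pairs of 32-bit words and scatter those words.
void init_slm_with_doubles(std::vector<unsigned char> &slm,
                           const std::vector<double> &data,
                           std::uint32_t base_offset) {
  assert(base_offset + data.size() * sizeof(double) <= slm.size());
  std::vector<std::uint32_t> words(data.size() * 2);
  std::memcpy(words.data(), data.data(), data.size() * sizeof(double));

  std::vector<std::uint32_t> offsets(words.size());
  for (std::size_t i = 0; i < words.size(); ++i)
    offsets[i] = base_offset + static_cast<std::uint32_t>(4 * i);

  scatter_u32(slm, offsets, words);
}

// Read one 64-bit value back, as a 64-bit gather lane would.
double load_double(const std::vector<unsigned char> &slm,
                   std::uint32_t byte_offset) {
  double value;
  std::memcpy(&value, slm.data() + byte_offset, sizeof(double));
  return value;
}
```

Because the words are copied in memory order in both directions, the round trip does not depend on the host's byte order.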

Contributor Author

So do you think it is better to create a special code path in the test for 64-bit data, rather than first add 64-bit support to gather/scatter and then simply add the tests?

Contributor

OK, the E2E test can wait for slm_scatter to support 8-byte elements, but please create compile-time-only test case(s) in the memory_properties.cpp test.

Contributor Author

Added a test, although it is pretty much useless: memory_properties.cpp is compiled with -D__ESIMD_GATHER_SCATTER_LLVM_IR, which means the new LLVM IR is used rather than the old implementation, no matter what data type is used.

!detail::isMaskedGatherScatterLLVMAvailable())) {
simd<T, N> PassThru; // Intentionally undefined
return detail::slm_gather_impl<T, VS, detail::lsc_data_size::default_size>(
5 changes: 5 additions & 0 deletions sycl/test/esimd/memory_properties.cpp
@@ -1303,4 +1303,9 @@ test_slm_gather_scatter(int byte_offset32) {
props_align4);
slm = slm_gather<float, 32, 2>(ioffset_n16_view, mask_n16, pass_thru_view,
props_align4);

// Special case to verify calls to slm_gather with 64 bit data type are
// transformed to lsc calls
// CHECK-COUNT-1: call <32 x double> @llvm.masked.gather.v32f64.v32p3(<32 x ptr addrspace(3)> {{[^)]+}}, i32 8, <32 x i1> {{[^)]+}}, <32 x double> {{[^)]+}})
auto slm_double = slm_gather<double>(ioffset_n32, mask_n32);
}