Maybe I'm misreading this, but today it looks like we go into this if statement for `sizeof(T) == 8` data and end up calling `__esimd_lsc_load_merge_slm`. With this change we will go into the final else and call `__esimd_gather_masked_scaled2`, right? Is that correct? If so, can you explain why we want to call this intrinsic instead of the other? The current one seems to be the LSC one, which I would expect to be required for 64-bit data.
Without the change, if you pass 64-bit data as T and MaskedGatherScatter is not available, you go to the final else, where gather_impl is called and eventually fails with an assertion (the old gather does not support 64-bit data out of the box). If MaskedGatherScatter is available, it is called no matter what data type is passed. With this change, if MaskedGatherScatter is not available and 64-bit data is passed, the LSC version is called, which does support 64-bit data.
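To make the branch order described above easier to follow, here is a minimal standalone sketch of the selection logic as I understand it from this comment. It is not the code from the PR: `GatherPath`, `select_slm_gather_path`, and the `HasMaskedGatherScatter` template flag are hypothetical names used only for illustration, and no ESIMD headers are involved.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical labels for the three code paths discussed in this thread.
enum class GatherPath {
  MaskedGatherLLVMIR, // new LLVM IR path (MaskedGatherScatter available)
  LSC,                // LSC-based path, which supports 64-bit data
  Legacy              // old gather_impl path, which asserts on 64-bit data
};

// Assumed branch order after the change:
//   1. if MaskedGatherScatter is available, use it for every element type;
//   2. otherwise, 64-bit element types go to the LSC path;
//   3. everything else falls through to the legacy implementation.
template <typename T, bool HasMaskedGatherScatter>
constexpr GatherPath select_slm_gather_path() {
  if constexpr (HasMaskedGatherScatter) {
    return GatherPath::MaskedGatherLLVMIR;
  } else if constexpr (sizeof(T) == 8) {
    return GatherPath::LSC;
  } else {
    return GatherPath::Legacy;
  }
}
```

Under these assumptions, `select_slm_gather_path<double, false>()` yields `GatherPath::LSC`, while any type with the flag set to `true` yields `GatherPath::MaskedGatherLLVMIR`.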
The change here looks good. Please add a test case for 64-bit types to slm_gather.cpp (the one that is compiled with new LLVM IR available).
Ah I see, thanks
I can't add a new test that uses 64-bit types, at least for now, due to test issues: I can't use slm_block_store to initialize the SLM memory because of driver issues, and slm_scatter doesn't support 64-bit data. Once one of these problems is solved, I can add 64-bit data tests.
A bit-cast to 32-bit elements plus a 32-bit slm_scatter can be used to initialize SLM.
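As a rough host-side illustration of the bit-cast idea (not ESIMD code), the sketch below reinterprets 64-bit elements as twice as many 32-bit words and back. The helpers `split_to_u32` and `join_from_u32` are hypothetical; in an actual test the 32-bit words would presumably be written to SLM with the 32-bit slm_scatter, which is not shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper: view each 64-bit element as two 32-bit words.
// The byte layout is preserved exactly, so the word order follows the
// host's endianness.
template <typename T>
std::vector<std::uint32_t> split_to_u32(const std::vector<T> &src) {
  static_assert(sizeof(T) == 8, "expects a 64-bit element type");
  static_assert(std::is_trivially_copyable<T>::value,
                "element type must be trivially copyable");
  std::vector<std::uint32_t> words(src.size() * 2);
  if (!src.empty())
    std::memcpy(words.data(), src.data(), src.size() * sizeof(T));
  return words;
}

// Hypothetical inverse: rebuild the 64-bit elements from pairs of words.
template <typename T>
std::vector<T> join_from_u32(const std::vector<std::uint32_t> &words) {
  static_assert(sizeof(T) == 8, "expects a 64-bit element type");
  static_assert(std::is_trivially_copyable<T>::value,
                "element type must be trivially copyable");
  assert(words.size() % 2 == 0 && "need two 32-bit words per element");
  std::vector<T> out(words.size() / 2);
  if (!out.empty())
    std::memcpy(out.data(), words.data(), out.size() * sizeof(T));
  return out;
}
```

Because the bytes are copied unchanged, a 64-bit gather over memory initialized this way should read back the original values, assuming the 32-bit words land at the matching byte offsets.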
So do you think it is better to create a special code path in the test for 64-bit data, rather than add 64-bit support to gather/scatter first and then simply add the tests?
OK, the E2E test can wait for slm_scatter to support 8-byte types, but please create compile-time-only test case(s) in the memory_properties.cpp test.
Added a test, although it is pretty much useless: memory_properties.cpp is compiled with -D__ESIMD_GATHER_SCATTER_LLVM_IR, which means the new LLVM IR is used rather than the old implementation, no matter what data type is used.