Fix `GenISA_WaveShuffleIndex` intrinsic if src and dst are the same variables

To handle the `GenISA_WaveShuffleIndex` intrinsic with a non-uniform `simdChannel`, IGC needs to generate two SIMD16 indirectly addressed mov instructions, because the address register has only 16 subregisters. If that happens when the `GenISA_WaveShuffleIndex` intrinsic uses the same variable as both source and destination, the first SIMD16 instruction may overwrite values that the second SIMD16 instruction reads as its source.

Here is an example of OpenCL C code that reproduces the issue:

```c
__attribute__((intel_reqd_sub_group_size(32)))
kernel void k(global int* in, global int* ids, uint num_iterations, global int* out)
{
    size_t gid = get_global_id(0);
    int x = in[gid];
    uint which_sub_group_local_id = ids[gid];
    for (uint i = 0; i < num_iterations; ++i)
    {
        /* x is both the source and the destination of the shuffle */
        x = intel_sub_group_shuffle(x, which_sub_group_local_id);
    }
    out[gid] = x;
}
```

This change fixes the issue by writing the result for the first 16 channels into a temporary variable before executing the shuffle index for the last 16 channels.
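The hazard and the fix can be illustrated with a small host-side model in plain C. This is only a sketch and not IGC code: the names `shuffle_half`, `shuffle_in_place`, and `shuffle_with_temp` are hypothetical, and each SIMD16 instruction is assumed to read all of its sources before writing any destination lane.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIMD_WIDTH 32
#define HALF_WIDTH 16

/* Hypothetical model of one SIMD16 indirect mov: gather 16 source lanes
 * selected by idx, then write them to dst starting at first_lane.
 * All reads complete before any write, as assumed for a single instruction. */
static void shuffle_half(int32_t *dst, const int32_t *src,
                         const uint32_t *idx, int first_lane)
{
    int32_t gathered[HALF_WIDTH];
    for (int i = 0; i < HALF_WIDTH; ++i) {
        assert(idx[first_lane + i] < SIMD_WIDTH);
        gathered[i] = src[idx[first_lane + i]];
    }
    memcpy(&dst[first_lane], gathered, sizeof gathered);
}

/* Problematic pattern: x is both source and destination, so the second
 * half may read lanes 0..15 after the first half has overwritten them. */
void shuffle_in_place(int32_t *x, const uint32_t *idx)
{
    shuffle_half(x, x, idx, 0);
    shuffle_half(x, x, idx, HALF_WIDTH);
}

/* Pattern described by the fix: keep the first half's result in a
 * temporary until the second half has read the original values. */
void shuffle_with_temp(int32_t *x, const uint32_t *idx)
{
    int32_t tmp[HALF_WIDTH];
    shuffle_half(tmp, x, idx, 0);          /* lanes 0..15 -> temporary   */
    shuffle_half(x, x, idx, HALF_WIDTH);   /* lanes 16..31, sources intact */
    memcpy(x, tmp, sizeof tmp);            /* commit lanes 0..15         */
}
```

With a reversing index (`idx[i] = 31 - i`), lane 16 of `shuffle_in_place` reads lane 15 after it has already been replaced, whereas `shuffle_with_temp` produces the fully reversed vector.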