Commit
[SYCL Spec][Joint Matrix] Add a new overload for joint_matrix_apply to be able to return result into a different matrix (intel#13153)

Currently, CUDA code that uses this pattern:

    for (int i = 0; i < c_frag.num_elements; i++) {
      c_frag.x[i] = alpha * acc_frag.x[i] + beta * c_frag.x[i];
    }

cannot be migrated to SYCL joint matrix. This added overload addresses this limitation.
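The element-wise pattern above can be modeled on the host as a two-matrix apply. This is only a rough sketch of the semantics, not the SYCL API: `Fragment`, `apply_pairwise`, and `scale_and_accumulate` are hypothetical names invented for illustration, and the actual `joint_matrix_apply` signature should be taken from the extension specification.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a matrix fragment: a fixed-size set of elements.
// It is NOT the SYCL joint_matrix type.
template <typename T, std::size_t N>
struct Fragment {
    std::array<T, N> x{};
    static constexpr std::size_t num_elements = N;
};

// Hypothetical two-matrix apply: invokes func(input_element, output_element)
// for every element pair, so the result can land in a different matrix
// than the one being read.
template <typename T, std::size_t N, typename F>
void apply_pairwise(const Fragment<T, N>& in, Fragment<T, N>& out, F&& func) {
    for (std::size_t i = 0; i < N; ++i) {
        func(in.x[i], out.x[i]);
    }
}

// The CUDA loop from the commit message, c = alpha * acc + beta * c,
// expressed through the two-matrix apply.
template <typename T, std::size_t N>
void scale_and_accumulate(const Fragment<T, N>& acc_frag, Fragment<T, N>& c_frag,
                          T alpha, T beta) {
    apply_pairwise(acc_frag, c_frag, [=](const T& acc, T& c) {
        c = alpha * acc + beta * c;
    });
}
```

The point of the sketch is that the callable receives one read-only element and one writable element per step, which a single-matrix apply cannot express.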