How to do $$matmul(A, B^T)$$?
I was trying to modify this sycl/pvc example, which I believe does $$matmul(A, B)$$, to do $$matmul(A, B^T)$$ (i.e., my B input is transposed), but the verification is failing.
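To pin down the convention, this is roughly the result I am checking against (a hypothetical naive host reference, not the example's actual verification code; dtypes simplified to float, alpha/beta omitted, and B assumed to be stored as an N x K row-major array so that indexing `B(n, k)` reads `B^T(k, n)`):

```cpp
// Illustrative host reference for D = A * B^T.
// A is M x K row-major, B is stored N x K row-major (so B^T is K x N),
// D is M x N row-major.
void reference_gemm_ABt(int M, int N, int K,
                        float const* A, float const* B, float* D) {
  for (int m = 0; m < M; ++m) {
    for (int n = 0; n < N; ++n) {
      float acc = 0.0f;
      for (int k = 0; k < K; ++k) {
        acc += A[m * K + k] * B[n * K + k];  // B(n, k) == B^T(k, n)
      }
      D[m * N + n] = acc;
    }
  }
}
```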
I made these changes (a consolidated sketch of the attempt follows this list):

- changed `using LayoutB = cutlass::layout::RowMajor;` to `using LayoutB = cutlass::layout::ColumnMajor;`, and
- changed `cutlass::TensorRef ref_B(block_B.get(), LayoutB::packed({K, N}));` to `cutlass::TensorRef ref_B(block_B.get(), LayoutB::packed({N, K}));`.

This line could also be relevant, so in addition to the two changes above, I also tried with and without changing `stride_B = cutlass::make_cute_packed_stride(StrideB{}, cute::make_shape(N, K, L));` to `stride_B = cutlass::make_cute_packed_stride(StrideB{}, cute::make_shape(K, N, L));`.
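For reference, the combined attempt looks roughly like this (only the lines I touched; the surrounding code is as in the example):

```cpp
// B layout: switched from RowMajor to ColumnMajor so the kernel treats the
// buffer as transposed (K-major) storage of the same logical K x N matrix B.
using LayoutB = cutlass::layout::ColumnMajor;  // was: cutlass::layout::RowMajor

// Host-side verification reference: packed extents swapped to match the
// transposed storage.
cutlass::TensorRef ref_B(block_B.get(), LayoutB::packed({N, K}));  // was: packed({K, N})

// Tried both with and without this change:
stride_B = cutlass::make_cute_packed_stride(StrideB{}, cute::make_shape(K, N, L));  // was: make_shape(N, K, L)
```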
Is it possible to modify that example code to do $$matmul(A, B^T)$$? If yes, could you please help me modify the code correctly?
I also suspect it might not be possible, since the MMA atom `MMA_Atom<XE_8x16x16_F32BF16BF16F32_TT>` might not support it, but I'm not sure.
Complementary questions:
Does PVC support only BF16 inputs when using the Xe cores, or are other input types also supported? If other input types are supported, what is your plan for creating MMA_Atoms for them?
Thanks!
> This line could also be relevant, so in addition to the two changes above, I also tried with and without changing `stride_B = cutlass::make_cute_packed_stride(StrideB{}, cute::make_shape(N, K, L));` to `stride_B = cutlass::make_cute_packed_stride(StrideB{}, cute::make_shape(K, N, L));`.
You don't need this change: by converting LayoutB from RowMajor to ColumnMajor, `TagToStrideB` produces the matching stride type, and `make_cute_packed_stride` resolves to the overload that picks the correct extent of the `(N, K, L)` shape for the stride. So this change is not needed.
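As a rough illustration (a minimal host-only sketch, assuming a CUTLASS 3.x build with `cutlass/util/packed_stride.hpp`, and assuming `TagToStrideB` maps a RowMajor/ColumnMajor B tag to `Stride<_1, int64_t, int64_t>` / `Stride<int64_t, _1, int64_t>` respectively):

```cpp
// Sketch only: the stride *type* selected by the B layout tag decides which
// make_cute_packed_stride overload runs, and that overload reads the matching
// extent out of the same (N, K, L) shape tuple.
#include <cstdio>
#include <cute/tensor.hpp>
#include <cutlass/util/packed_stride.hpp>

int main() {
  int N = 8, K = 4, L = 1;

  // Stride type assumed for LayoutB = RowMajor (B stored K x N row-major,
  // so the N mode is contiguous).
  using StrideB_RowMajor = cute::Stride<cute::Int<1>, int64_t, int64_t>;
  auto sb_rm = cutlass::make_cute_packed_stride(StrideB_RowMajor{}, cute::make_shape(N, K, L));

  // Stride type assumed for LayoutB = ColumnMajor (B stored K x N column-major,
  // i.e. the transposed storage from the question; the K mode is contiguous).
  using StrideB_ColMajor = cute::Stride<int64_t, cute::Int<1>, int64_t>;
  auto sb_cm = cutlass::make_cute_packed_stride(StrideB_ColMajor{}, cute::make_shape(N, K, L));

  cute::print(sb_rm); std::printf("\n");  // dynamic stride filled in from the shape
  cute::print(sb_cm); std::printf("\n");
  return 0;
}
```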
> Is it possible to modify that example code to do $$matmul(A, B^T)$$? If yes, could you please help me modify the code correctly?
> I also suspect it might not be possible, since the MMA atom `MMA_Atom<XE_8x16x16_F32BF16BF16F32_TT>` might not support it, but I'm not sure.
At the moment, the functionality to load a transposed B via `xe_copy` is not integrated into the pipeline. We are in the process of adding it and will let you know once it is done.
> Complementary questions: Does PVC support only BF16 inputs when using the Xe cores, or are other input types also supported? If other input types are supported, what is your plan for creating MMA_Atoms for them? Thanks!
Currently it is BF16 only; adding support for other input types is in progress.