Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
added XLA custom op defs for TE GEMM
* Added XLA FFI custom op for TE GEMM
* finished GEMM custom op primitive and serial unit test
* fixed GEMM custom op batcher
* fixed output dtype error and contracting dimensions options
* AG overlap working but executes scatter to match outer LHS dim
* both all-gather and all-reduce are now working
* code style
* changed kwargs in abstract to be explicit
* added fwd/bwd implementation for non-fp8 gemm

Signed-off-by: Alp Dener <adener@nvidia.com>
Showing 12 changed files with 1,370 additions and 6 deletions.