Pull requests: NVIDIA/TransformerEngine

Pull requests list

Update list of CI users (labels: testing)
#1340 opened Nov 15, 2024 by timmoon10
8 of 14 tasks
[Common] Moved framework agnostic THD kernels to common.
#1339 opened Nov 15, 2024 by mgoldfarb-nvidia
8 of 13 tasks
Debug nightly docs (labels: documentation, testing)
#1338 opened Nov 15, 2024 by timmoon10
4 of 13 tasks
[C/JAX] Comm+GEMM Overlap API for TE/JAX (labels: enhancement, jax)
#1337 opened Nov 15, 2024 by denera Draft
3 of 13 tasks
[PyTorch] Store module extra state in tensor (labels: bug)
#1335 opened Nov 15, 2024 by timmoon10
8 of 13 tasks
[JAX] WIP Added L0 Distributed Tests
#1331 opened Nov 14, 2024 by phu0ngng Draft
13 tasks
[PyTorch] Integration test for Megatron-LM (labels: 1.13.0, bug)
#1329 opened Nov 13, 2024 by timmoon10
9 of 14 tasks
[PyTorch] Fix GQA error message (labels: 1.13.0)
#1328 opened Nov 12, 2024 by cyanguwa
8 of 13 tasks
[COMMON/JAX] Support sliding window on THD format
#1327 opened Nov 11, 2024 by zlsh80826
6 of 13 tasks
Build with uv instead of just pip
#1324 opened Nov 8, 2024 by jennifgcrl
5 of 13 tasks
TP communication overlap: enable the overlap between GEMM chunk at Ho…
#1311 opened Nov 4, 2024 by erhoo82
1 of 13 tasks
[PyTorch] Add heuristics for initializing FP8 params (labels: enhancement)
#1300 opened Oct 30, 2024 by timmoon10
8 of 13 tasks
Offloading example
#1299 opened Oct 29, 2024 by sanandaraj5597
[PyTorch] Fix get_swa_mask() for padding masks
#1281 opened Oct 21, 2024 by cyanguwa
6 of 13 tasks
[PyTorch] Fix autocast deprecation warnings
#1277 opened Oct 21, 2024 by yaox12
13 tasks
attention_mask fill with -inf for UnfusedDotProductAttention
#1268 opened Oct 18, 2024 by Agoniii
1 of 13 tasks
Draft: reduce cudagraph mem via preallocations
#1253 opened Oct 15, 2024 by JimmyZhang12
13 tasks
fused out correction in CP
#1248 opened Oct 14, 2024 by xiaoyao0115
12 tasks
Save CUDA Graph memory by reusing input and output tensors
#1234 opened Oct 9, 2024 by buptzyb
5 of 13 tasks
Support CUDA Graph for MoE models
#1233 opened Oct 9, 2024 by buptzyb
6 of 13 tasks