Actions: NVIDIA/TransformerEngine

Deploy nightly docs

567 workflow run results

Convert non-kernel cuda files to cpp (#1322)
Deploy nightly docs #720: Commit 68adf45 pushed by ksivaman
November 11, 2024 14:37 1m 9s main
[JAX] Support Ring Attention (Context Parallelism) (#1059)
Deploy nightly docs #719: Commit bfddb48 pushed by mgoldfarb-nvidia
November 11, 2024 14:25 1m 9s main
[C] Separating cudnn common utils from fused_attn (#1314)
Deploy nightly docs #718: Commit 2643ba1 pushed by phu0ngng
November 8, 2024 21:03 1m 22s main
[JAX] Added prepare phase for the FusedAttnForwardFFI (#1313)
Deploy nightly docs #717: Commit e5ffaa7 pushed by phu0ngng
November 7, 2024 03:54 1m 9s main
[TE/JAX] XLA FFI calls for three cast transpose functions (#1310)
Deploy nightly docs #716: Commit 4d65073 pushed by huanghua1994
November 6, 2024 23:04 1m 20s main
[JAX] Add back the xla deterministic flag (#1301)
Deploy nightly docs #715: Commit d4aa299 pushed by phu0ngng
November 6, 2024 22:04 1m 19s main
Update list of CI users (#1316)
Deploy nightly docs #714: Commit 8f45c58 pushed by timmoon10
November 6, 2024 21:53 1m 21s main
[PyTorch] Userbuffers support in operation-based API (#1142)
Deploy nightly docs #713: Commit 095b27d pushed by timmoon10
November 6, 2024 01:19 1m 7s main
[PyTorch] Normalization ops (#1033)
Deploy nightly docs #712: Commit 77c37d4 pushed by timmoon10
November 5, 2024 21:16 1m 55s main
[PyTorch] Debug checkpointing with operation-based API (#1063)
Deploy nightly docs #711: Commit f20d3dd pushed by timmoon10
November 5, 2024 17:28 1m 23s main
[PyTorch] Debug CUDA graph support with operation-based API (#1117)
Deploy nightly docs #710: Commit 50b22da pushed by timmoon10
November 5, 2024 17:28 1m 25s main
[TE/JAX] XLA FFI calls for layer norm and RMS norm (#1290)
Deploy nightly docs #709: Commit df94903 pushed by huanghua1994
November 4, 2024 16:21 1m 11s main
[JAX] Expose context parallel params to jax DPA api (#1292)
Deploy nightly docs #708: Commit d725686 pushed by mgoldfarb-nvidia
November 4, 2024 15:41 1m 11s main
[PyTorch] Make FP8 MHA work with RoPE when CP is on (#1297)
Deploy nightly docs #707: Commit c42beef pushed by yaox12
November 4, 2024 09:43 1m 23s main
[PyTorch] Missing intra-domain ranks list when initializing Userbuffe…
Deploy nightly docs #706: Commit a6a9141 pushed by denera
November 2, 2024 01:20 1m 28s main
[JAX] Fix for Disable FusedAttn with FFI by default (#1304)
Deploy nightly docs #705: Commit 4b8ffef pushed by phu0ngng
November 1, 2024 19:43 1m 6s main
Support using fp16 master weights and fp16/fp8 optimizer states in Fu…
Deploy nightly docs #704: Commit 05c0fb0 pushed by timmoon10
November 1, 2024 17:47 1m 16s main
[TE/JAX] Disable FusedAttn with FFI by default (#1298)
Deploy nightly docs #703: Commit 23caab3 pushed by phu0ngng
October 31, 2024 14:33 1m 23s main
[TE/JAX] Custom call with FFI - lowering all attributes with bind all…
Deploy nightly docs #702: Commit 9dddb36 pushed by phu0ngng
October 31, 2024 14:32 1m 21s main
[JAX] Consolidate FFI and old descriptor implementation for fused att…
Deploy nightly docs #701: Commit c036765 pushed by phu0ngng
October 30, 2024 01:05 1m 10s main
Add missed arguments of apply_rotary_pos_emb in MHA (#1296)
Deploy nightly docs #700: Commit ed1e85c pushed by xrennvidia
October 30, 2024 00:21 1m 23s main
Add check for GPU availability in attention (#1287)
Deploy nightly docs #699: Commit 8bdb54f pushed by cyanguwa
October 29, 2024 20:30 1m 9s main
[PyTorch] Skip t3hd/th3d for MQA/GQA tests (#1293)
Deploy nightly docs #698: Commit d710c24 pushed by cyanguwa
October 29, 2024 20:26 1m 17s main
[C/PyTorch] Userbuffers and comm+GEMM overlap algorithms refactored a…
Deploy nightly docs #697: Commit 933294d pushed by denera
October 29, 2024 15:06 1m 12s main
[PyTorch] Remove fast param getter from modules (#1291)
Deploy nightly docs #696: Commit 35bbe74 pushed by timmoon10
October 28, 2024 23:47 1m 5s main