feat: Update NCCL, CUDA, cuDNN, and HPC-X #31
@Eta0 Build complete, success: https://github.com/coreweave/nccl-tests/actions/runs/8161762301
@Eta0 Build complete, success: https://github.com/coreweave/nccl-tests/actions/runs/8161762300
Many Updates!
This change updates the following components:
NCCL
NCCL is updated to version 2.20.3-1 for supported CUDA & OS versions, which are:
CUDA
The CUDA 12.3 releases have been bumped to the 12.3.2 patch release. The CUDA 12.3 × Ubuntu 20.04 build, which had previously been commented out due to issues that should now be fixed, has been enabled.
cuDNN
The CUDA 12.3 releases now use the newest version of cuDNN: cuDNN 9. Among the nvidia/cuda base images, cuDNN 9 is available only for CUDA 12.3, and it is the only cuDNN version available for CUDA 12.3.
As noted in the cuDNN 9 release notes:
So there may be downstream compatibility issues to work out (for example, with the PyTorch builds in coreweave/ml-containers), but this should not be more disruptive than these images' previous state of not including a cuDNN distribution at all.
HPC-X
The HPC-X distribution is updated from 2.16 & 2.17 to 2.18 on all CUDA 12 releases.