rccl-2.7.9 for ROCm 3.10.0
New Features
- Added experimental support for clique-based kernels (opt in with RCCL_ENABLE_CLIQUE=1)
- Clique-based kernels may offer better performance for smaller input sizes
- Clique-based kernels are currently enabled only for AllReduce, and only for inputs below a byte limit (controlled via RCCL_CLIQUE_ALL_REDUCE_BYTE_LIMIT)
- Performance improvements for Rome-based systems
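
The opt-in described above can be applied from a launcher script. The following is a minimal sketch rather than an official RCCL utility: the helper name clique_env and the byte-limit value are hypothetical, and the notes do not state the default limit. It builds one environment that can be passed to every rank.

```python
import os


def clique_env(base=None, byte_limit=None):
    """Return a copy of an environment with clique-based kernels opted in.

    base: mapping to copy (defaults to the current process environment).
    byte_limit: optional AllReduce byte limit; the value to use is an
    assumption, since the release notes do not state a default.
    """
    env = dict(os.environ if base is None else base)
    # Opt-in switch named in the release notes.
    env["RCCL_ENABLE_CLIQUE"] = "1"
    if byte_limit is not None:
        # Only set the limit when the caller asks for one.
        env["RCCL_CLIQUE_ALL_REDUCE_BYTE_LIMIT"] = str(int(byte_limit))
    return env


# Hypothetical usage, where ./my_rccl_app is a placeholder binary:
#   import subprocess
#   subprocess.run(["./my_rccl_app"], env=clique_env(byte_limit=1 << 20))
```

Building the environment once and reusing it for every rank keeps the settings identical across processes.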
Known Issues
- Clique-based kernels are experimental and have not been fully tested on all topologies. By default, they are disabled when the topology is not supported (override with RCCL_FORCE_ENABLE_CLIQUE)
- Clique-based kernels may hang if environment variables are set differently across ranks
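
To guard against the hang noted above, a launcher could compare each rank's RCCL-related variables before starting communication. This is a sketch under assumptions: the function name find_env_mismatches is hypothetical, and how the per-rank environments are collected (for example, via the job launcher) is left to the caller.

```python
def find_env_mismatches(per_rank_envs, prefix="RCCL_"):
    """Return {variable: [value on rank 0, value on rank 1, ...]} for every
    variable starting with `prefix` whose value is not identical on all ranks.

    per_rank_envs: list of dicts, one environment snapshot per rank.
    A variable that is unset on a rank is reported as None for that rank.
    """
    # Collect every matching variable name seen on any rank.
    names = sorted(
        {key for env in per_rank_envs for key in env if key.startswith(prefix)}
    )
    mismatches = {}
    for name in names:
        values = [env.get(name) for env in per_rank_envs]
        if len(set(values)) > 1:
            mismatches[name] = values
    return mismatches
```

An empty result means the snapshots agree on all variables with the given prefix. A non-empty result lists the variables to fix before launching.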