Release Notes
This release brings significant stability and performance improvements, enhanced CUDA support, and new HIP/ROCm library ports and integrations for HipBLAS, HipFFT, and HipRAND/RocRAND. It also includes initial testing of HIP/CUDA applications on RISC-V.
Tested Platforms
- Intel, AMD CPUs via Intel Compute Runtime
- Intel GPUs via Neo i915 driver
- ARM Mali GPUs (Quartz64 SBC)
- RISC-V (StarFive VisionFive 2 SBC, Debian, experimental)
- AMD GPUs via rusticl (exploratory work)
Notable Changes
- Introduced `cucc`, a drop-in replacement for `nvcc`:
  - Added `cucc`, enabling direct compilation of CUDA sources.
  - Added `nvcc` softlink, allowing you to compile CUDA sources without making any changes.
  - Adjusted CUDA headers to improve compatibility with CUDA sources, including a dummy `cublas_v2.h` header to prevent conflicts with system headers.
- Enhanced OpenCL backend:
  - Support for `cl_ext_buffer_device_address` extension:
    - Added support for devices featuring the `cl_ext_buffer_device_address` extension, improving memory management capabilities.
  - Optimized queue profiling:
    - The OpenCL backend now uses non-profiling queues by default and switches to profiling queues only when needed, resulting in performance improvements.
  - Various other performance optimizations
- Fixed Level Zero backend issues:
  - Addressed out-of-memory (OOM) errors:
    - Fixed memory leaks and improved resource management to prevent OOM errors during heavy workloads.
  - Improved thread safety:
    - Implemented mutexes and synchronization mechanisms to enhance thread safety within the Level Zero backend.
- Rebased to HIP 6.x and updated hip-tests:
  - Updated the codebase to be compatible with HIP 6.x.
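
The queue profiling change noted above (non-profiling queues by default, profiling queues only when needed) can be illustrated with a small sketch. This is not the backend's actual code: `MockQueue` and `LazyProfilingQueue` are hypothetical names, and the queue is a plain in-memory stand-in rather than a real `cl_command_queue`. It also assumes that once timing has been requested the wrapper stays on the profiling queue, which may differ from what the release implements.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an OpenCL command queue. A real backend would
// hold a cl_command_queue created with or without CL_QUEUE_PROFILING_ENABLE.
struct MockQueue {
  bool profiling;                     // true if created with profiling enabled
  std::vector<std::string> commands;  // commands submitted to this queue
};

// Hypothetical wrapper: starts on a non-profiling queue and creates a
// profiling queue only the first time a timed submission is requested.
class LazyProfilingQueue {
public:
  // Submit a command. `needsTiming` marks work whose timestamps are needed,
  // for example work bracketed by events used for elapsed-time queries.
  void submit(const std::string &cmd, bool needsTiming) {
    if (needsTiming && !active_.profiling) {
      switchToProfiling();
    }
    active_.commands.push_back(cmd);
    ++totalSubmitted_;
  }

  bool profilingEnabled() const { return active_.profiling; }
  std::size_t queuesCreated() const { return queuesCreated_; }
  std::size_t totalSubmitted() const { return totalSubmitted_; }

private:
  void switchToProfiling() {
    // Stand-in for draining outstanding work on the old queue so that
    // command ordering is preserved across the switch.
    active_.commands.clear();
    // Replace the active queue with a profiling-enabled one.
    active_ = MockQueue{true, {}};
    ++queuesCreated_;
    assert(active_.profiling);
  }

  MockQueue active_{false, {}};     // non-profiling queue by default
  std::size_t queuesCreated_ = 1;   // counts the initial queue
  std::size_t totalSubmitted_ = 0;  // commands submitted across all queues
};
```

In this sketch, workloads that never ask for timing never leave the non-profiling queue. The cost of creating a second queue is incurred once, on the first timed submission.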
Library Support Changes
- Expanded HIP library support:
  - HipBLAS integration:
    - Introduced the `CHIP_BUILD_HIPBLAS` option to enable building HipBLAS.
  - HipFFT integration:
    - Introduced the `CHIP_BUILD_HIPFFT` option to enable building HipFFT.
  - RocRAND port: