v0.2

@raffenet raffenet released this 25 Jul 17:49
· 89 commits to main since this release
885970b

Changes in 0.2

  • Add support for reduction operations (e.g., sum, prod, min, max)

  • Add support for AMD GPUs via HIP backend

  • Add "nogpu" info hint to avoid unnecessary pointer attribute queries

  • Add stream-based pack/unpack APIs

  • Add blocking pack/unpack APIs

  • Add support for NVIDIA HPC SDK compilers

  • Improve compile time for Level Zero kernels

  • Extend tests to support subdevices (tiles) of Intel GPUs

  • Many bug fixes and code cleanups