Evaluate USE_ACCEL=opencl #683

hfp · 2023-06-29T07:41:15Z

Please evaluate USE_ACCEL=opencl and, ideally, share some feedback. Tuned parameters exist for the following GPUs: P100, V100, A100-40GB, A100-80GB, H100, and PVC. For practically all GPU vendors, OpenCL is simply part of the "native" or preferred GPU runtime installation; e.g., installing CUDA installs Nvidia's OpenCL runtime as well. The OpenCL backend in DBCSR does not bail out for kernels without tuned parameters, and it carries tuned defaults for common GPUs, i.e., tuned parameters are not strictly necessary.

Standalone DBCSR supports CUDA and OpenCL equally, except that the OpenCL backend does not fall back to larger GPU-supported GEMMs. In CP2K, OpenCL can be used as far as DBCSR supports it. However, CP2K can use DBCSR with OpenCL and rely on CUDA for everything else (tested on Nvidia platforms). "Everything else" means GRID, DBM, DBT, FFT, and CUDA-enabled dependencies like ELPA or COSMA. For these dependencies, SYCL or OpenMP support for GPUs may be available as well.

On Nvidia-based platforms (not HIP), some HPC deployments set the GPUs to "exclusive mode" (see nvidia-smi), which means that OpenCL-enabled applications cannot be used with multiple ranks per GPU. This restriction can be lifted easily, but it requires a setup that either changes the compute mode or gives users an option to toggle it.
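To check whether a node is affected before launching multiple ranks per GPU, the compute mode can be read from nvidia-smi. The following is only a minimal sketch: the helper names (`parse_compute_modes`, `allows_multiple_ranks`, `query_compute_modes`) are made up for illustration, and it assumes nvidia-smi prints one `index, mode` pair per line for the `--query-gpu=index,compute_mode` query.

```python
import shutil
import subprocess


def parse_compute_modes(csv_text):
    """Turn nvidia-smi CSV output ("0, Default" per line) into (index, mode) pairs."""
    modes = []
    for line in csv_text.strip().splitlines():
        index, mode = (field.strip() for field in line.split(",", 1))
        modes.append((int(index), mode))
    return modes


def allows_multiple_ranks(mode):
    # Assumption: only the "Default" compute mode lets several processes
    # (ranks) share one GPU; exclusive or prohibited modes do not.
    return mode == "Default"


def query_compute_modes():
    """Ask nvidia-smi for the compute mode of every visible GPU."""
    if shutil.which("nvidia-smi") is None:
        return []  # no Nvidia driver tools on this machine
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,compute_mode", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compute_modes(result.stdout)


if __name__ == "__main__":
    for index, mode in query_compute_modes():
        status = "ok" if allows_multiple_ranks(mode) else "one rank per GPU only"
        print(f"GPU {index}: {mode} ({status})")
```

If a GPU reports an exclusive mode, resetting it (for example with `nvidia-smi -c DEFAULT`) typically needs administrator privileges, which is why the site setup has to either change the mode or expose a user option.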

Ideally, the outcome of an evaluation can be used to guide future development or contributions.
