diff --git a/doc/source/reference/makefile.rst b/doc/source/reference/makefile.rst
index bf8ecdad..4f948e5d 100644
--- a/doc/source/reference/makefile.rst
+++ b/doc/source/reference/makefile.rst
@@ -130,32 +130,56 @@ Finally, *Idefix* can be configured to run on Mi250 by enabling HIP and the desi
 
 MPI (multi-GPU) can be enabled by adding ``-DIdefix_MPI=ON`` as usual.
 
-Jean Zay at IDRIS, Nvidia V100 and A100 GPUs
---------------------------------------------
+Jean Zay at IDRIS, Nvidia V100/A100/H100 GPUs
+---------------------------------------------
 
-We recommend the following modules and environement variables on Jean Zay:
+We recommend the following modules and environment variables on Jean Zay V100/A100:
 
 .. code-block:: bash
 
+    module load arch/a100 # ONLY for A100
     module load cuda/12.1.0
     module load gcc/12.2.0
     module load openmpi/4.1.1-cuda
-    module load cmake/3.18.0
+    module load cmake/3.25.2
+
+While for H100:
+
+.. code-block:: bash
+
+    module load arch/h100
+    module load cmake/3.30.1
+    module load cuda/12.1.0
+    module load openmpi/4.1.5-cuda
 
 *Idefix* can then be configured to run on Nvidia V100 with the following options to ccmake:
 
 .. code-block:: bash
 
-    -DKokkos_ENABLE_CUDA=ON -DKokkos_ENABLE_VOLTA70=ON -DKokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC=OFF
+    -DKokkos_ENABLE_CUDA=ON -DKokkos_ARCH_VOLTA70=ON
 
 While Ampere A100 GPUs are enabled with
 
 .. code-block:: bash
 
-    -DKokkos_ENABLE_CUDA=ON -DKokkos_ENABLE_AMPERE80=ON -DKokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC=OFF
+    -DKokkos_ENABLE_CUDA=ON -DKokkos_ARCH_AMPERE80=ON
+
+And for H100 GPUs:
+
+.. code-block:: bash
+
+    -DKokkos_ENABLE_CUDA=ON -DKokkos_ARCH_HOPPER90=ON
+
+
+MPI (multi-GPU) can be enabled by adding ``-DIdefix_MPI=ON`` as usual.
+
+
+.. warning::
+
+    As of *Idefix* 2.1.02, we automatically disable Cuda Malloc async (``-DKokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC=OFF``). However, earlier versions of
+    *Idefix* require this flag when calling cmake to prevent a bug when using PSM2 with async Cuda malloc, which can lead to OpenMPI crashes or hangs on Jean Zay.
+
 
-MPI (multi-GPU) can be enabled by adding ``-DIdefix_MPI=ON`` as usual. The malloc async option is here to prevent a bug when using PSM2 with async
-Cuda malloc possibly leading to openmpi crash or hangs on Jean Zay.
 
 .. _setupSpecificOptions:
 