Merge branch 'fixConfigDoc' into develop
glesur committed Oct 24, 2024
2 parents a2a0189 + 46d1973 commit 4267c4f
Showing 1 changed file with 32 additions and 8 deletions.
40 changes: 32 additions & 8 deletions doc/source/reference/makefile.rst
@@ -130,32 +130,56 @@ Finally, *Idefix* can be configured to run on Mi250 by enabling HIP and the desi
MPI (multi-GPU) can be enabled by adding ``-DIdefix_MPI=ON`` as usual.

Jean Zay at IDRIS, Nvidia V100 and A100 GPUs
--------------------------------------------
Jean Zay at IDRIS, Nvidia V100/A100/H100 GPUs
---------------------------------------------

We recommend the following modules and environment variables on Jean Zay:
We recommend the following modules and environment variables on Jean Zay V100/A100:

.. code-block:: bash
module load arch/a100 # ONLY for A100
module load cuda/12.1.0
module load gcc/12.2.0
module load openmpi/4.1.1-cuda
module load cmake/3.18.0
module load cmake/3.25.2
While for H100:

.. code-block:: bash
module load arch/h100
module load cmake/3.30.1
module load cuda/12.1.0
module load openmpi/4.1.5-cuda
*Idefix* can then be configured to run on Nvidia V100 with the following options to ccmake:

.. code-block:: bash
-DKokkos_ENABLE_CUDA=ON -DKokkos_ENABLE_VOLTA70=ON -DKokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC=OFF
-DKokkos_ENABLE_CUDA=ON -DKokkos_ARCH_VOLTA70=ON
While Ampere A100 GPUs are enabled with

.. code-block:: bash
-DKokkos_ENABLE_CUDA=ON -DKokkos_ENABLE_AMPERE80=ON -DKokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC=OFF
-DKokkos_ENABLE_CUDA=ON -DKokkos_ARCH_AMPERE80=ON
And for H100 GPUs:

.. code-block:: bash
-DKokkos_ENABLE_CUDA=ON -DKokkos_ARCH_HOPPER90=ON
MPI (multi-GPU) can be enabled by adding ``-DIdefix_MPI=ON`` as usual.
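
As an illustration only, the options listed above can be collected in a small shell helper. This is a minimal sketch and not part of *Idefix*: the function name ``idefix_gpu_flags``, its ``legacy`` argument and the use of ``$IDEFIX_DIR`` as the source directory are assumptions made for this example. It only prints the options. Nothing is configured or built.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Idefix): print the Kokkos options
# quoted in this section for a given Jean Zay GPU type.
idefix_gpu_flags() {
    local gpu="$1"          # one of: v100, a100, h100
    local legacy="${2:-}"   # pass "legacy" for Idefix versions older than 2.1.02
    local arch
    case "$gpu" in
        v100) arch="VOLTA70" ;;
        a100) arch="AMPERE80" ;;
        h100) arch="HOPPER90" ;;
        *) echo "unknown GPU type: $gpu" >&2; return 1 ;;
    esac
    local flags="-DKokkos_ENABLE_CUDA=ON -DKokkos_ARCH_${arch}=ON"
    # Older Idefix versions need the async malloc option disabled by hand.
    if [ "$legacy" = "legacy" ]; then
        flags="$flags -DKokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC=OFF"
    fi
    echo "$flags"
}

# Example: print (do not run) a cmake command line for an MPI build on A100.
# IDEFIX_DIR is assumed to point at the Idefix sources.
echo "cmake \$IDEFIX_DIR $(idefix_gpu_flags a100) -DIdefix_MPI=ON"
```

Printing the command instead of running it makes it easy to check the options before passing them to cmake or ccmake.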


.. warning::

As of *Idefix* 2.1.02, CUDA async malloc is disabled automatically (``-DKokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC=OFF``). Earlier versions of
*Idefix* require this flag to be passed explicitly to cmake, to prevent a bug when using PSM2 with async CUDA malloc that can lead to OpenMPI crashes or hangs on Jean Zay.


MPI (multi-GPU) can be enabled by adding ``-DIdefix_MPI=ON`` as usual. The malloc async option is there to prevent a bug when using PSM2 with async
CUDA malloc, which can lead to OpenMPI crashes or hangs on Jean Zay.

.. _setupSpecificOptions:
