Evan Weinberg edited this page Jun 7, 2022 · 8 revisions

Instructions last verified on June 7, 2022. Because Perlmutter is still a pre-production system, these instructions may change at any time. Please contact us on the QUDA Slack if they do not work.

Environment

Because of the Cray MPI compiler wrappers, some care is needed to set up a build environment in which QUDA's cmake build (and MILC's Makefile) can find MPI. The following environment loads CUDA 11.5, gcc 11.2.0, and cmake 3.22, and sets other useful environment variables:

module purge
module load PrgEnv-gnu
module load cmake
module load cudatoolkit
module load craype-accel-nvidia80
export MPICH_GPU_SUPPORT_ENABLED=1
export CRAY_CPU_TARGET=x86-64

export CC=$(which cc)
export CXX=$(which CC)

export MPI_HOME=$MPICH_DIR
export MPI_CXX_COMPILER=$(which CC)
export MPI_CXX_COMPILER_FLAGS="$(CC --cray-print-opts=all)"
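
Before starting a build, it can help to confirm that the variables above are actually set in the current shell. The following is a minimal sketch and not part of QUDA or MILC: check_env_vars is a hypothetical helper name, and it only tests that each named variable is non-empty, not that its value is correct.

```shell
# Hypothetical helper: report any named environment variable that is unset or empty.
# Returns 0 if every variable is set, 1 otherwise.
check_env_vars() {
    local missing name value
    missing=0
    for name in "$@"; do
        # Look up the variable whose name is stored in $name
        eval "value=\${${name}:-}"
        if [ -z "${value}" ]; then
            echo "missing: ${name}"
            missing=1
        fi
    done
    return "${missing}"
}
```

After setting up the environment above, a call such as check_env_vars CC CXX MPI_HOME MPI_CXX_COMPILER MPI_CXX_COMPILER_FLAGS should print nothing and return 0.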

Building

QUDA

With the previous environment variables in place, compiling QUDA is relatively straightforward. A reference QUDA installation that automatically downloads and builds QMP and QIO, and includes the components needed for use with MILC, is:

WORKING_DIRECTORY=$(pwd)
git clone --branch develop https://github.com/lattice/quda && mkdir build && cd build
cmake \
        -DCMAKE_BUILD_TYPE=RELEASE \
        -DQUDA_GPU_ARCH=sm_80 \
        -DQUDA_DIRAC_DEFAULT_OFF=ON \
        -DQUDA_DIRAC_STAGGERED=ON \
        -DQUDA_QMP=ON \
        -DQUDA_QIO=ON \
        -DQUDA_DOWNLOAD_USQCD=ON \
        ../quda
make -j install
cd $WORKING_DIRECTORY

MILC with QUDA

The MILC+QUDA helper scripts that come with MILC currently need to be modified to work on Perlmutter. For simplicity, we include raw commands below, which will be updated once the compile_* scripts have been updated.

MILC can be downloaded via:

git clone --branch develop https://github.com/milc-qcd/milc_qcd

Compiling MILC with QMP+QIO+QUDA requires the path to CUDA, as well as the install directories of QMP, QIO, and QUDA. These can be found via:

# Automated method to find the path to CUDA
PATH_TO_CUDA=$(which nvcc)
# Strip the trailing /bin/nvcc to obtain the CUDA root directory
PATH_TO_CUDA=${PATH_TO_CUDA%/bin/nvcc}

# Paths to QUDA, QIO, QMP
PATH_TO_QUDA="${WORKING_DIRECTORY}/build/usqcd"
PATH_TO_QMP=$PATH_TO_QUDA
PATH_TO_QIO=$PATH_TO_QUDA
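
The path handling above can be sketched as two small helpers, shown here as an illustration rather than as part of the MILC or QUDA tooling. Both function names are hypothetical, and /opt/example/cuda/11.5 is a made-up path used only to show the suffix removal; on a real system the argument would come from which nvcc.

```shell
# Hypothetical helper: strip a trailing "/bin/nvcc" from a full nvcc path.
cuda_root_from_nvcc() {
    # ${1%/bin/nvcc} removes the matching suffix and leaves no trailing slash
    printf '%s\n' "${1%/bin/nvcc}"
}

# Hypothetical helper: check that every argument is an existing directory.
require_dirs() {
    local dir status
    status=0
    for dir in "$@"; do
        if [ ! -d "${dir}" ]; then
            echo "not a directory: ${dir}" >&2
            status=1
        fi
    done
    return "${status}"
}

# Illustrative path only
cuda_root_from_nvcc /opt/example/cuda/11.5/bin/nvcc
```

Calling require_dirs "${PATH_TO_CUDA}" "${PATH_TO_QUDA}" before invoking make is one way to catch a wrong path early, since the MILC Makefile would otherwise fail later with a less direct error.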

MILC RHMC

MILC RHMC can be compiled from the Makefile as:

cd ${WORKING_DIRECTORY}/milc_qcd/ks_imp_rhmc
cp ../Makefile .
MY_CC=cc \
  MY_CXX=CC \
  CUDA_HOME=${PATH_TO_CUDA} \
  QUDA_HOME=${PATH_TO_QUDA} \
  WANTQUDA=true \
  WANT_FN_CG_GPU=true \
  WANT_FL_GPU=true \
  WANT_GF_GPU=true \
  WANT_FF_GPU=true \
  WANT_MIXED_PRECISION_GPU=2 \
  PRECISION=2 \
  MPP=true \
  OMP=true \
  WANTQIO=true \
  WANTQMP=true \
  QIOPAR=${PATH_TO_QIO} \
  QMPPAR=${PATH_TO_QMP} \
  PATH_TO_NVHPCSDK="" \
  make -j 1 su3_rhmd_hisq

MILC Spectrum Measurements

The MILC spectrum measurement executable can be built as:

cd ${WORKING_DIRECTORY}/milc_qcd/ks_spectrum
cp ../Makefile .
MY_CC=cc \
  MY_CXX=CC \
  CUDA_HOME=${PATH_TO_CUDA} \
  QUDA_HOME=${PATH_TO_QUDA} \
  WANTQUDA=true \
  WANT_FN_CG_GPU=true \
  WANT_FL_GPU=true \
  WANT_GF_GPU=true \
  WANT_FF_GPU=true \
  WANT_MIXED_PRECISION_GPU=2 \
  PRECISION=2 \
  MPP=true \
  OMP=true \
  WANTQIO=true \
  WANTQMP=true \
  QIOPAR=${PATH_TO_QIO} \
  QMPPAR=${PATH_TO_QMP} \
  PATH_TO_NVHPCSDK="" \
  CGEOM="-DFIX_NODE_GEOM -DFIX_IONODE_GEOM" \
  KSCGMULTI="-DKS_MULTICG=HYBRID -DMULTISOURCE" \
  make -j 1 ks_spectrum_hisq

Running

Cray MPI requires several environment variables to be set at run time. These are subject to change, but for now a viable set is:

export QUDA_ENABLE_GDR=1
export MPICH_RDMA_ENABLED_CUDA=1
export MPICH_GPU_SUPPORT_ENABLED=1
export MPICH_NEMESIS_ASYNC_PROGRESS=1

export OMP_NUM_THREADS=16
export SLURM_CPU_BIND=cores
export CRAY_ACCEL_TARGET=nvidia80

Running in an interactive node

An interactive node on Perlmutter can be acquired with the following command, after substituting your own account for m[####]_g:

salloc -A m[####]_g -C gpu -t 20 -N 1 --tasks-per-node 4 --gpus 4 --qos interactive

Make sure that your environment matches the environment defined at the top of this page. In addition, the USQCD library path should be added to your LD_LIBRARY_PATH when running MILC. The USQCD libraries are located in [path to QUDA build]/usqcd/lib.
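
One way to add the library directory is sketched below. prepend_ld_library_path is a hypothetical helper name, not something provided by QUDA or MILC; it puts the directory at the front of LD_LIBRARY_PATH and skips the update if the directory is already listed.

```shell
# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH unless already present.
prepend_ld_library_path() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*)
            # Already listed, nothing to do
            ;;
        *)
            # Only add a ':' separator if LD_LIBRARY_PATH was non-empty
            export LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
            ;;
    esac
}
```

Assuming PATH_TO_QUDA is still set as in the build section, the call would be prepend_ld_library_path "${PATH_TO_QUDA}/lib".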

QUDA's test executables can be run from the interactive node via srun. A reference command for staggered_invert_test on 1 GPU is:

srun -n 1 ./staggered_invert_test

Likewise, a 4 GPU run with a 1x1x2x2 decomposition is given by:

srun -n 4 ./staggered_invert_test --gridsize 1 1 2 2
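
In the 4 GPU example above, the rank count given to srun equals the product of the four --gridsize entries (1 x 1 x 2 x 2 = 4). Assuming that relationship is what the test expects in general, the sketch below derives the rank count from the grid so the two cannot drift apart. grid_ranks, GRID, and NRANKS are hypothetical names, and the script only prints the resulting command rather than running it.

```shell
# Hypothetical helper: multiply the grid dimensions to get the implied rank count.
grid_ranks() {
    local product dim
    product=1
    for dim in "$@"; do
        product=$((product * dim))
    done
    echo "${product}"
}

GRID="1 1 2 2"
# ${GRID} is intentionally unquoted so it splits into four arguments
NRANKS=$(grid_ranks ${GRID})
echo "srun -n ${NRANKS} ./staggered_invert_test --gridsize ${GRID}"
# prints: srun -n 4 ./staggered_invert_test --gridsize 1 1 2 2
```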

Submitting a SLURM script

WIP
