- Select fabrics
- cat /etc/dat.conf
- export I_MPI_FABRICS=shm:dapl
- I_MPI_ADJUST_<opname> variables select the algorithm used for each collective operation
- In the mpirun command, use
-genv I_MPI_DEBUG 5
to print debug information
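
The fabric selection and the debug flag above can be combined into one launch line. The following is a minimal sketch, not a definitive launcher: `build_mpirun_cmd`, `APP` (default `./my_app`) and `NP` are hypothetical names introduced only for illustration. It runs `mpirun` only if it is on the PATH and the application exists.

```shell
#!/bin/sh
# Compose an Intel MPI launch line with fabric selection and debug output.
build_mpirun_cmd() {
    # $1 = process count, $2 = fabric list, $3 = debug level, $4 = application
    printf 'mpirun -n %s -genv I_MPI_FABRICS %s -genv I_MPI_DEBUG %s %s\n' \
        "$1" "$2" "$3" "$4"
}

APP=${APP:-./my_app}   # hypothetical application path (assumption)
NP=${NP:-2}            # hypothetical process count (assumption)

cmd=$(build_mpirun_cmd "$NP" shm:dapl 5 "$APP")
echo "$cmd"

# Launch only when mpirun is available and the application is executable.
if command -v mpirun >/dev/null 2>&1 && [ -x "$APP" ]; then
    $cmd
fi
```

Passing the variables with -genv keeps the settings on the command line, so the same shell can launch runs with different fabrics without re-exporting anything.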
- ompi_info displays information about the current Open MPI installation (--display-map is an mpirun option that prints the process map)
- ibstat
- To check that nv_peer_mem is loaded:
- $ service nv_peer_mem status
- nv_peer_mem module is loaded.
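
On systems where the service script is not installed, the same check can be made against the kernel module list. This is a rough sketch under that assumption; `module_loaded` is a hypothetical helper, not part of nv_peer_mem.

```shell
#!/bin/sh
# Reads lsmod output on stdin; succeeds if the named module is listed
# in the first column.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit found ? 0 : 1 }'
}

# Query the running kernel only when lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | module_loaded nv_peer_mem; then
        echo "nv_peer_mem is loaded"
    else
        echo "nv_peer_mem is NOT loaded"
    fi
fi
```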
- Testing CUDA latency
- Ref: http://www.rdmamojo.com/2015/01/24/verify-rdma-working/
- command
- ibv_devices
- lsmod | grep rdm
- ibv_devinfo -d mlx5_0
- Test (start the server on hostA first; the client on hostB passes hostA's hostname or IP)
- hostA: ib_send_bw -d mlx5_0 -i 1 -F --report_gbits
- hostB: ib_send_bw -d mlx5_0 -i 1 -F --report_gbits <hostA>
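
Before starting the bandwidth test it helps to confirm that the port is up. The sketch below assumes ibv_devinfo reports an active port as `PORT_ACTIVE`. `port_is_active`, `run_bw_test`, `DEV` and `SERVER` are hypothetical names for illustration. Leave `SERVER` empty on hostA and set it to hostA's name on hostB. Nothing is executed unless the script is called with the argument `run`.

```shell
#!/bin/sh
DEV=${DEV:-mlx5_0}     # device name as used in the notes
SERVER=${SERVER:-}     # assumption: empty on the server, server hostname on the client

# Reads ibv_devinfo output on stdin; succeeds if a port reports PORT_ACTIVE.
port_is_active() {
    grep -q 'state:.*PORT_ACTIVE'
}

# With no host argument ib_send_bw waits as server; with one it connects as client.
run_bw_test() {
    ib_send_bw -d "$DEV" -i 1 -F --report_gbits $SERVER
}

# Only touch the hardware when explicitly asked to.
if [ "${1:-}" = "run" ]; then
    if ibv_devinfo -d "$DEV" | port_is_active; then
        run_bw_test
    else
        echo "no active port on $DEV" >&2
        exit 1
    fi
fi
```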
- OpenFabrics (OFED) info
- ofed_info
- Hangs on CentOS 7.3
- The stated requirement is glibc >= 2.19, yet it runs on CentOS 7.4, which ships glibc 2.17
- Check: NVIDIA/nccl#19 (comment)
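- PCI setup might be necessary: check whether ACS is enabled on the PLX PCIe switches and clear it (the bus addresses below are specific to this machine)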
sudo lspci | grep PLX
sudo lspci -vvv | grep ACSCtl
sudo setpci -s 03:00.0 f2a.w=0000
sudo setpci -s 04:08.0 f2a.w=0000
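
Whether the setpci step is needed can be read from the ACSCtl lines, where a flag followed by `+` is still enabled. A minimal read-only sketch follows; `acs_enabled` is a hypothetical helper. It assumes lspci is run with enough privilege to show capabilities, otherwise the ACSCtl lines may be missing.

```shell
#!/bin/sh
# Reads "lspci -vvv" output on stdin; succeeds if any ACSCtl line
# still has at least one flag set (marked with "+").
acs_enabled() {
    grep 'ACSCtl:' | grep -qF '+'
}

# Read-only check; this does not modify any PCI configuration.
if command -v lspci >/dev/null 2>&1; then
    if lspci -vvv 2>/dev/null | acs_enabled; then
        echo "ACS is enabled on at least one device"
    else
        echo "no ACSCtl flags set (or not visible without root)"
    fi
fi
```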
- https://github.com/NVIDIA/nccl-tests
- Set NCCL_HOME and MPI_HOME, then build using make (pass MPI=1 to build with MPI support)
- ./build/all_reduce_perf -b 8 -e 128M -f 2 -g 4
- mpirun -n 1 ./build/all_reduce_perf -b 8 -e 128M -f 2 -g 4
- -g 4 uses 4 GPUs per process; with one process per node this is 4 GPUs per node
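
The flags -b 8 -e 128M -f 2 sweep message sizes from 8 bytes to 128M bytes, doubling each step. The arithmetic can be sketched as below. `sweep_sizes` and `total_gpus` are hypothetical helpers written for illustration and are not part of nccl-tests. The sketch assumes a multiplication factor greater than 1 and that 128M means 128 * 1024 * 1024 bytes.

```shell
#!/bin/sh
# Print every size visited when starting at $1 bytes and multiplying
# by $3 until $2 bytes is exceeded (assumes $3 > 1).
sweep_sizes() {
    size=$1
    while [ "$size" -le "$2" ]; do
        echo "$size"
        size=$((size * $3))
    done
}

# Total GPUs in a run: number of processes times GPUs per process (-g).
total_gpus() {
    echo $(( $1 * $2 ))
}

MAX=$((128 * 1024 * 1024))   # 128M expressed in bytes
echo "sizes in sweep: $(sweep_sizes 8 "$MAX" 2 | wc -l | tr -d ' ')"
echo "GPUs with 1 process and -g 4: $(total_gpus 1 4)"
```

With these inputs the sweep covers 25 sizes (2^3 through 2^27), and raising mpirun -n to 2 with the same -g 4 would use 8 GPUs in total.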
- ifconfig is deprecated. Use
ip a
- The InfiniBand (IPoIB) interface has a different IP address from the Ethernet interface
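
To pick out the address of one interface, the one-line-per-address output of ip can be filtered. This is a sketch only: `addr_of` is a hypothetical helper, and the interface name `ib0` is an assumption (a common IPoIB name) that may differ on a given host.

```shell
#!/bin/sh
# Reads "ip -o -4 addr show" output on stdin and prints the IPv4
# address (without the prefix length) of the interface named in $1.
addr_of() {
    awk -v ifc="$1" '$2 == ifc && $3 == "inet" { split($4, a, "/"); print a[1] }'
}

# ib0 is an assumed interface name; adjust to what "ip a" shows.
if command -v ip >/dev/null 2>&1; then
    echo "ib0: $(ip -o -4 addr show 2>/dev/null | addr_of ib0)"
fi
```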