OpenMPI vs MPICH performance issue #7188

Open
SnaKyEyeS opened this issue Oct 23, 2024 · 1 comment

SnaKyEyeS commented Oct 23, 2024

Hello,

While investigating MPI performance on Lucia with @thomasgillis, we found that MPICH was noticeably slower than OpenMPI. The test case is a simple device-to-device ping-pong between two NVIDIA GPUs on distinct nodes, with increasing message size. The results of fi_bw (from fabtests) are also shown for comparison.
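
For anyone reproducing the comparison, here is a minimal sketch of how a ping-pong timing is commonly turned into a bandwidth figure. It is not the code used for the measurements in this issue: the function names, the power-of-two size sweep, and the one-way bandwidth convention are all assumptions, and the MPI send/receive loop on the device buffers is not shown.

```python
def message_sizes(min_bytes=1, max_bytes=1 << 22):
    """Return power-of-two message sizes from min_bytes to max_bytes, inclusive.

    The bounds are illustrative defaults, not the ones used in the issue.
    """
    sizes = []
    n = min_bytes
    while n <= max_bytes:
        sizes.append(n)
        n *= 2
    return sizes


def pingpong_one_way_time(iterations, elapsed_s):
    """Average one-way transfer time in seconds.

    Each ping-pong iteration carries the message there and back,
    so the total elapsed time covers 2 * iterations transfers.
    """
    return elapsed_s / (2 * iterations)


def pingpong_bandwidth(msg_bytes, iterations, elapsed_s):
    """One-way bandwidth in bytes per second (assumed convention)."""
    return msg_bytes / pingpong_one_way_time(iterations, elapsed_s)
```

With this convention, small messages mostly reflect latency and large messages reflect achievable bandwidth, so it helps to report both when comparing the two MPI implementations against fi_bw.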

The MPICH version is 4.2.3, compiled with libfabric v1.22.0. We set the following environment variables for the MPICH runs:

export FI_PROVIDER="verbs;ofi_rxm"
export FI_HMEM_CUDA_USE_GDRCOPY=1
export FI_OFI_RXM_BUFFER_SIZE=256
export FI_OFI_RXM_SAR_LIMIT=256
export MPIR_CVAR_CH4_OFI_ENABLE_HMEM=1


yfguo commented Oct 24, 2024

Thanks for reporting. I will look into the performance issue.

yfguo self-assigned this Oct 24, 2024