Commit

Address CUDA MPI/IPC issue with Kokkos <=4.4.1
pgrete committed Oct 15, 2024
1 parent 1e77c13 commit 53ccbcf
Showing 2 changed files with 9 additions and 1 deletion.
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -12,11 +12,12 @@

### Changed (changing behavior/API/variables/...)
- [[PR 1187]](https://github.com/parthenon-hpc-lab/parthenon/pull/1187) Make DataCollection::Add safer and generalize MeshBlockData::Initialize
- [[PR 1186]](https://github.com/parthenon-hpc-lab/parthenon/pull/1186) Bump Kokkos submodule to 4.4.1
- [[Issue 1165]](https://github.com/parthenon-hpc-lab/parthenon/issues/1165) Bump Kokkos submodule to 4.4.1
- [[PR 1171]](https://github.com/parthenon-hpc-lab/parthenon/pull/1171) Add PARTHENON_USE_SYSTEM_PACKAGES build option
- [[PR 1172]](https://github.com/parthenon-hpc-lab/parthenon/pull/1172) Make parthenon manager robust against external MPI init and finalize calls

### Fixed (not changing behavior/API/variables/...)
- [[PR 1189]](https://github.com/parthenon-hpc-lab/parthenon/pull/1189) Address CUDA MPI/IPC issue with Kokkos <=4.4.1
- [[PR 1178]](https://github.com/parthenon-hpc-lab/parthenon/pull/1178) Fix issue with mesh pointer when using relative residual tolerance in BiCGSTAB solver.
- [[PR 1173]](https://github.com/parthenon-hpc-lab/parthenon/pull/1173) Make debugging easier by making parthenon throw an error if ParameterInput is different on multiple MPI ranks.

7 changes: 7 additions & 0 deletions CMakeLists.txt
@@ -291,6 +291,13 @@ if (Kokkos_ENABLE_CUDA)
if(CHECK_REGISTRY_PRESSURE)
add_compile_options(-Xptxas=-v)
endif()

# Async mallocs are enabled by default in Kokkos <= 4.4.1 but
# cause issues with IPC and MPI,
# see https://github.com/parthenon-hpc-lab/athenapk/pull/114#issuecomment-2358981857
# Given that they are also not expected to provide a significant performance gain,
# they are hard disabled now.
set(Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC OFF CACHE BOOL "Disable Cuda async malloc (to address MPI/IPC issues)")
endif()
# Note that these options may not play nice with nvcc wrapper
if (TEST_INTEL_OPTIMIZATION)
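
One point about the `set(... CACHE BOOL ...)` call in the hunk above: without `FORCE`, CMake does not overwrite a cache entry that already exists, for example one passed as `-DKokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC=ON` on the command line or left over from an earlier configure run. The following is a minimal sketch, not part of this commit, of a guard that a build could place after that `set()` call to report such a case. The warning text is illustrative only.

```cmake
# Sketch only (not part of the commit): detect a cache value that the
# non-FORCE set() above could not override.
if (Kokkos_ENABLE_CUDA AND Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC)
  message(WARNING
    "Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC is still ON in the CMake cache. "
    "This may cause issues with MPI/IPC; reconfigure with "
    "-DKokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC=OFF.")
endif()
```

Whether a warning, an error, or a `FORCE` override is appropriate depends on how strictly the project wants to enforce the setting.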
