Leapfrog fix (parthenon-hpc-lab#1206)
* Missing send size init

* cleanup, CHANGELOG

* verbose CI

* further CI debugging

* This should be working...

* This should be fixed... but I get a segfault on GPU

* Is it my AMD GPU that's wrong?

* Missing a return statement

* retest

* Oops missing statement

* Revert test

* revert workflow
brryan authored and acreyes committed Nov 8, 2024
1 parent 0cbbdac commit 01517f7
Showing 3 changed files with 21 additions and 2 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -12,6 +12,7 @@
 - [[PR 1161]](https://github.com/parthenon-hpc-lab/parthenon/pull/1161) Make flux field Metadata accessible, add Metadata::CellMemAligned flag, small perfomance upgrades
 
 ### Changed (changing behavior/API/variables/...)
+- [[PR 1206]](https://github.com/parthenon-hpc-lab/parthenon/pull/1206) Leapfrog fix
 - [[PR1203]](https://github.com/parthenon-hpc-lab/parthenon/pull/1203) Pin Ubuntu CI image
 - [[PR1177]](https://github.com/parthenon-hpc-lab/parthenon/pull/1177) Make mesh-level boundary conditions usable without the "user" flag
 - [[PR 1187]](https://github.com/parthenon-hpc-lab/parthenon/pull/1187) Make DataCollection::Add safer and generalize MeshBlockData::Initialize
5 changes: 5 additions & 0 deletions src/interface/swarm_comms.cpp
@@ -269,6 +269,11 @@ void Swarm::LoadBuffers_() {
 
     // Remove particles that were loaded to send to another block from this block
     RemoveMarkedParticles();
+  } else {
+    for (int n = 0; n < pmb->neighbors.size(); n++) {
+      const int bufid = pmb->neighbors[n].bufid;
+      vbswarm->send_size[bufid] = 0;
+    }
   }
 }
 
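The added `else` branch reads as a reset: when the block takes the path that loads nothing, each neighbor's entry in `send_size` is written as zero, presumably so that a count left over from an earlier cycle is not reused. Below is a minimal standalone sketch of that reset. `Neighbor`, `ClearSendSizes`, and the plain `std::vector<int>` are hypothetical stand-ins for illustration, not Parthenon's neighbor or swarm container types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a neighbor record; only the buffer id matters here.
struct Neighbor {
  int bufid;
};

// Zero the send size of every buffer that belongs to one of `neighbors`.
// Entries for buffer ids that no neighbor refers to are left untouched.
void ClearSendSizes(const std::vector<Neighbor> &neighbors,
                    std::vector<int> &send_size) {
  for (const Neighbor &nb : neighbors) {
    // Assumption of this sketch: buffer ids index directly into send_size.
    assert(nb.bufid >= 0 &&
           static_cast<std::size_t>(nb.bufid) < send_size.size());
    send_size[static_cast<std::size_t>(nb.bufid)] = 0;
  }
}
```

With `send_size = {3, 7, 1, 4}` and neighbors using buffer ids 0 and 2, the call leaves `{0, 7, 0, 4}`: only the entries tied to a neighbor are cleared.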
17 changes: 15 additions & 2 deletions src/utils/sort.hpp
@@ -61,7 +61,7 @@ void sort(ParArray1D<Key> data, KeyComparator comparator, size_t min_idx,
           size_t max_idx) {
   PARTHENON_DEBUG_REQUIRE(min_idx < data.extent(0), "Invalid minimum sort index!");
   PARTHENON_DEBUG_REQUIRE(max_idx < data.extent(0), "Invalid maximum sort index!");
-#ifdef KOKKOS_ENABLE_CUDA
+#if defined(KOKKOS_ENABLE_CUDA)
 #ifdef __clang__
   PARTHENON_FAIL("sort is using thrust and there exists an incompatibility with clang, "
                  "see https://github.com/lanl/parthenon/issues/647 for more details. We "
@@ -74,6 +74,13 @@ void sort(ParArray1D<Key> data, KeyComparator comparator, size_t min_idx,
   thrust::device_ptr<Key> last_d = thrust::device_pointer_cast(data.data()) + max_idx + 1;
   thrust::sort(first_d, last_d, comparator);
 #endif
+#elif defined(KOKKOS_ENABLE_HIP)
+  auto data_h = Kokkos::create_mirror_view_and_copy(HostMemSpace(), data);
+  std::sort(data_h.data() + min_idx, data_h.data() + max_idx + 1, comparator);
+  Kokkos::deep_copy(data, data_h);
+  // TODO(BRR) With Kokkos 4.4, switch to Kokkos::sort
+  // auto sub_data = Kokkos::subview(data, std::make_pair(min_idx, max_idx + 1));
+  // Kokkos::sort(sub_data, comparator);
 #else
   if (std::is_same<DevExecSpace, HostExecSpace>::value) {
     std::sort(data.data() + min_idx, data.data() + max_idx + 1, comparator);
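The HIP branch above follows a copy-to-host, sort, copy-back pattern, and it treats `max_idx` as inclusive (hence the `+ 1` on the end pointer). The sketch below mirrors that pattern in plain C++ so it can be run without Kokkos. `SortViaHostCopy` is a hypothetical name, the `std::vector` merely stands in for a device-resident array, and the `min_idx <= max_idx` check is an assumption of this sketch that the diff itself does not make.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Sort the inclusive index range [min_idx, max_idx] of `device_data` through a
// host-side copy. The two vector copies play the role of the mirror creation
// and the deep copy back; no real device memory is involved in this sketch.
template <class Key, class Comparator = std::less<Key>>
void SortViaHostCopy(std::vector<Key> &device_data, std::size_t min_idx,
                     std::size_t max_idx, Comparator comparator = Comparator()) {
  assert(min_idx <= max_idx);  // assumption added by this sketch
  assert(max_idx < device_data.size());
  std::vector<Key> host_copy = device_data;  // "device" to host
  const auto first = host_copy.begin() + static_cast<std::ptrdiff_t>(min_idx);
  // max_idx is inclusive, so the end iterator sits one element past it.
  const auto last = host_copy.begin() + static_cast<std::ptrdiff_t>(max_idx) + 1;
  std::sort(first, last, comparator);
  device_data = host_copy;  // host back to "device"
}
```

Calling `SortViaHostCopy(v, 1, 4)` on `{9, 5, 3, 4, 1, 0}` sorts only positions 1 through 4 and leaves the first and last elements in place. The cost of this approach is two full-array transfers per call, which is likely why the TODO points at an on-device sort of a subview instead.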
@@ -89,7 +96,7 @@ template <class Key>
 void sort(ParArray1D<Key> data, size_t min_idx, size_t max_idx) {
   PARTHENON_DEBUG_REQUIRE(min_idx < data.extent(0), "Invalid minimum sort index!");
   PARTHENON_DEBUG_REQUIRE(max_idx < data.extent(0), "Invalid maximum sort index!");
-#ifdef KOKKOS_ENABLE_CUDA
+#if defined(KOKKOS_ENABLE_CUDA) || defined(KOKKOS_ENABLE_HIP)
 #ifdef __clang__
   PARTHENON_FAIL("sort is using thrust and there exists an incompatibility with clang, "
                  "see https://github.com/lanl/parthenon/issues/647 for more details. We "
@@ -102,6 +109,12 @@ void sort(ParArray1D<Key> data, size_t min_idx, size_t max_idx) {
   thrust::device_ptr<Key> last_d = thrust::device_pointer_cast(data.data()) + max_idx + 1;
   thrust::sort(first_d, last_d);
 #endif
+  auto data_h = Kokkos::create_mirror_view_and_copy(HostMemSpace(), data);
+  std::sort(data_h.data() + min_idx, data_h.data() + max_idx + 1);
+  Kokkos::deep_copy(data, data_h);
+  // TODO(BRR) With Kokkos 4.4, switch to Kokkos::sort
+  // auto sub_data = Kokkos::subview(data, std::make_pair(min_idx, max_idx + 1));
+  // Kokkos::sort(sub_data);
 #else
   if (std::is_same<DevExecSpace, HostExecSpace>::value) {
     std::sort(data.data() + min_idx, data.data() + max_idx + 1);
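In the comparator overload of `sort` in this file, the backend is picked at compile time with an `#if defined(...)` / `#elif defined(...)` / `#else` chain: one branch for CUDA, one for HIP, and a fallback. The sketch below shows that three-way shape in isolation. `SKETCH_ENABLE_CUDA`, `SKETCH_ENABLE_HIP`, and `SortBackend` are hypothetical names chosen for illustration; they are not Kokkos or Parthenon symbols, and the returned strings only label which branch was compiled in.

```cpp
#include <cassert>
#include <string>

// Report which branch the preprocessor kept. Exactly one return statement
// survives preprocessing, so the choice costs nothing at run time.
inline std::string SortBackend() {
#if defined(SKETCH_ENABLE_CUDA)
  return "device-library sort";
#elif defined(SKETCH_ENABLE_HIP)
  return "host-copy sort";
#else
  return "host sort";
#endif
}
```

Built with neither macro defined, `SortBackend()` returns `"host sort"`. Compiling with `-DSKETCH_ENABLE_HIP` selects the middle branch instead, as long as `SKETCH_ENABLE_CUDA` is not also defined, since the first matching branch wins.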
