Coalesced Buffer Communication #1192

Open
wants to merge 117 commits into base: develop
Changes from 13 commits
117 commits
539ba5f
start on combined communication
lroberts36 Oct 17, 2024
d1c1274
just reuse BndInfo
lroberts36 Oct 17, 2024
e42ee36
partial
lroberts36 Oct 17, 2024
40a0a02
cleanup serialization, decouple
lroberts36 Oct 18, 2024
00ce27b
missed on last commit
lroberts36 Oct 18, 2024
64d655e
fix bug
lroberts36 Oct 19, 2024
c3ddf52
Actually set up to do communication
lroberts36 Oct 19, 2024
74f9c33
actually add the communication
lroberts36 Oct 19, 2024
ee04547
split into cpp file
lroberts36 Oct 19, 2024
cc14c89
format
lroberts36 Oct 19, 2024
295e8a3
working mpi communication
lroberts36 Oct 19, 2024
d8fbd65
pull out and store buffers
lroberts36 Oct 19, 2024
566f36d
fix serial builds
lroberts36 Oct 19, 2024
8a9a1bc
be a little more careful
lroberts36 Oct 21, 2024
75667ab
Set things up for communication
lroberts36 Oct 21, 2024
d56bc78
Make functions avilable on device
lroberts36 Oct 21, 2024
a35fccf
Add untested PackAndSend
lroberts36 Oct 21, 2024
6436fdc
Add receive and unpack
lroberts36 Oct 21, 2024
5fd8de5
Receive everything
lroberts36 Oct 21, 2024
95db032
compiles
lroberts36 Oct 21, 2024
9c4010f
small name change
lroberts36 Oct 21, 2024
b4efb3d
segfault
lroberts36 Oct 21, 2024
995913e
correctly point to send buffers
lroberts36 Oct 21, 2024
160c77f
allow explicit staling of send buffers
lroberts36 Oct 21, 2024
6355185
taking a few steps
lroberts36 Oct 22, 2024
12df3b6
switch to reference symantics
lroberts36 Oct 22, 2024
b55659c
remove print statements
lroberts36 Oct 22, 2024
1be047d
clear the combined buffers after remesh
lroberts36 Oct 22, 2024
1b405dc
some other debugging stuff
lroberts36 Oct 22, 2024
7f5b944
fix bug
lroberts36 Oct 22, 2024
b0dd208
format and lint
lroberts36 Oct 22, 2024
d8ae6e8
small
lroberts36 Oct 22, 2024
d0d0194
small part 2
lroberts36 Oct 22, 2024
6bafb0a
small part 3
lroberts36 Oct 22, 2024
07f62e2
format
lroberts36 Oct 22, 2024
809dcb1
Fix on vista
Oct 23, 2024
43c376b
format and lint
lroberts36 Oct 23, 2024
505426d
Update src/bvals/comms/combined_buffers.hpp
lroberts36 Oct 23, 2024
e60b82c
Update src/bvals/comms/combined_buffers.cpp
lroberts36 Oct 23, 2024
4d49ce5
use separate comm
lroberts36 Oct 23, 2024
ee58f4d
save other communication mechanism
lroberts36 Oct 24, 2024
59c8680
add a barrier at the end of ReceiveBoundBufs
lroberts36 Oct 24, 2024
dc54426
fix reallocation issue
lroberts36 Oct 24, 2024
b823b74
Make things work with AMR and flux correction
lroberts36 Oct 24, 2024
5649581
pre check small buffers for staleness
lroberts36 Oct 24, 2024
17f25e0
format and lint
lroberts36 Oct 24, 2024
cff847b
Add some debugging code
lroberts36 Oct 30, 2024
8cc982c
implement a number of different receive strategies and use issend
lroberts36 Oct 30, 2024
72ec3ac
format and lint
lroberts36 Oct 30, 2024
0ff61c0
remove extra iterations
lroberts36 Oct 30, 2024
f5f7bba
remove MPI_BARRIER
lroberts36 Oct 30, 2024
7c320cb
clear message buffer
lroberts36 Oct 31, 2024
20e9765
remove unused stuff
lroberts36 Oct 31, 2024
977b2a3
working side by side but not using new stuff
lroberts36 Nov 1, 2024
fe8a2af
working with new split
lroberts36 Nov 1, 2024
2e35467
removed extra junk
lroberts36 Nov 1, 2024
7c80de1
format and lint
lroberts36 Nov 1, 2024
ec10ab2
add line
lroberts36 Nov 1, 2024
675895e
compile w/o mpi
lroberts36 Nov 1, 2024
9bb4747
remov mesh passing
lroberts36 Nov 1, 2024
cef604f
format and lint
lroberts36 Nov 1, 2024
655dc39
fix non-mpi compilation
lroberts36 Nov 1, 2024
d786b86
start working to pass around var ids
lroberts36 Nov 1, 2024
ece20a3
pass MeshData
lroberts36 Nov 1, 2024
4116362
almost there...
lroberts36 Nov 1, 2024
25c2c51
use the MeshData uids
lroberts36 Nov 1, 2024
d7ba65d
Working with subsets, no cacheing
lroberts36 Nov 1, 2024
91771ad
format
lroberts36 Nov 1, 2024
5cddc0d
start on cacheing
lroberts36 Nov 4, 2024
207c2dc
include allocation status in output
lroberts36 Nov 4, 2024
ac01d92
sparse maybe working
lroberts36 Nov 4, 2024
fa53f72
add comm switch
lroberts36 Nov 4, 2024
24aa55f
fix logic
lroberts36 Nov 4, 2024
f72ee8f
format and lint
lroberts36 Nov 4, 2024
a194eba
Check that send buffers are completed before deleting
lroberts36 Nov 5, 2024
3b21bbc
Merge branch 'develop' into lroberts36/add-combined-buffer-communication
lroberts36 Nov 12, 2024
8670107
don't start comms
lroberts36 Nov 13, 2024
f2d8c0c
some more stuff that doesn't work
lroberts36 Nov 13, 2024
deeccf4
small
lroberts36 Nov 13, 2024
c22cf96
regular Isend
lroberts36 Nov 13, 2024
d5d4328
don't require all received
lroberts36 Nov 13, 2024
255e85a
format
lroberts36 Nov 14, 2024
3941117
Add documentation
lroberts36 Nov 14, 2024
9159c93
Merge branch 'develop' into lroberts36/add-combined-buffer-communication
lroberts36 Nov 14, 2024
52778c5
small doc
lroberts36 Nov 15, 2024
451b246
rename to coalesced
lroberts36 Nov 15, 2024
43c4efa
split things up
lroberts36 Nov 15, 2024
5ff9c60
comment
lroberts36 Nov 15, 2024
3adb853
some renaming
lroberts36 Nov 15, 2024
06231e1
rename
lroberts36 Nov 15, 2024
b14432f
reanme again
lroberts36 Nov 15, 2024
24a9733
one line to fix everything
lroberts36 Nov 18, 2024
1e5e4d4
format
lroberts36 Nov 18, 2024
1f1c5f3
default to coalesced comms
lroberts36 Nov 18, 2024
0c3131e
cache different var sets
lroberts36 Nov 18, 2024
08d9c13
allocate combined buffers only as needed
lroberts36 Nov 19, 2024
a694074
changelog
lroberts36 Nov 19, 2024
3353b52
small
lroberts36 Nov 19, 2024
408ecb3
copyright year
lroberts36 Nov 19, 2024
06bd5d3
fix a couple of things
lroberts36 Nov 19, 2024
0bd053e
remove unused
lroberts36 Nov 19, 2024
0183590
remove comment
lroberts36 Nov 19, 2024
975ce4a
oops
lroberts36 Nov 19, 2024
867af0a
skip non-communicated variables
lroberts36 Nov 21, 2024
b573337
Merge branch 'develop' into lroberts36/add-combined-buffer-communication
lroberts36 Nov 25, 2024
5648a82
Update doc/sphinx/src/boundary_communication.rst
lroberts36 Nov 25, 2024
1bcfdba
Update doc/sphinx/src/boundary_communication.rst
lroberts36 Nov 25, 2024
7edde1e
address brryan comment
lroberts36 Nov 25, 2024
5fefb71
doc at brryan's suggestion
lroberts36 Nov 25, 2024
9ae28c8
remove commented outlines
lroberts36 Nov 25, 2024
e95f691
do view of views correctly for Kokkos 4.5.1
lroberts36 Nov 26, 2024
c67a5fb
act on a bunch of small comments
lroberts36 Nov 27, 2024
6aa7dcd
move functions and add max_iters to clear
lroberts36 Nov 27, 2024
0f3795d
fix buffer bugs?
lroberts36 Nov 28, 2024
22ecb2e
no need to check if not doing coalesced
lroberts36 Nov 28, 2024
822a859
Remove deprecated note
lroberts36 Nov 28, 2024
8739373
Merge branch 'develop' into lroberts36/add-combined-buffer-communication
Yurlungur Dec 2, 2024
2 changes: 2 additions & 0 deletions src/CMakeLists.txt
@@ -101,6 +101,8 @@ add_library(parthenon
bvals/comms/bnd_info.cpp
bvals/comms/bnd_info.hpp
bvals/comms/boundary_communication.cpp
bvals/comms/combined_buffers.cpp
bvals/comms/combined_buffers.hpp
bvals/comms/tag_map.cpp
bvals/comms/tag_map.hpp

25 changes: 22 additions & 3 deletions src/bvals/comms/bnd_info.cpp
@@ -24,6 +24,7 @@

#include "basic_types.hpp"
#include "bvals/comms/bnd_info.hpp"
#include "bvals/comms/bvals_utils.hpp"
#include "bvals/neighbor_block.hpp"
#include "config.hpp"
#include "globals.hpp"
@@ -251,7 +252,7 @@ CalcIndices(const NeighborBlock &nb, MeshBlock *pmb,
{s[2], e[2]}, {s[1], e[1]}, {s[0], e[0]});
}

int GetBufferSize(MeshBlock *pmb, const NeighborBlock &nb,
int GetBufferSize(const MeshBlock *const pmb, const NeighborBlock &nb,
std::shared_ptr<Variable<Real>> v) {
// This does not do a careful job of calculating the buffer size, in many
// cases there will be some extra storage that is not required, but there
@@ -277,7 +278,7 @@ BndInfo::BndInfo(MeshBlock *pmb, const NeighborBlock &nb,
allocated = v->IsAllocated();
alloc_status = v->GetAllocationStatus();

buf = combuf->buffer();
if (combuf != nullptr) buf = combuf->buffer();
same_to_same = pmb->gid == nb.gid && nb.offsets.IsCell();
lcoord_trans = nb.lcoord_trans;
if (!allocated) return;
Collaborator

Why can/should we go past this point now?
Was this related to the bug with the buffer size being 0 on first pass?

Collaborator Author

Previously, I just bailed here because there was no point in doing the extra index range calculations. As you note, this is what caused the bug where the buffer size was 0 on the first pass. Removing it shouldn't change any behavior in pre-existing code, and I doubt it had any noticeable performance impact.
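
As a rough illustration of the reply above, here is a minimal sketch of how returning before the index ranges are computed makes a later size query report zero for an unallocated variable. The names `Range`, `Info`, and `MakeInfo` are hypothetical stand-ins invented for this example; they are not Parthenon types, and the range bounds are arbitrary.

```cpp
#include <cassert>

// Hypothetical stand-in for an index range (not a Parthenon type).
struct Range {
  int s = 0, e = -1;  // inclusive bounds; the default range is empty
  int size() const { return e - s + 1; }
};

// Hypothetical stand-in for a boundary-info record.
struct Info {
  bool allocated = true;
  Range idxer;
  int size() const { return idxer.size(); }
};

// Builds an Info. With bail_if_unallocated set, an unallocated variable
// returns before its index range is filled in, so size() stays 0.
Info MakeInfo(bool allocated, bool bail_if_unallocated) {
  Info out;
  out.allocated = allocated;
  if (bail_if_unallocated && !allocated) return out;  // early return
  out.idxer = Range{0, 15};  // arbitrary example range of 16 cells
  return out;
}
```

Under these assumptions, `MakeInfo(false, true).size()` is 0 while `MakeInfo(false, false).size()` is 16, which is why a size computed from the index ranges needs the ranges to be set even when the variable is not allocated.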

@@ -305,6 +306,24 @@ BndInfo::BndInfo(MeshBlock *pmb, const NeighborBlock &nb,
}
}

BndId BndId::GetSend(MeshBlock *pmb, const NeighborBlock &nb,
std::shared_ptr<Variable<Real>> v, BoundaryType b_type,
int partition, int start_idx) {
auto [send_gid, recv_gid, vlabel, loc, extra_id] = SendKey(pmb, nb, v, b_type);
BndId out;
out.send_gid() = send_gid;
out.recv_gid() = recv_gid;
out.loc_idx() = loc;
out.var_id() = v->GetUniqueID();
out.extra_id() = extra_id;
out.rank_send() = Globals::my_rank;
out.rank_recv() = nb.rank;
out.partition() = partition;
out.size() = BndInfo::GetSendBndInfo(pmb, nb, v, nullptr).size();
out.start_idx() = start_idx;
return out;
}

BndInfo BndInfo::GetSendBndInfo(MeshBlock *pmb, const NeighborBlock &nb,
std::shared_ptr<Variable<Real>> v,
CommBuffer<buf_pool_t<Real>::owner_t> *buf) {
@@ -326,7 +345,7 @@ BndInfo BndInfo::GetSetBndInfo(MeshBlock *pmb, const NeighborBlock &nb,
if (nb.offsets.IsCell()) idx_range_type = IndexRangeType::InteriorRecv;
BndInfo out(pmb, nb, v, buf, idx_range_type);

auto buf_state = buf->GetState();
auto buf_state = buf != nullptr ? buf->GetState() : BufferState::received;
if (buf_state == BufferState::received) {
out.buf_allocated = true;
} else if (buf_state == BufferState::received_null) {
56 changes: 55 additions & 1 deletion src/bvals/comms/bnd_info.hpp
@@ -48,13 +48,67 @@ enum class IndexRangeType {
InteriorRecv
};

struct BndId {
constexpr static std::size_t NDAT = 10;
int data[NDAT];

// Information for identifying the buffer with a communication
// channel, variable, and the ranks it is communicated across
int &send_gid() { return data[0]; }
int &recv_gid() { return data[1]; }
int &loc_idx() { return data[2]; }
int &var_id() { return data[3]; }
int &extra_id() { return data[4]; }
int &rank_send() { return data[5]; }
int &rank_recv() { return data[6]; }
BoundaryType bound_type;

// MeshData partition id of the *sender*
// not set by constructors and only necessary for coalesced comms
int &partition() { return data[7]; }
int &size() { return data[8]; }
int &start_idx() { return data[9]; }

CommBuffer<buf_pool_t<Real>::weak_t> buf; // comm buffer from pool

KOKKOS_DEFAULTED_FUNCTION
BndId() = default;
KOKKOS_DEFAULTED_FUNCTION
BndId(const BndId &) = default;

explicit BndId(const int *const data_in) {
for (int i = 0; i < NDAT; ++i) {
data[i] = data_in[i];
}
}

void Serialize(int *data_out) {
for (int i = 0; i < NDAT; ++i) {
data_out[i] = data[i];
}
}

static BndId GetSend(MeshBlock *pmb, const NeighborBlock &nb,
std::shared_ptr<Variable<Real>> v, BoundaryType b_type,
int partition, int start_idx);
};

struct BndInfo {
int ntopological_elements = 1;
using TE = TopologicalElement;
TE topo_idx[3]{TE::CC, TE::CC, TE::CC};
SpatiallyMaskedIndexer6D idxer[3];
forest::LogicalCoordinateTransformation lcoord_trans;

KOKKOS_FORCEINLINE_FUNCTION
int size() const {
int s = 0;
for (int n = 0; n < ntopological_elements; ++n) {
s += idxer[n].size();
}
return s;
}

CoordinateDirection dir{CoordinateDirection::X0DIR};
bool allocated = true;
bool buf_allocated = true;
@@ -124,7 +178,7 @@ struct ProResInfo {
std::shared_ptr<Variable<Real>> v);
};

int GetBufferSize(MeshBlock *pmb, const NeighborBlock &nb,
int GetBufferSize(const MeshBlock *const pmb, const NeighborBlock &nb,
std::shared_ptr<Variable<Real>> v);

using BndInfoArr_t = ParArray1D<BndInfo>;
9 changes: 9 additions & 0 deletions src/bvals/comms/boundary_communication.cpp
@@ -138,6 +138,15 @@ TaskStatus SendBoundBufs(std::shared_ptr<MeshData<Real>> &md) {
sending_nonzero_flags(b) = non_zero[0] || non_zero[1] || non_zero[2];
});
});
// 1. Parallel scan per rank to get the starting indices of the buffers

// 2. Check the size of the buffer (how do you do this without extra DtoH call?) and
// possibly allocate more storage
// [Alternatively could just allocate to maximal size initially]

// 3. Pack the combined buffers

// 4. Send the combined buffers

// Send buffers
if (Globals::sparse_config.enabled)
10 changes: 9 additions & 1 deletion src/bvals/comms/build_boundary_buffers.cpp
@@ -27,6 +27,7 @@

#include "bvals_in_one.hpp"
#include "bvals_utils.hpp"
#include "combined_buffers.hpp"
#include "config.hpp"
#include "globals.hpp"
#include "interface/variable.hpp"
@@ -110,19 +111,25 @@ void BuildBoundaryBufferSubset(std::shared_ptr<MeshData<Real>> &md,
tag = pmesh->tag_map.GetTag(pmb, nb);
auto comm_label = v->label();
mpi_comm_t comm = pmesh->GetMPIComm(comm_label);

#else
// Setting to zero is fine here since this doesn't actually get used when everything
// is on the same rank
mpi_comm_t comm = 0;
#endif

bool use_sparse_buffers = v->IsSet(Metadata::Sparse);
auto get_resource_method = [pmesh, buf_size]() {
auto get_resource_method = [pmesh, buf_size](int size) {
PARTHENON_REQUIRE(size <= buf_size,
"Asking for a buffer that is larger than size of pool.");
return buf_pool_t<Real>::owner_t(pmesh->pool_map.at(buf_size).Get());
};

// Build send buffer (unless this is a receiving flux boundary)
if constexpr (IsSender(BTYPE)) {
// Register this buffer with the combined buffers
if (receiver_rank != sender_rank)
pmesh->pcombined_buffers->AddSendBuffer(md->partition, pmb, nb, v, BTYPE);
auto s_key = SendKey(pmb, nb, v, BTYPE);
if (buf_map.count(s_key) == 0)
buf_map[s_key] = CommBuffer<buf_pool_t<Real>::owner_t>(
@@ -133,6 +140,7 @@ void BuildBoundaryBufferSubset(std::shared_ptr<MeshData<Real>> &md,
// Also build the non-local receive buffers here
if constexpr (IsReceiver(BTYPE)) {
if (sender_rank != receiver_rank) {
pmesh->pcombined_buffers->AddRecvBuffer(pmb, nb, v, BTYPE);
auto r_key = ReceiveKey(pmb, nb, v, BTYPE);
if (buf_map.count(r_key) == 0)
buf_map[r_key] = CommBuffer<buf_pool_t<Real>::owner_t>(
5 changes: 5 additions & 0 deletions src/bvals/comms/bvals_utils.hpp
@@ -66,6 +66,11 @@ inline Mesh::channel_key_t ReceiveKey(const MeshBlock *pmb, const NeighborBlock
return {sender_id, receiver_id, pcv->label(), location_idx, other};
}

inline Mesh::channel_key_t GetChannelKey(BndId &in) {
return {in.send_gid(), in.recv_gid(), Variable<Real>::GetLabel(in.var_id()),
in.loc_idx(), in.extra_id()};
}

// Build a vector of pointers to all of the sending or receiving communication buffers on
// MeshData md. This cache is important for performance, since this elides a map look up
// for the buffer every time the bvals code iterates over boundaries.
128 changes: 128 additions & 0 deletions src/bvals/comms/combined_buffers.cpp
@@ -0,0 +1,128 @@
//========================================================================================
// (C) (or copyright) 2020-2024. Triad National Security, LLC. All rights reserved.
//
// This program was produced under U.S. Government contract 89233218CNA000001 for Los
// Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC
// for the U.S. Department of Energy/National Nuclear Security Administration. All rights
// in the program are reserved by Triad National Security, LLC, and the U.S. Department
// of Energy/National Nuclear Security Administration. The Government is granted for
// itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide
// license in this material to reproduce, prepare derivative works, distribute copies to
// the public, perform publicly and display publicly, and to permit others to do so.
//========================================================================================
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

#include "basic_types.hpp"
#include "bvals/comms/bvals_utils.hpp"
#include "bvals/comms/combined_buffers.hpp"
#include "bvals/neighbor_block.hpp"
#include "coordinates/coordinates.hpp"
#include "interface/variable.hpp"
#include "mesh/mesh.hpp"
#include "mesh/meshblock.hpp"
#include "utils/communication_buffer.hpp"

namespace parthenon {

CombinedBuffersRank::CombinedBuffersRank(int o_rank, BoundaryType b_type, bool send)
: other_rank(o_rank), sender(send), buffers_built(false) {
if (sender) {
message = com_buf_t(1234, Globals::my_rank, other_rank, comm_,
[](int size) { return std::vector<int>(size); });
} else {
message = com_buf_t(
1234, other_rank, Globals::my_rank, comm_,
[](int size) { return std::vector<int>(size); }, true);
}
PARTHENON_REQUIRE(other_rank != Globals::my_rank, "Should only build for other ranks.");
}

void CombinedBuffersRank::AddSendBuffer(int partition, MeshBlock *pmb,
const NeighborBlock &nb,
const std::shared_ptr<Variable<Real>> &var,
BoundaryType b_type) {
if (current_size.count(partition) == 0) current_size[partition] = 0;
auto &cur_size = current_size[partition];
combined_info[partition].push_back(
BndId::GetSend(pmb, nb, var, b_type, partition, cur_size));
cur_size += combined_info[partition].back().size();
}

bool CombinedBuffersRank::TryReceiveBufInfo(Mesh *pmesh) {
PARTHENON_REQUIRE(!sender, "Trying to receive on a combined sender.");
if (buffers_built) return buffers_built;

bool received = message.TryReceive();
if (received) {
auto &mess_buf = message.buffer();
int npartitions = mess_buf[0];
// Unpack into per combined buffer information
int idx{nglobal};
for (int p = 0; p < npartitions; ++p) {
const int partition = mess_buf[idx++];
const int nbuf = mess_buf[idx++];
const int total_size = mess_buf[idx++];
combined_buffers[partition] = buf_t("combined recv buffer", total_size);
auto &cr_info = combined_info[partition];
for (int b = 0; b < nbuf; ++b) {
cr_info.emplace_back(&(mess_buf[idx]));
auto &buf = cr_info.back();
// Store the buffer
buf.buf = pmesh->boundary_comm_map[GetChannelKey(buf)];
idx += BndId::NDAT;
}
}
message.Stale();
buffers_built = true;
return true;
}
return false;
}

void CombinedBuffersRank::ResolveSendBuffersAndSendInfo(Mesh *pmesh) {
// First calculate the total size of the message
int total_buffers{0};
for (auto &[partition, buf_struct_vec] : combined_info)
total_buffers += buf_struct_vec.size();
int total_partitions = combined_info.size();

int mesg_size = nglobal + nper_part * total_partitions + BndId::NDAT * total_buffers;
message.Allocate(mesg_size);

auto &mess_buf = message.buffer();
mess_buf[0] = total_partitions;

// Pack the data
int idx{nglobal};
for (auto &[partition, buf_struct_vec] : combined_info) {
mess_buf[idx++] = partition; // Used as the comm tag
mess_buf[idx++] = buf_struct_vec.size(); // Number of buffers
mess_buf[idx++] = current_size[partition]; // combined size of buffers
for (auto &buf_struct : buf_struct_vec) {
buf_struct.Serialize(&(mess_buf[idx]));
buf_struct.buf = pmesh->boundary_comm_map[GetChannelKey(buf_struct)];
idx += BndId::NDAT;
}
}

message.Send();

// Allocate the combined buffers
int total_size{0};
for (auto &[partition, size] : current_size)
total_size += size;

buf_t alloc_at_once("shared combined buffer", total_size);
int current_position{0};
for (auto &[partition, size] : current_size) {
combined_buffers[partition] =
buf_t(alloc_at_once, std::make_pair(current_position, current_position + size));
current_position += size;
}
buffers_built = true;
}
} // namespace parthenon
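
The integer message exchanged by `ResolveSendBuffersAndSendInfo` and `TryReceiveBufInfo` above appears to have the layout: a global header, then for each partition a small per-partition header followed by one fixed-size record per buffer. The following is a standalone sketch of that layout as a round trip, not the PR's implementation. It assumes `nglobal = 1` and `nper_part = 3` (the header that defines them is not part of this diff; the values are inferred from how the message is packed), and `Record`, `Layout`, `Pack`, and `Unpack` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Assumed constants: inferred from the packing code, not taken from the header.
constexpr int kNGlobal = 1;   // global header: number of partitions
constexpr int kNPerPart = 3;  // partition id, buffer count, combined size
constexpr int kNDat = 10;     // matches BndId::NDAT in the diff

// Hypothetical stand-in for the integer payload of a BndId.
struct Record {
  int data[kNDat];
};

using Layout = std::map<int, std::vector<Record>>;  // partition -> records

// Flatten per-partition records into one integer message.
std::vector<int> Pack(const Layout &info, const std::map<int, int> &sizes) {
  std::size_t nbuf = 0;
  for (const auto &kv : info) nbuf += kv.second.size();
  std::vector<int> msg(kNGlobal + kNPerPart * info.size() + kNDat * nbuf);
  msg[0] = static_cast<int>(info.size());
  std::size_t idx = kNGlobal;
  for (const auto &[part, recs] : info) {
    msg[idx++] = part;                            // partition id
    msg[idx++] = static_cast<int>(recs.size());   // number of buffers
    msg[idx++] = sizes.at(part);                  // combined buffer size
    for (const auto &r : recs) {
      for (int i = 0; i < kNDat; ++i) msg[idx + i] = r.data[i];
      idx += kNDat;
    }
  }
  return msg;
}

// Inverse of Pack: rebuild the records and the per-partition sizes.
Layout Unpack(const std::vector<int> &msg, std::map<int, int> *sizes) {
  Layout out;
  const int nparts = msg[0];
  std::size_t idx = kNGlobal;
  for (int p = 0; p < nparts; ++p) {
    const int part = msg[idx++];
    const int nbuf = msg[idx++];
    (*sizes)[part] = msg[idx++];
    auto &recs = out[part];  // creates the entry even when nbuf == 0
    for (int b = 0; b < nbuf; ++b) {
      Record r;
      for (int i = 0; i < kNDat; ++i) r.data[i] = msg[idx + i];
      recs.push_back(r);
      idx += kNDat;
    }
  }
  return out;
}
```

With two partitions holding two and one records respectively, the packed message has 1 + 3·2 + 10·3 = 37 integers under these assumed constants, which matches the `mesg_size` formula in the diff.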