-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TL/UCP: reduce_scatter_ring #413
TL/UCP: reduce_scatter_ring #413
Conversation
This is part of #361. It contains only reduce_scatter ring changes (and some utils that are required). @Sergei-Lebedev @bureddy @manjugv plz review. |
{ | ||
int polls = 0; | ||
while (polls++ < task->n_polls) { | ||
if (task->send_posted - task->send_completed <= 1 && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use ( ) around "task->send_posted - task->send_completed"
why it is ok one send to not complete but return UCC_OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To reduce sync/delays the algorithm uses 2 scratch buffers for incoming data (and to do reduction). They alternate each time. I don't care about send completions at all, i only need to make sure i have at least 1 free scratch to post next recv, ie number of outstanding sends must be < 2 (0 - both scratches free, 1 - only one).
ucc_rank_t max_rank; | ||
|
||
if (ucc_ep_map_is_reverse(&map)) { | ||
inv = map; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the function intention is to inverse the given map and here your seting the same map to inv. isn't possible to call inverse only if the map is not reverse?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THe intent was to keep the client (tl/ucp) code clean. In case of bidirectional alg, tl/ucp ring always allocates just 2 rings 1 default + 1 reverse. I need to have inverse map for all rings to implement the alg. In this case when my ring are straight forward i could just omit call to inverse in tl/ucp since it evaluates to the original ring. However, when we had custom rings used in the same alg (example was HCM, where rings would be topo dependent) then inverse compute is necessary. But the code stays the same, we can just pass a new "map" to the alg.
schedule = &tl_schedule->super.super; | ||
/* if count == size then we have 1 elem per rank, not enough | ||
to split into 2 sets */ | ||
n_subsets = (bidir && (count > size)) ? 2 : 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this condition work for inplace? i think should divide by size first in this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Allow running test_mpi reduce_scatter when count % comm_size != 0
3e6851e
to
d1e2d96
Compare
@Sergei-Lebedev @bureddy fixed scratch race by using custom callbacks as Sergey recommended in WG. |
* TL/UCP: tl_ucp_schedule_t * UTIL: ep_map reverse * UTIL: buffer devision into blocks * TL/UCP: reduce_scatter_ring * TEST: reduce_scatter in gtest * TEST: mpi reduce_scatter fix Allow running test_mpi reduce_scatter when count % comm_size != 0 * TL/UCP: rs ring 2 cb * REVIEW: address comments
What
Adds implementation of bi- uni- directional ring reduce_scatterin TL/UCP
Why ?
Core API. Also used as part of 3-stage allreduce.