ch4/ofi: use cq_data to carry message size #7138

Merged 4 commits on Nov 6, 2024

@hzhou hzhou (Contributor) commented Sep 11, 2024

Pull Request Description

Following #7192, if the provider supports 64-bit cq_data, we can carry the data_sz in the cq_data (for sizes up to 4GB). This saves a round trip when probing.
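
As a rough illustration of the idea, the sketch below packs a message size into the upper half of a 64-bit cq_data value and recovers it on the receiving side. The bit layout (upper 32 bits for data_sz, lower 32 bits for other metadata) and the helper names `cq_data_pack`, `cq_data_get_size`, and `cq_data_get_low` are assumptions made for this example; they are not taken from the PR and the actual layout in ch4/ofi may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout (an assumption, not the PR's actual encoding):
 *   bits 63..32  data_sz (hence the 4GB limit)
 *   bits 31..0   whatever metadata cq_data already carried */
#define CQ_DATA_SZ_SHIFT 32
#define CQ_DATA_LOW_MASK UINT64_C(0xFFFFFFFF)

/* Combine the existing low 32 bits with the message size. */
static uint64_t cq_data_pack(uint32_t low_bits, uint64_t data_sz)
{
    /* Sizes that do not fit in 32 bits cannot be carried this way. */
    assert(data_sz <= CQ_DATA_LOW_MASK);
    return (data_sz << CQ_DATA_SZ_SHIFT) | (uint64_t) low_bits;
}

/* Recover the message size from a received cq_data value. */
static uint64_t cq_data_get_size(uint64_t cq_data)
{
    return cq_data >> CQ_DATA_SZ_SHIFT;
}

/* Recover the original low 32 bits. */
static uint32_t cq_data_get_low(uint64_t cq_data)
{
    return (uint32_t) (cq_data & CQ_DATA_LOW_MASK);
}
```

With an encoding like this, a probe that sees the completion entry for an unexpected message could read the size directly from the entry instead of asking the sender for it.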

Diagrams

image

[skip warnings]

TODOs

  • Implement in netmod/ofi - am-only
  • Implement in netmod/ucx - am-only
  • Benchmark
  • Large message and ssend to use am RNDV (tricky)
  • Remove or replace ofi huge send
  • Remove or replace ofi pipeline send

Author Checklist

  • Provide Description
    Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
  • Commits Follow Good Practice
    Commits are self-contained and do not do two things at once.
    Commit message is of the form: module: short description
    Commit message explains what's in the commit.
  • Passes All Tests
    Whitespace checker. Warnings test. Additional tests via comments.
  • Contribution Agreement
    For non-Argonne authors, check contribution agreement.
    If necessary, request an explicit comment from your company's PR approval manager.

@hzhou hzhou force-pushed the 2409_am_data branch 5 times, most recently from 67978a2 to 5af8c7c, October 21, 2024 02:01
@hzhou hzhou force-pushed the 2409_am_data branch 2 times, most recently from f2a8b88 to fd200e6, October 30, 2024 15:13
@hzhou hzhou marked this pull request as ready for review October 30, 2024 15:14
@hzhou hzhou changed the title from "ch4: active message tag send (WIP)" to "ch4/ofi: use cq_data to carry message size" Oct 30, 2024
@hzhou hzhou added the 4.3.0b1 label Oct 30, 2024
@hzhou hzhou force-pushed the 2409_am_data branch 3 times, most recently from 6d7f873 to 6490589, November 1, 2024 01:55
This was a leftover when we refactored dynamic process handshake in
1e7e5ce.
It's an optional feature that we can take advantage of, e.g. to carry more
metadata.
If we have 8-byte cq_data, we can use the higher bits to carry data_sz.
This saves a round trip for MPI_{Probe,Mprobe,Iprobe,Improbe}.
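
A minimal sketch of the condition described above, assuming the size is limited to 32 bits (consistent with the "up to 4GB" note in the description). libfabric reports the provider's cq_data width in bytes through the `cq_data_size` field of `struct fi_domain_attr`; here that value is passed in as a plain parameter so the example stays self-contained. The helper name `can_carry_size_in_cq_data` is hypothetical and does not appear in the PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decide whether a send can put data_sz into cq_data.
 * cq_data_size is the provider's cq_data width in bytes.
 * The 32-bit size limit is an assumption based on the 4GB figure. */
static bool can_carry_size_in_cq_data(size_t cq_data_size, uint64_t data_sz)
{
    /* Need a full 8-byte cq_data to have spare upper bits. */
    if (cq_data_size < sizeof(uint64_t)) {
        return false;
    }
    /* The size itself must fit in the bits reserved for it. */
    return data_sz <= UINT32_MAX;
}
```

When the check fails, the sender would presumably fall back to the existing path, where a probe has to learn the size some other way.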
Temporarily turn this on for testing purposes.
@hzhou hzhou (Contributor, Author) commented Nov 5, 2024

test:mpich/ch4/most
test:mpich/ch3/most
test:mpich/ch4/ofi/more

@hzhou hzhou merged commit e38d557 into pmodels:main Nov 6, 2024
8 of 9 checks passed
@hzhou hzhou deleted the 2409_am_data branch November 6, 2024 14:44