ch4/ofi: refactor MPIDI_OFI_request_t #6895

Open
wants to merge 26 commits into
base: main

Commits on Feb 6, 2024

  1. misc: rename MPIR_gpu_req to MPIR_async_req

    MPIR_gpu_req is a union type for either an MPL_gpu_request or an
    MPIR_Typerep_req, so it is not just for GPU. This type could
    potentially be extended to include other internal async task handles,
    so we rename it to MPIR_async_req.
    
    We also establish the convention of naming the variable async_req.
    hzhou committed Feb 6, 2024
    7ab6c80
  2. misc: add MPIR_async_test

    Add an inline wrapper for testing MPIR_async_req.
    
    Modify the order of header inclusion due to the dependency on
    typerep_pre.h.
    hzhou committed Feb 6, 2024
    0cc53d6
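As a rough illustration of the two commits above, a handle that is "either a GPU request or a typerep request" can be modeled as a tagged union with one inline test wrapper. This is only a sketch: every `sketch_` name, the `kind` tag, and the fake polling counter are assumptions made for illustration and are not the actual MPICH definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two real handle types. */
typedef struct { int remaining_polls; } sketch_gpu_request;
typedef struct { int remaining_polls; } sketch_typerep_req;

typedef enum {
    SKETCH_ASYNC_NONE,
    SKETCH_ASYNC_GPU,
    SKETCH_ASYNC_TYPEREP
} sketch_async_kind;

/* One handle type that can carry either kind of outstanding async work. */
typedef struct {
    sketch_async_kind kind;
    union {
        sketch_gpu_request gpu_req;
        sketch_typerep_req y_req;
    } u;
} sketch_async_req;

/* Pretend to poll a backend: each call consumes one pending poll and
 * reports completion once none are left. */
static bool sketch_poll(int *remaining_polls)
{
    if (*remaining_polls > 0) {
        (*remaining_polls)--;
    }
    return *remaining_polls == 0;
}

/* Inline test wrapper: callers do not need to know which backend is used. */
static inline void sketch_async_test(sketch_async_req *async_req, bool *is_done)
{
    switch (async_req->kind) {
        case SKETCH_ASYNC_GPU:
            *is_done = sketch_poll(&async_req->u.gpu_req.remaining_polls);
            break;
        case SKETCH_ASYNC_TYPEREP:
            *is_done = sketch_poll(&async_req->u.y_req.remaining_polls);
            break;
        default:
            *is_done = true;    /* nothing outstanding */
            break;
    }
    if (*is_done) {
        async_req->kind = SKETCH_ASYNC_NONE;
    }
}
```

A caller would keep a `sketch_async_req async_req` next to its own state and call `sketch_async_test(&async_req, &is_done)` from its progress loop until `is_done` becomes true.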
  3. ac6aa4c
  4. ch4/ofi: refactor pipeline recv async copy

    Refactor the async copy in receive events using MPIR_async facilities.
    hzhou committed Feb 6, 2024
    b510cad
  5. ch4/ofi: refactor pipeline send async copy

    Refactor the async copy before sending a chunk.
    hzhou committed Feb 6, 2024
    6727ffc
  6. ch4/ofi: remove MPIDI_OFI_gpu_progress_task

    Both gpu_send_task_queue and gpu_recv_task_queue have been ported to
    the MPIR_async facilities.
    hzhou committed Feb 6, 2024
    6486c72

Commits on Feb 7, 2024

  1. ch4/ofi: refactor pipeline send

    Pipeline send allocates chunk buffers and then spawns async copies. The
    allocation may run out of genq buffers, so it is designed as an async
    task.
    
    The send copy is triggered upon completion of the buffer allocation, so
    it is renamed to spawn_send_copy and turned into an internal static
    function.
    
    This removes MPIDI_OFI_global.gpu_send_queue.
    hzhou committed Feb 7, 2024
    5b21c7c
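The pattern in the commit above (an allocation that may fail for lack of buffers, retried from the progress engine, with the copy spawned once a buffer is available) might look roughly like this. It is a minimal sketch under assumptions: the pool, the state struct, the return codes, and every `sketch_` name are hypothetical and do not reflect the real genq or MPIR_async interfaces.

```c
#include <assert.h>

enum { SKETCH_TASK_NOT_DONE = 0, SKETCH_TASK_DONE = 1 };

/* Hypothetical counter standing in for the genq buffer pool. */
typedef struct {
    int free_count;
} sketch_pool;

typedef struct {
    sketch_pool *pool;
    int chunks_total;      /* chunks this send is split into */
    int chunks_started;    /* chunks that got a buffer and had a copy spawned */
} sketch_send_state;

/* Stand-in for starting the async copy of one chunk into its buffer. */
static void sketch_spawn_send_copy(sketch_send_state *s)
{
    s->chunks_started++;
}

/* Poll function for the allocation task. It is called repeatedly by a
 * progress loop and makes as much progress as the pool allows. */
static int sketch_send_alloc_poll(sketch_send_state *s)
{
    while (s->chunks_started < s->chunks_total) {
        if (s->pool->free_count == 0) {
            /* Out of buffers: stay enqueued and retry on the next poll. */
            return SKETCH_TASK_NOT_DONE;
        }
        s->pool->free_count--;
        sketch_spawn_send_copy(s);
    }
    return SKETCH_TASK_DONE;
}
```

Because the task keeps its own position in `chunks_started`, no separate per-protocol send queue is needed; the generic progress loop only has to call the poll function again.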
  2. ch4/ofi: refactor pipeline recv

    Pipeline recv allocates chunk buffers and then posts fi_trecv. The
    allocation may run out of genq buffers, and we also control the number
    of outstanding recvs, so it is designed as async tasks.
    
    The async recv copy is triggered in the recv event when data arrives.
    
    This removes MPIDI_OFI_global.gpu_recv_queue.
    
    All ofi-layer progress routines for gpu pipelining are now removed.
    hzhou committed Feb 7, 2024
    3519d1d
  3. ch4/ofi: move gpu pipeline events into ofi_gpu_pipeline.c

    Consolidate the gpu pipeline code.
    
    MPIDI_OFI_gpu_pipeline_request is now an internal struct in
    ofi_gpu_pipeline.c, renamed to struct chunk_req.
    
    MPIDI_OFI_gpu_pipeline_recv_copy is now an internal function, renamed
    to start_recv_copy.
    hzhou committed Feb 7, 2024
    eb317db
  4. ch4/ofi: move all gpu pipeline code into ofi_gpu_pipeline.c

    Move all gpu pipeline specific code into ofi_gpu_pipeline.c.
    
    Add a new function MPIDI_OFI_gpu_pipeline_recv that fills rreq with
    persistent pipeline_info data. Rename the original
    MPIDI_OFI_gpu_pipeline_recv to the static function start_recv_chunk.
    hzhou committed Feb 7, 2024
    b910e10
  5. ch4/ofi: refactor pipeline_info into a union

    Make the code cleaner by separating the pipeline_info type into a union
    of send and recv.
    hzhou committed Feb 7, 2024
    c41c4be
  6. ch4/ofi: use explicit counters to track gpu pipeline

    Don't mix the usage of cc_ptr. Use separate, explicit counters to track
    the progress and completion of chunks.
    hzhou committed Feb 7, 2024
    1c4a9f0
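To make the idea of the commit above concrete, here is a small sketch of tracking chunk progress with dedicated counters instead of overloading a request's completion counter. The struct, its field names, and the helper functions are assumptions for illustration, not the fields used in the PR.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-message bookkeeping for a chunked transfer. */
typedef struct {
    int n_chunks;   /* total chunks in the message */
    int n_posted;   /* chunks handed to the network so far */
    int n_done;     /* chunks whose completion event has arrived */
} sketch_chunk_counters;

/* Record that one more chunk was posted. */
static void sketch_chunk_posted(sketch_chunk_counters *c)
{
    assert(c->n_posted < c->n_chunks);
    c->n_posted++;
}

/* Record one chunk completion. Returns true when it was the final one,
 * which is the point where the parent request can be completed. */
static bool sketch_chunk_done(sketch_chunk_counters *c)
{
    assert(c->n_done < c->n_posted);
    c->n_done++;
    return c->n_done == c->n_chunks;
}
```

With separate counters, "how many chunks are in flight" (`n_posted - n_done`) and "is the whole message finished" (`n_done == n_chunks`) are both explicit, and the request's own completion counter keeps a single meaning.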
  7. ch4/ofi: use internal tag for pipeline chunk match_bits

    Following a similar approach to nonblocking collectives, internal
    pipeline chunks use a separate tag space (MPIDI_OFI_GPU_PIPELINE_SEND)
    and incrementing tags to avoid mismatches with regular messages.
    hzhou committed Feb 7, 2024
    ffafce1
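One way to picture a separate tag space with incrementing tags is a reserved bit in the 64-bit match bits plus a wrapping counter. The sketch below assumes a made-up layout: the bit position, the tag width, and all `SKETCH_`/`sketch_` names are hypothetical and are not the actual MPICH match_bits encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for illustration only: low 20 bits carry a tag and
 * bit 62 marks internal pipeline-chunk traffic. */
#define SKETCH_TAG_BITS       20
#define SKETCH_TAG_MASK       ((UINT64_C(1) << SKETCH_TAG_BITS) - 1)
#define SKETCH_PIPELINE_FLAG  (UINT64_C(1) << 62)

typedef struct {
    uint32_t next_chunk_tag;   /* incremented for every chunk sent */
} sketch_tag_source;

/* Match bits for an internal chunk: flag bit plus the next tag. */
static uint64_t sketch_next_chunk_match_bits(sketch_tag_source *src)
{
    uint64_t tag = src->next_chunk_tag & SKETCH_TAG_MASK;
    /* Wrap the counter inside the tag field. */
    src->next_chunk_tag = (uint32_t) ((src->next_chunk_tag + 1) & SKETCH_TAG_MASK);
    return SKETCH_PIPELINE_FLAG | tag;
}

/* Match bits for a regular user message: the flag bit is never set. */
static uint64_t sketch_user_match_bits(uint32_t user_tag)
{
    return (uint64_t) user_tag & SKETCH_TAG_MASK;
}
```

Since user messages never set the reserved bit, a chunk cannot match a regular receive even when the numeric tags coincide, and consecutive chunks get distinct match bits.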

Commits on Feb 8, 2024

  1. ch4/ofi: refactor gpu pipeline recv_alloc

    Separate the recv tasks for the initial header and for the chunks,
    since the code paths clearly separate them.
    
    Use a single async item for all chunk recvs rather than unnecessarily
    enqueuing individual chunks since we can track the chunks in the state.
    hzhou committed Feb 8, 2024
    4cc74c6
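The "single async item for all chunk recvs" idea can be sketched as one polled task whose state records how many chunk receives have been posted and completed, while respecting both a buffer pool and a cap on outstanding receives. All names, fields, and return codes below are assumptions for illustration; the real code posts fi_trecv where the comment indicates.

```c
#include <assert.h>

enum { SKETCH_NOT_DONE = 0, SKETCH_DONE = 1 };

/* Hypothetical state for one pipelined receive. */
typedef struct {
    int n_chunks;          /* total chunk recvs needed */
    int n_posted;          /* chunk recvs posted so far */
    int n_completed;       /* chunk recvs whose data has arrived */
    int max_outstanding;   /* cap on posted-but-incomplete recvs */
    int free_buffers;      /* stand-in for the chunk buffer pool */
} sketch_recv_state;

/* One task covers every chunk: each poll posts as many recvs as the
 * buffer pool and the outstanding limit currently allow. */
static int sketch_recv_chunks_poll(sketch_recv_state *s)
{
    while (s->n_posted < s->n_chunks) {
        int outstanding = s->n_posted - s->n_completed;
        if (outstanding >= s->max_outstanding || s->free_buffers == 0) {
            return SKETCH_NOT_DONE;    /* try again on the next poll */
        }
        s->free_buffers--;
        s->n_posted++;                 /* the real code would post the recv here */
    }
    return SKETCH_DONE;
}

/* Called from the recv event: frees an outstanding slot and, in this
 * simplified model, returns the buffer to the pool right away. */
static void sketch_recv_chunk_event(sketch_recv_state *s)
{
    assert(s->n_completed < s->n_posted);
    s->n_completed++;
    s->free_buffers++;
}
```

The task finishes once every chunk recv has been posted; completion of the message itself is then decided by the recv events, not by the posting task.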
  2. ch4/ofi: include ofi_impl.h in ofi_gpu_pipeline.c

    It is needed to compile under noinline configuration.
    hzhou committed Feb 8, 2024
    c24a4a9
  3. ch4/ofi: move some inline util functions

    Move these utility functions to ofi_impl.h since they are simple and
    non-specific. It also simplifies figuring out which file to include
    especially for .c files.
    hzhou committed Feb 8, 2024
    bdd903e

Commits on Feb 9, 2024

  1. ---- START HERE ----

    hzhou committed Feb 9, 2024
    207c5d3
  2. ch4/ofi: re-organize MPIDI_OFI_request_t noncontig union

    Separate non-intersecting paths into distinct union members to make the
    code more explicit.
    hzhou committed Feb 9, 2024
    fa962d7
  3. ch4/ofi: merge pipeline_info in MPIDI_OFI_request_t

    The pipeline code paths do not intersect with other paths, so
    pipeline_info can be part of the same union.
    hzhou committed Feb 9, 2024
    59dc0fb
  4. ch4/ofi: add huge_send in MPIDI_OFI_request_t union

    Rather than sharing the pack_buffer field in pack_send, which obfuscates
    the code, use a separate union member for the huge send path.
    hzhou committed Feb 9, 2024
    e8ab242
  5. ch4/ofi: move huge.remote_info in MPIDI_OFI_request_t

    It is part of the recv path, so it needs to be in the recv struct.
    
    The recv side may switch paths depending on the actual protocol used
    by the sender, so some of the fields from different paths need to live
    in the same struct and use a NULL sentinel to tell whether a certain
    path is in effect. Currently this includes remote_info for the
    huge_send protocol and pack_buffer for pack_recv. It's possible to have
    huge_recv+pack_recv.
    hzhou committed Feb 9, 2024
    a297352
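The layout described in the commits above can be pictured roughly as follows: send paths that never overlap each get their own union member, while the recv member keeps fields from several protocols side by side and uses NULL to mean "this path is not active". This is a sketch only. The member names `pack_send`, `huge_send`, `pack_buffer`, and `remote_info` come from the commit messages, but the `sketch_` types, the exact fields, and the helper functions are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { void *pack_buffer; } sketch_pack_send;   /* packed send path */
typedef struct { void *remote_key; } sketch_huge_send;    /* huge send path (field assumed) */

/* The recv side cannot know the sender's protocol in advance, so fields
 * from different recv paths live together and are NULL when unused. */
typedef struct {
    void *pack_buffer;   /* non-NULL only if data must be unpacked */
    void *remote_info;   /* non-NULL only if the sender used the huge protocol */
} sketch_recv;

typedef struct {
    union {
        sketch_pack_send pack_send;
        sketch_huge_send huge_send;
        sketch_recv recv;
    } u;
} sketch_request;

/* Reset both sentinels when a receive is set up. */
static void sketch_recv_init(sketch_request *req)
{
    req->u.recv.pack_buffer = NULL;
    req->u.recv.remote_info = NULL;
}

static bool sketch_recv_is_huge(const sketch_request *req)
{
    return req->u.recv.remote_info != NULL;
}

static bool sketch_recv_needs_unpack(const sketch_request *req)
{
    return req->u.recv.pack_buffer != NULL;
}
```

Both sentinels can be set at once, which corresponds to the huge_recv+pack_recv combination mentioned in the commit message.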
  6. ch4/ofi: move inject_buf to MPIDI_OFI_request_t.u

    The am emulated inject path does not intersect with any native paths,
    so it should be part of the big union.
    hzhou committed Feb 9, 2024
    fe6b73c
  7. ch4/ofi: move the util.iov into MPIDI_OFI_request_t.u

    The util.iov field is used by fi_trecvmsg with the FI_CLAIM flag, or by
    the huge recv path for threshold checking and fi_read.
    hzhou committed Feb 9, 2024
    ccd0b27
  8. ch4/ofi: refactor MPIDI_OFI_dispatch_function to a big switch

    The fast path of MPIDI_OFI_EVENT_SEND and MPIDI_OFI_EVENT_RECV has
    already been handled in MPIDI_OFI_dispatch_optimized. Thus it is not
    critical to mark likely/unlikely branches in
    MPIDI_OFI_dispatch_function. Just use a big switch for simplicity.
    hzhou committed Feb 9, 2024
    79ae1fe
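A dispatch function written as one flat switch, as the commit above describes, might be shaped like this. The sketch is hedged: only the SEND and RECV event names echo the commit message, while the other event ids, the request struct, and the handlers are hypothetical placeholders.

```c
#include <assert.h>

typedef enum {
    SKETCH_EVENT_SEND,
    SKETCH_EVENT_RECV,
    SKETCH_EVENT_SEND_HUGE,    /* hypothetical additional event ids */
    SKETCH_EVENT_RECV_PACK
} sketch_event_id;

typedef struct {
    sketch_event_id event_id;
    char handled_by;           /* records which handler ran, for the demo */
} sketch_request;

/* Placeholder handlers: each one only marks the request. */
static int sketch_send_event(sketch_request *req)      { req->handled_by = 's'; return 0; }
static int sketch_recv_event(sketch_request *req)      { req->handled_by = 'r'; return 0; }
static int sketch_send_huge_event(sketch_request *req) { req->handled_by = 'H'; return 0; }
static int sketch_recv_pack_event(sketch_request *req) { req->handled_by = 'P'; return 0; }

/* Slow-path dispatcher: no branch hints, every event id in one switch. */
static int sketch_dispatch_function(sketch_request *req)
{
    switch (req->event_id) {
        case SKETCH_EVENT_SEND:
            return sketch_send_event(req);
        case SKETCH_EVENT_RECV:
            return sketch_recv_event(req);
        case SKETCH_EVENT_SEND_HUGE:
            return sketch_send_huge_event(req);
        case SKETCH_EVENT_RECV_PACK:
            return sketch_recv_pack_event(req);
        default:
            return -1;         /* unknown event id */
    }
}
```

Since the common send and recv events are already handled on a separate optimized path, the flat switch trades nothing measurable for readability: adding an event is one new case.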
  9. ch4/ofi: assert for unexpected pipeline data

    There is no way to recover a pipelined message in the native ofi path if
    the receiver didn't expect it. We could always save the necessary
    information in all recv paths, but I am not sure we are willing to
    accept the overhead, which harms the small message latencies.
    
    Assert failure for now.
    
    A proper pipeline implementation needs to happen as active messages or
    even at the MPIR layer.
    hzhou committed Feb 9, 2024
    66bfcc6
  10. ch4/ofi: Remove a redundant assignment

    Remove a redundant assignment of *request.
    hzhou committed Feb 9, 2024
    647751e