Rationale
Today, the MPI library for Level Zero and CUDA can map device memory to the host. This is done for performance: for short messages, mapping the buffer to the host and calling a host memcpy is faster than copying via the L0 APIs.
Description
TBD
API Changes
Need to decide on the right API.
Implementation details
For Level Zero, the Level Zero IPC API can be used. Today MPI extracts a file descriptor from the IPC handle and uses the mmap function to map the memory to the host.
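A minimal sketch of that path is below. The helper name is illustrative, and it assumes a Linux driver where the file descriptor sits in the first bytes of the opaque ze_ipc_mem_handle_t; that layout is an implementation detail, not something the Level Zero specification guarantees.

```c
#include <string.h>
#include <sys/mman.h>
#include <level_zero/ze_api.h>

/* Hypothetical helper: map a Level Zero device allocation into the host
 * address space. Assumes the fd is stored at the start of the IPC handle
 * (a Linux driver detail, not part of the Level Zero spec). */
static void *map_ze_device_memory(ze_context_handle_t ctx,
                                  const void *device_ptr, size_t size)
{
    ze_ipc_mem_handle_t ipc_handle;
    if (zeMemGetIpcHandle(ctx, device_ptr, &ipc_handle) != ZE_RESULT_SUCCESS)
        return NULL;

    /* Extract the file descriptor from the opaque handle data. */
    int fd;
    memcpy(&fd, ipc_handle.data, sizeof(fd));

    /* Map the device allocation so the host can memcpy into it directly. */
    void *host_ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return (host_ptr == MAP_FAILED) ? NULL : host_ptr;
}
```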
For CUDA, the gdrcopy library is used.
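A minimal sketch of the CUDA path via the gdrcopy API follows. The helper name and the per-call pin/map/unmap sequence are illustrative only; a real MPI implementation would cache the pinned mapping per allocation, since pinning is expensive relative to a short-message copy.

```c
#include <stddef.h>
#include <cuda.h>
#include <gdrapi.h>

/* Hypothetical helper: write a short message into a CUDA device buffer
 * through a gdrcopy host mapping instead of a CUDA memcpy call. */
static int gdr_write_short_message(CUdeviceptr d_ptr, const void *src, size_t size)
{
    gdr_t g = gdr_open();                    /* open the gdrdrv device */
    if (g == NULL)
        return -1;

    /* Pin the GPU pages; real code must align d_ptr/size to the GPU page size. */
    gdr_mh_t mh;
    int rc = gdr_pin_buffer(g, d_ptr, size, 0, 0, &mh);
    if (rc == 0) {
        void *map_ptr = NULL;
        rc = gdr_map(g, mh, &map_ptr, size);           /* expose pages to the host */
        if (rc == 0) {
            gdr_copy_to_mapping(mh, map_ptr, src, size); /* plain host-side copy */
            gdr_unmap(g, mh, map_ptr, size);
        }
        gdr_unpin_buffer(g, mh);
    }
    gdr_close(g);
    return rc;
}
```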