Describe the bug
HG_Respond with an extra buffer causes libfabric to run out of rxm recv_entry.
Found while testing the DAOS project.
On the server side, if HG_Respond is called with an extra buffer after the client's RPC has timed out and the client has called HG_Cancel, the client does not handle the response, so no acknowledgment is sent back to the server.
The server posts a recv_expected to wait for the client ack, which consumes a recv_entry in ofi_rxm. Since the ack never arrives, these entries are not released; recv_entry eventually runs out and this hg_context can no longer post any recv.
Expected behavior: HG/NA should drop the recv_expected that is waiting for the ack, so that the recv_entry is freed. A timeout mechanism might be one way to do this.
//
In the meantime, we are adding a timeout detection mechanism to crt_reply to prevent this from happening.
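To make the timeout idea concrete, here is a rough sketch of what I mean. It does not use any Mercury, NA, or libfabric API: `ack_pool`, `ack_pool_post`, `ack_pool_complete`, `ack_pool_reap`, and `POOL_SIZE` are all hypothetical names, and the fixed-size array is only a stand-in for the limited set of rxm recv_entry slots. Each pending wait for an ack carries a deadline, and a periodic reap releases the waits whose ack never arrived.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the limited number of rxm recv_entry slots. */
#define POOL_SIZE 4

/* One pending expected recv that is waiting for the client's ack. */
struct ack_wait {
    bool     in_use;
    uint64_t deadline_ms; /* absolute time after which the wait is dropped */
};

struct ack_pool {
    struct ack_wait slots[POOL_SIZE];
};

static void ack_pool_init(struct ack_pool *p)
{
    assert(p != NULL);
    for (size_t i = 0; i < POOL_SIZE; i++) {
        p->slots[i].in_use = false;
        p->slots[i].deadline_ms = 0;
    }
}

/* Reserve a slot for an ack wait. Returns the slot index, or -1 when the
 * pool is exhausted (the situation described in this issue). */
static int ack_pool_post(struct ack_pool *p, uint64_t now_ms,
                         uint64_t timeout_ms)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!p->slots[i].in_use) {
            p->slots[i].in_use = true;
            p->slots[i].deadline_ms = now_ms + timeout_ms;
            return i;
        }
    }
    return -1;
}

/* The ack arrived: release the slot normally. */
static void ack_pool_complete(struct ack_pool *p, int slot)
{
    assert(slot >= 0 && slot < POOL_SIZE);
    p->slots[slot].in_use = false;
}

/* Drop every wait whose deadline has passed. Returns how many slots were
 * released. A real implementation would cancel the posted recv here. */
static size_t ack_pool_reap(struct ack_pool *p, uint64_t now_ms)
{
    size_t released = 0;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (p->slots[i].in_use && now_ms >= p->slots[i].deadline_ms) {
            p->slots[i].in_use = false;
            released++;
        }
    }
    return released;
}
```

Calling something like `ack_pool_reap` from the progress loop would bound how long a cancelled client can hold an entry. Where such a check belongs in HG/NA, and how the posted recv would actually be cancelled, is for the maintainers to decide.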