-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
partial socket IO isn't handled #306
Comments
I think it's wrong for vfu_dma_read/write()/vfu_irq_message() to wait for a quick response, there's no guarantee that the response will be quick, plus it doesn't work since the client might send a request which we'll mistake for the response we're waiting for (see #279 for more details). Assuming we fix the above, I think we should reconsider looking at fixing it properly, can we really on it working properly the way it's implemented now if the host is under stress (fragmented memory, lots of context switches, lots of open files)? |
It's definitely brittle right now. There are really only two options to fix:
Only 2. can match the SPDK use case, I think, and it's probably not worse in terms of implementation than 1. It's probably not too painful to do, but I think a lot might depend on callback expectations from SPDK |
At least for now SPDK is not using vfu_dma_read/write()/vfu_irq_message() and not sure if ever use in future. |
We could reject those calls if ATTACH_NB is set, maybe. We still have the reply path of process_request() though. |
I think #296 is related to this in the sense that we have to store information in order to be able to send a response from a different context. The thing that makes it easy is that we don't have to support processing more requests while the single request on the migration region is pending. I think this solves the problem that the library is non-reentrant. |
Blocking I/O in API
Note that neither the listen fd nor the conn fd are non-blocking. Instead, the get_request_sock() callback sets non-blocking recvmsg() flags if the
LIBVFIO_USER_FLAG_ATTACH_NB
flag is set.If we're supposed to be non-blocking, there are multiple paths whereby we might block still:
NB: recv_blocking() is mis-leading: we don't have O_NONBLOCK set anyway, so it doesn't change anything about the socket.
In practice, this is usually not a problem: we're either expecting a quick response (for vfu_dma_read/write()), or we expect to be able to write to the pipe anyway (for process_request() replies)
But it's certainly not good for users such as SPDK.
Short reads/writes
If we ever have short reads or writes (indicating that the message couldn't be handled in one go), then we'll totally break (I think), with no way to recover.
All the sock code is at issue here, namely vfu_send_iovec(), get_msg(), and recv_blocking().
If we're in blocking mode, we could at least loop until we've received a full message.
In non-blocking, we could still loop. In the worst case, we'd sit at 100% CPU and hang the runtime (SPDK). The alternative is for us to somehow save the state, and return from vfu_run_ctx() / vfu_dma_read/write()/vfu_irq_message(), so on the next call, we can pick it up and try to finish the read/write. This has a bunch of problems, though. For starters, the library is not re-entrant, so other contexts generally can't come in and call the library again. (And another SPDK thread could call e.g. vfu_dma_read()). To do this we'd have to effectively re-write the entire library to be properly async event loop based.
In practice, again, we're generally OK, We've not yet seen this issue in practice: the pipe buffer is big in comparison to message size (65K), so as long as the other side of the socket is active, we're generally OK. For reads, the messages are also likely to be smaller than PIPE_BUF.
The text was updated successfully, but these errors were encountered: