
vfu_dma_read breaks if client sends message in between #279

Closed
tmakatos opened this issue Jan 28, 2021 · 42 comments
Labels
api-change bug Something isn't working

Comments

@tmakatos
Member

tmakatos commented Jan 28, 2021

There are cases where a libvfio-user public function (e.g. vfu_dma_read) sends a message to the client and then synchronously waits for the response. A response from the client cannot be avoided in the DMA read case, as it contains the read data. The problem is that right after the server has sent its message, the client might send a message of its own, which is then received by the server's blocking call even though the server is expecting a completely different message. vfu_dma_read is one of the functions with this problem; effectively every public function that uses vfu_msg has the same problem, except the version negotiation ones.

vfu_irq_trigger has the same problem: the server might send a trigger-IRQ message and immediately receive a new command from the client, while it's expecting to receive the response for the IRQ message it just sent.

A similar situation can occur when the client is waiting for a response from the server and the server happens to send it a VFIO_USER_VM_INTERRUPT. This is a valid use case, though; the client should be expecting such messages.
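
To make the failure mode concrete, here is a minimal sketch of the racy pattern; the header layout and helper are hypothetical, not the actual libvfio-user internals:

```c
/*
 * Hypothetical sketch of the racy pattern described above; the header
 * layout and helper are illustrative only, not libvfio-user internals.
 */
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

struct msg_hdr {
    uint16_t msg_id;
    uint16_t cmd;
    uint32_t flags;             /* e.g. a REPLY bit */
};
#define HDR_FLAG_REPLY 0x1u

static int dma_read_sync(int sock_fd, struct msg_hdr *req)
{
    struct msg_hdr in;

    if (write(sock_fd, req, sizeof(*req)) != (ssize_t)sizeof(*req)) {
        return -1;
    }

    /*
     * BUG: the next message on the socket is not necessarily our reply.
     * If the client sent a new command (e.g. an MMIO access) in the
     * meantime, this blocking read picks up that command instead and the
     * conversation is corrupted.
     */
    if (read(sock_fd, &in, sizeof(in)) != (ssize_t)sizeof(in)) {
        return -1;
    }
    if (!(in.flags & HDR_FLAG_REPLY) || in.msg_id != req->msg_id) {
        fprintf(stderr, "got unrelated message %u instead of our reply\n",
                (unsigned)in.cmd);
        return -1;              /* exactly the failure described above */
    }
    return 0;                   /* reply payload would follow the header */
}
```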

@tmakatos tmakatos added the bug Something isn't working label Jan 28, 2021
@jlevon
Collaborator

jlevon commented Mar 10, 2021

There are currently only two paths that have this issue, vfu_dma_read() and vfu_dma_write()
via tran->send_msg().

vfu_irq_message() (what Thanos was referring to above) has been removed for now.

@awarus
Contributor

awarus commented Dec 28, 2021

is this issue still not resolved?

@tmakatos
Member Author

tmakatos commented Jan 4, 2022

is this issue still not resolved?

Still not resolved.

@mnissler-rivos
Contributor

This issue has bitten me; here are some thoughts.

For context, I've been using libvfio-user to hook up PCI device models to hardware PCIe root port implementations, basically by bridging the PCIe transaction layer to VFIO-user. You can think of this as plugging a software-emulated PCIe card into a PCIe hardware design, where the hardware side is augmented with a VFIO-user client that talks to qemu as the vfio-user server.

In this scenario, I can't use memory-mapped DMA, because I need to inject DMA transactions back into PCIe hardware as memory read and write transaction layer packets. So I rely on VFIO-user's message-based DMA via VFIO_USER_DMA_{READ,WRITE} for my DMA transactions (side note: I have a qemu patch to add support for message-based DMA, happy to contribute that). However, with message-based DMA I immediately run into the problem described here: It's pretty common that MMIO traffic arrives while qemu is waiting for the reply to a DMA operation it has requested, and thus the synchronous wait-for-DMA-reply approach doesn't work. So, I'm interested in fixing this problem.

Here are a couple avenues I can think of:

  1. While waiting for a DMA reply message, don't bail if a command arrives, but queue the command message up. Once the DMA reply comes in, process any queued commands. I have a very crude implementation of this to unblock my work; it certainly feels like a hack (see also below).

  2. Rewrite libvfio-user to work fully asynchronously, allowing commands and replies to be received in arbitrary order. This would also require API changes - vfu_sgl_read and vfu_sgl_write would have to become asynchronous and report their result via a callback.

  3. Tackle the root cause of the issue: Don't allow command messages in both directions over the socket. AFAICT, VFIO_USER_DMA_{READ,WRITE} are the only instances of server -> client commands (note that interrupts, which are the other server -> client interaction, already use event fds). So, it seems realistic to deprecate server -> client commands on the socket entirely (they've never worked properly in the libvfio-user implementation in light of this bug anyway) and do message-based DMA in a different way: introduce a 3rd flavor of VFIO_USER_DMA_MAP that passes a separate socket file descriptor to the server, which the server then uses to send VFIO_USER_DMA_{READ,WRITE} commands to the client. Note that this DMA socket would only carry commands in the server -> client direction. This would be a protocol-level change, but one that fits cleanly into the existing protocol architecture.

Assessing these options:

  • (1) is nice in that it can be done entirely inside libvfio-user, without any API changes. (For completeness, note that vfu_get_poll_fd can no longer be just the socket file descriptor, since it must indicate readiness also for queued messages, but there's a workaround using eventfd + epoll). (2) and (3) imply API changes, and (3) even requires a protocol-level change.

  • (2) is fundamentally the hardest to implement for clients and servers, since synchronous request-reply code can't easily be fixed. Even if we were to make libvfio-user's message handling completely asynchronous, switching DMA API functions to become asynchronous will have ripple effects. For example, qemu's DMA handling is currently synchronous and probably isn't practical to change to asynchronous, thus forcing qemu to implement a variant of (1) to hide the asynchronicity. The implementation of (1) is a bit messy since we need to buffer potentially unlimited messages in vfu_ctx, but perhaps that can be limited to a reasonable value in practice? (3) is pretty simple to implement since it is independent of the main socket.

  • Whatever solution we pick, there's a potential for deadlocks if both sides choose to use synchronous I/O. For example, the server might wait for a DMA operation while the client waits for an MMIO access to complete. At least one side will have to process messages "out of band", and thus implement some form of asynchronicity. IMO, (3) provides the most flexibility in that the client can choose to either be fully asynchronous and listen for the main socket as well as any DMA sockets, or it can choose to service the DMA socket(s) on a different thread. Implementations that don't require message-based DMA can choose simpler synchronous implementations. qemu can keep its synchronous DMA implementation (at the cost of the client needing to be able to handle asynchronous DMA requests, but it seems fair to pay that cost in the client if the client wants message-based DMA).

Overall, I like (3) best. I'm interested to hear other thoughts / opinions. Assuming we can reach consensus on how to move forward, I'll be willing to implement a solution.

@tmakatos
Member Author

tmakatos commented Jun 20, 2023

I do like idea (3) the most. I even think we entertained this idea of having a separate socket for server -> client messages in the very early days, when we were implementing the protocol specification, but we didn't have a solid use case at that point and decided not to bake it into the protocol. I don't think modifying the protocol is an issue; it's still experimental in QEMU (and we're still going to modify the protocol when we adopt migration v2). Let's hear from @jlevon and @jraman567.

@mnissler-rivos
Contributor

Thanks @tmakatos. If anyone sees issues with approach (3) please let me know - I hope to have some cycles next week to start working on this.

@jlevon
Collaborator

jlevon commented Jun 23, 2023

plugging a software-emulated PCIe card into a PCIe hardware design

Cool!

note that interrupts, which are the other server -> client interaction, already use event fds

While that's true today, that wouldn't be true if we ever wanted to do other implementations, over TCP/IP to a remote host, for example.

So I don't think it would make sense to build something specific to the DMA path.

This problem is one of the reasons we didn't want to stabilize the library yet. I agree with the enumeration of the three possible approaches. I don't really like 1).

I'd sort of be OK with 3), though I can't speak for the qemu or vfio-user client maintainers. It feels like a bit of a workaround for inadequate libvfio-user design though, to be honest. It would also mean that async consumers of the API still need to block during device->guest DMA transactions, which is an issue in itself. The separate socket of 3) doesn't help at all there. That is, the dma read/write APIs are fundamentally broken anyway.

2) is admittedly a lot more work - but it would then mirror what the client side does, which can handle this problem. So ideally we'd do 2) - Any thoughts on implementing that Mattias? I haven't really thought through the difficulties of implementation in terms of libvfio-user itself, but I think it shouldn't be too difficult:
  • ->send_msg() no longer takes or waits for a reply. In fact, we'd somehow make support for synchronous request-then-reply usable only by samples/client.c
  • it takes a callback and private data pointer; we store this somewhere: probably need a list/hash for multiple outstanding requests (rough sketch after this list)
  • is_valid_header() changes to allow replies just for DMA
  • handle_request() takes the reply and uses the given callback
  • change quiesce to ensure no DMA op is ongoing somehow
  • same thing for migration state
  • TBD if we need to allow for cancellation: I'd guess not, since we can't control if the client side is going to write into the provided buffer.
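
A rough sketch of the bookkeeping described in the list above, under the same assumptions; the types and names are illustrative only, not existing libvfio-user code:

```c
/*
 * Hypothetical bookkeeping for asynchronous server -> client commands:
 * remember each outstanding request by message id and dispatch the reply
 * to the stored callback. Types and names are illustrative only.
 */
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef void (*reply_cb_t)(void *priv, void *reply_data, size_t reply_len);

struct pending_req {
    uint16_t msg_id;
    reply_cb_t cb;
    void *priv;
    struct pending_req *next;       /* simple list; a hash works too */
};

static struct pending_req *pending; /* would live in vfu_ctx */

static int send_msg_async(uint16_t msg_id, reply_cb_t cb, void *priv)
{
    struct pending_req *req = calloc(1, sizeof(*req));

    if (req == NULL) {
        return -ENOMEM;
    }
    req->msg_id = msg_id;
    req->cb = cb;
    req->priv = priv;
    req->next = pending;
    pending = req;
    /* ... write the command to the socket here, without waiting ... */
    return 0;
}

/*
 * Called from the normal message loop when a header with the reply bit set
 * arrives; is_valid_header() would only accept such replies for DMA.
 */
static int handle_reply(uint16_t msg_id, void *data, size_t len)
{
    struct pending_req **pp;

    for (pp = &pending; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->msg_id == msg_id) {
            struct pending_req *req = *pp;

            *pp = req->next;
            req->cb(req->priv, data, len);
            free(req);
            return 0;
        }
    }
    return -EINVAL;                 /* reply for a request we never sent */
}
```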

@mnissler-rivos
Contributor

It feels like a bit of a workaround for inadequate libvfio-user design though, to be honest. It would also mean that async consumers of the API still need to block during device->guest DMA transactions, which is an issue in itself. The separate socket of 3) doesn't help at all there. That is, the dma read/write APIs are fundamentally broken anyway.

I see what you're saying, but for clarity we should separate the discussion into two concerns:

  1. What kind of DMA APIs we provide: Synchronous, asynchronous or both. The decision should be informed by the needs of the consumers of the API. See below.
  2. What the right protocol-level implementation is. Decision here is mostly determined by whether the protocol can support the APIs we need to support in a reasonable way.

So ideally we'd do 2) - Any thoughts on implementing that Mattias?

The libvfio-user changes to go fully async are relatively straightforward, I agree. The challenge is for code that consumes the API, because consuming async APIs causes ripple effects. If you look at qemu for example, it relies on synchronous pci_dma_read and pci_dma_write APIs. Switching these APIs to async would be a major effort in qemu, and it would mean touching all the hardware model implementations to make their DMA request code paths async. IMO, that's in the ain't-gonna-happen bucket, at least unless qemu has appetite for changing their DMA APIs for additional reasons.

If we assume that qemu hardware models will stick with a synchronous DMA API, then there are two paths:

  1. libvfio-user offers both sync and async DMA APIs and lets the consumer choose what works for them. That's pretty simple to support if we have separate sockets.
  2. libvfio-user's DMA API is async only, and qemu needs to adapt that to a synchronous API somehow. It's questionable whether this is at all possible. The only thing I can think of would probably look similar to the queuing approach I described as (1) above, but at the libvfio-user API surface level - ugh.

Bottom line: I don't really see how a libvfio-user that doesn't provide a synchronous DMA API can be consumed by qemu. This arguably causes some design warts, but a middle ground design that allows both sync/async DMA doesn't seem such a bad way forward, and it also gives API consumers an upgrade path to async.

@jlevon
Collaborator

jlevon commented Jun 23, 2023

into two concerns

Sure, but one informs the other: right now we can't provide an async API, because the underlying implementation is sync; it's easier to build a sync API on top of an async implementation.

Switching these APIs to async

Yes, I would expect there to be both sync and async APIs, precisely for the sort of reasons you are mentioning.
Of course, the former would basically still need to insist on there being no interleaving, so it would still have the issue. That's true of option 3) as well, since we'd have to return to run vfu_run_ctx() for the reply anyway.

It's possible the sync version could read then re-queue any intermediate messages, perhaps, but there's risks there in the cases of quiesce, migration, unmap, and so on.
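
To illustrate the "sync API on top of an async implementation" point above, here is a hypothetical sketch; async_dma_read() and process_one_message() are assumed primitives for illustration, not libvfio-user functions:

```c
/*
 * Hypothetical sketch: a synchronous DMA read layered on top of an
 * asynchronous request primitive. async_dma_read() and
 * process_one_message() are assumptions for illustration, not
 * libvfio-user functions.
 */
#include <stdbool.h>
#include <stddef.h>

/* Assumed async primitive: issues the request and later invokes cb. */
int async_dma_read(void *buf, size_t len, void (*cb)(void *priv, int result),
                   void *priv);
/* Assumed message pump: reads and dispatches exactly one incoming message,
 * including command messages that arrive before our reply does. */
int process_one_message(void);

struct sync_waiter {
    bool done;
    int result;
};

static void sync_done_cb(void *priv, int result)
{
    struct sync_waiter *w = priv;

    w->result = result;
    w->done = true;
}

int sync_dma_read(void *buf, size_t len)
{
    struct sync_waiter w = { .done = false, .result = 0 };

    if (async_dma_read(buf, len, sync_done_cb, &w) < 0) {
        return -1;
    }
    /* Keep pumping messages: interleaved client commands are handled
     * normally instead of being mistaken for our reply. */
    while (!w.done) {
        if (process_one_message() < 0) {
            return -1;
        }
    }
    return w.result;
}
```

The key difference from the current code is that the wait loop dispatches whatever arrives, so an interleaved client command is handled normally rather than being mistaken for the reply.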

@mnissler-rivos
Contributor

It's possible the sync version could read then re-queue any intermediate messages, perhaps, but there's risks there in the cases of quiesce, migration, unmap, and so on.

Plus, it's pretty heavy-weight, since at least in theory it needs to queue up an unbounded number of messages of arbitrary sizes. It works, though; it's approach (1) in my post above, and I have a hack that I'm currently running with.

It seems like we're all in agreement that we should have async DMA (in addition to sync) at the libvfio-user API level. Then the open question is whether we do the queuing approach, or we introduce separate socket(s) for the server->client direction.

Trying to summarize pros/cons (please call out anything I have missed):

queuing:

  • (+) no protocol changes required
  • (+) conceptually cleaner: single communication channel
  • (-) need to allocate unpredictable amounts of memory to hold queued messages
  • (-) vfio-user implementations are forcibly async with the need to match replies to commands

separate socket(s):

  • (+) can process any message as it's read from the socket; when handling sync DMA, other commands stay on the main socket, providing natural back pressure to the client.
  • (+) simplified implementations are possible that handle the sockets using send-request-read-reply loops (and can achieve async by using threads). Judging from what's out there, this style is popular; see for example https://github.com/rust-vmm/vfio-user/blob/main/src/lib.rs
  • (-) protocol change

As is probably obvious by now, I'm leaning towards separate socket(s), mostly because I think it's simpler and makes vfio-user easier to implement, but I can totally relate to the single-socket solution feeling conceptually cleaner. In the end, I don't really feel strongly, though; I'm just hoping to reach closure so we can move forward.

@jlevon
Collaborator

jlevon commented Jun 24, 2023

I'm not clear on how a separate socket helps at all with providing a sync API; since the reply arrives on the original socket still, you still need to somehow process other incoming messages.

@mnissler-rivos
Contributor

Ah, so we've been talking past each other; I'm not intending to send the DMA reply on the main socket.

Let me recap the separate socket idea again to clarify:

  • Client obtains a pair of sockets by calling socketpair(); let's call them dma_client_fd and dma_server_fd, respectively.
  • When sending the VFIO_USER_DMA_MAP command, the client passes a flag to indicate a new mode of performing DMA operations (let's call it VFIO_USER_F_DMA_REGION_SOCKET_IO), as well as dma_server_fd.
  • When the server wants to perform DMA for the mapping in question, it puts a VFIO_USER_DMA_READ command on dma_server_fd.
  • The client monitors and reads dma_client_fd. When it receives the VFIO_USER_DMA_READ command, it performs the DMA operation and sends back the reply on dma_client_fd.

This enables the DMA operation to be handled "out of band" of the main socket. Thus, a synchronous write-DMA-command-then-read-back-reply sequence on the sockets is unproblematic (since commands on the main socket are only ever sent by the client, and commands on the DMA socket are only ever sent by the server, the condition that is the root cause of the bug never occurs). Furthermore, the server can choose to synchronously wait for the DMA operation to complete and ignore all other I/O on the main socket during that time (this is what I was referring to when I said this is simpler to implement).
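
For reference, the fd-passing part of this setup uses the standard SCM_RIGHTS mechanism on UNIX sockets. Here is a sketch of the client side; the DMA_MAP payload below is a placeholder, and the real framing is defined by the vfio-user spec and handled by the transport:

```c
/*
 * Sketch of the client-side setup: create the socket pair and pass one end
 * to the server as ancillary data (SCM_RIGHTS) alongside a command. The
 * payload here is a placeholder, not the actual VFIO_USER_DMA_MAP framing.
 */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

static int send_fd_with_msg(int main_sock, const void *payload, size_t len,
                            int fd_to_pass)
{
    struct iovec iov = { .iov_base = (void *)payload, .iov_len = len };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg = { 0 };
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_pass, sizeof(int));

    return sendmsg(main_sock, &msg, 0) < 0 ? -1 : 0;
}

static int setup_dma_socket(int main_sock)
{
    int fds[2];     /* fds[0]: dma_client_fd, fds[1]: dma_server_fd */
    char dma_map_cmd[64] = "placeholder VFIO_USER_DMA_MAP payload";

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) {
        return -1;
    }
    if (send_fd_with_msg(main_sock, dma_map_cmd, sizeof(dma_map_cmd),
                         fds[1]) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);  /* server end travels with the message */
    return fds[0];  /* client services DMA commands on this end */
}
```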

@jlevon
Collaborator

jlevon commented Jun 26, 2023

Note that this DMA socket would only carry commands in the server -> client direction.

I think I read "messages" here instead of "commands", hence the confusion. It's bi-directional, in fact.

So, yeah, that sounds reasonable.

@jlevon jlevon closed this as completed Jun 26, 2023
@jlevon jlevon reopened this Jun 26, 2023
@jlevon
Collaborator

jlevon commented Jun 26, 2023

oops

@jlevon
Collaborator

jlevon commented Jun 26, 2023

@mnissler-rivos I asked Jag about this on slack if you'd like to join

@tmakatos
Member Author

tmakatos commented Jul 6, 2023

I've just started reading the rest of the discussion, but the following just occurred to me:

There are cases where a libvfio-user public function (e.g. vfu_dma_read) sends a message to the client and then synchronously waits for the response.

What's the point of the vfio-user server requiring a response from the client for a DMA command? If the client doesn't like the command it received, the server doesn't need to know about that specific fact; the client might deal with it any way it sees fit (e.g. reset the device). So if we remove this behavior there's no problem in the first place, right?

@jlevon
Collaborator

jlevon commented Jul 6, 2023

DMA read requires a response with the data

@mnissler-rivos
Contributor

mnissler-rivos commented Jul 6, 2023

What's the point of the vfio-user server requiring a response from the client for a DMA command?

For a DMA read, it wants the data from the client. There is no conceptual reason why that couldn't be delivered asynchronously, though. In the case of qemu, for example, device models often have code that performs DMA reads to get the next network packet and then processes it, and that code is mostly synchronous right now. So, IMHO we should definitely provide an async API, but keep a synchronous version to avoid breaking more simplistic server designs.

@tmakatos
Member Author

tmakatos commented Jul 6, 2023

DMA read requires a response with the data

D'oh of course.

Furthermore, the server can choose to synchronously wait for the DMA operation to complete and ignore all other I/O on the main socket during that time (this is what I was referring to when I said this is simpler to implement).

What if the client never responds to the DMA read? What if it decides to reset the device, so the server should actually be paying attention to the main socket? I guess the client will have to start making assumptions that the server won't be paying attention to the main socket so it first has to fail the DMA read?

@jlevon
Collaborator

jlevon commented Jul 6, 2023

Yep - we'll need to clearly define exact behaviour in these sorts of cases

@tmakatos
Member Author

tmakatos commented Jul 6, 2023

I guess it boils down to what real HW would do: if it were in the process of doing DMA, would it be handling MMIO in the first place?

@mnissler-rivos
Contributor

I don't think there's a universal answer on how hardware behaves. At the PCIe TLP level, all packets are tagged in order to match up completions, so there aren't any inherent restrictions. I'm also not aware of any requirements to that effect in the PCIe spec, but I'll ask a coworker who knows it much better than me to make sure. That said, the spec does include a completion timeout mechanism (section 2.8) to deal with situations where an expected reply never arrives.

From a VFIO software perspective, robust clients and servers should ideally not make any assumptions on communication patterns in my opinion. Having implemented a fully asynchronous client now, I am painfully aware though that this can be challenging in practice. Plus, you can't just mandate full async and ignore the practical constraints in existing software such as qemu.

Note that separating the sockets offers increased flexibility in implementing a correct client: it can spawn a thread to handle DMA separately from the main socket, and unless there is problematic cross-thread synchronization, everything will be fully async safe even without switching to an event-based design.
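
A sketch of that threaded-client structure; the header layout and handler are hypothetical, and real messages would follow the vfio-user wire format:

```c
/*
 * Sketch of a client that services the DMA socket on a dedicated thread
 * while the main thread keeps its existing event loop on the main socket.
 * The header layout and handler are hypothetical.
 */
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

struct dma_hdr {                /* illustrative only */
    uint16_t msg_id;
    uint16_t cmd;               /* DMA read or write */
    uint64_t addr;
    uint64_t len;
};

/* Assumed handler: performs the access and writes the reply back on fd. */
int handle_dma_command(int fd, const struct dma_hdr *hdr);

static void *dma_service_thread(void *arg)
{
    int dma_fd = *(int *)arg;
    struct dma_hdr hdr;

    /*
     * Commands on this socket only ever come from the server, so a plain
     * read-command / send-reply loop is sufficient here; no interleaving
     * with client-initiated commands can occur.
     */
    while (read(dma_fd, &hdr, sizeof(hdr)) == (ssize_t)sizeof(hdr)) {
        if (handle_dma_command(dma_fd, &hdr) < 0) {
            break;
        }
    }
    return NULL;
}

/* dma_fd must remain valid for the lifetime of the thread. */
static int start_dma_thread(int *dma_fd)
{
    pthread_t tid;

    return pthread_create(&tid, NULL, dma_service_thread, dma_fd);
}
```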

mnissler-rivos added a commit to mnissler-rivos/libvfio-user that referenced this issue Jul 6, 2023
This change introduces a new mode for servers to access DMA regions in
the client. It reuses the existing VFIO_USER_DMA_{READ,WRITE} message
type and format, but messages are not passed across the main
communication socket. Instead, communication happens via a separate
socket that gets supplied by the client when invoking VFIO_USER_DMA_MAP.

The benefit of the new mode is to avoid situations where both client and
server send messages at the same time, then expect the response as the
next message that is received. When socket-based DMA access mode is
used, commands on sockets are only ever sent by one side, side-stepping
the command collision problem. For more details and discussion, please
see nutanix#279.
@jlevon
Collaborator

jlevon commented Jul 18, 2023

not sure I have an opinion on that

mnissler-rivos added a commit to mnissler-rivos/libvfio-user that referenced this issue Jul 19, 2023
This change introduces a new mode for servers to access DMA regions in
the client. It reuses the existing VFIO_USER_DMA_{READ,WRITE} message
type and format, but messages are not passed across the main
communication socket. Instead, communication happens via a separate
socket that gets supplied by the client when invoking VFIO_USER_DMA_MAP.

The benefit of the new mode is to avoid situations where both client and
server send messages at the same time, then expect the response as the
next message that is received. When socket-based DMA access mode is
used, commands on sockets are only ever sent by one side, side-stepping
the command collision problem. For more details and discussion, please
see nutanix#279.
@mnissler-rivos
Contributor

Here's a fresh snapshot of what I have right now: mnissler-rivos/libvfio-user@master...mnissler-rivos:libvfio-user:socket_dma

Includes test fixes and a pytest for vfu_sgl_{read,write}.

Note that this is still passing the socket fd in DMA_MAP, since I had this almost ready when we started discussing this angle yesterday. I will next investigate a solution that sets up a context-scoped server -> client socket.

Happy to hear any input you may have, but no pressure, I still got work to do :-)

@tmakatos
Member Author

@mnissler-rivos I think it's mature enough to become a PR (so we don't have to re-review it; I've already reviewed part of it); passing the socket fd during negotiation can be fixed up by additional commits.

@mnissler-rivos
Contributor

@tmakatos I hear you - but please give me a day or so to investigate the context-scoped reverse socket approach.

There's a chance that with the alternative implementation I can hide all the details of the reverse socket in the transport layer. If that succeeds, most of the changes I have on the branch won't be necessary.

@jraman567
Contributor

@mnissler-rivos Setting up the reverse communication channel during negotiation sounds good to me too. Thank you!

mnissler-rivos added a commit to mnissler-rivos/libvfio-user that referenced this issue Jul 20, 2023
This change adds support for a separate socket to carry commands in the
server-to-client direction. It has proven problematic to send commands
in both directions over a single socket, since matching replies to
commands can become non-trivial when both sides send commands at the same
time and adds significant complexity. See issue nutanix#279 for details.

To set up the reverse communication channel, the client indicates
support for it via a new capability flag in the version message. The
server will then create a fresh pair of sockets and pass one end to the
client in its version reply. When the server wishes to send commands to
the client at a later point, it now uses its end of the new socket pair
rather than the main socket.
@mnissler-rivos
Contributor

Here's a proof of concept for setting up a separate server->client command socket at negotiation time: mnissler-rivos/libvfio-user@master...mnissler-rivos:libvfio-user:cmd_socket

It ended up cleaner than I had feared, and it's future-proof in the sense that it transparently moves all server->client commands over to the separate socket.

I haven't finished a suitable pytest yet, but I have tested against my client and confirmed that it works.

It'd be great if you could weigh in on whether this approach is indeed what you want to go with - I'll be happy to turn this into a proper PR then.

@jlevon
Collaborator

jlevon commented Aug 2, 2023

That is indeed a lot less than I expected! I have some comments but it looks broadly fine; please open 2 PRs for the separate changes, and don't forget to update the vfio-user spec too. Thanks!

mnissler-rivos added a commit to mnissler-rivos/libvfio-user that referenced this issue Aug 14, 2023
This change adds support for a separate socket to carry commands in the
server-to-client direction. It has proven problematic to send commands
in both directions over a single socket, since matching replies to
commands can become non-trivial when both sides send commands at the same
time and adds significant complexity. See issue nutanix#279 for details.

To set up the reverse communication channel, the client indicates
support for it via a new capability flag in the version message. The
server will then create a fresh pair of sockets and pass one end to the
client in its version reply. When the server wishes to send commands to
the client at a later point, it now uses its end of the new socket pair
rather than the main socket.
mnissler-rivos added a commit to mnissler-rivos/libvfio-user that referenced this issue Aug 15, 2023
This change adds support for a separate socket to carry commands in the
server-to-client direction. It has proven problematic to send commands
in both directions over a single socket, since matching replies to
commands can become non-trivial when both sides send commands at the same
time and adds significant complexity. See issue nutanix#279 for details.

To set up the reverse communication channel, the client indicates
support for it via a new capability flag in the version message. The
server will then create a fresh pair of sockets and pass one end to the
client in its version reply. When the server wishes to send commands to
the client at a later point, it now uses its end of the new socket pair
rather than the main socket. Corresponding replies are also passed back
over the new socket pair.
@mnissler-rivos
Contributor

Sorry for the delay, I was on vacation. I've just created a pull request that sets up a separate socket pair at negotiation time, complete with tests and specification updates.

@jlevon you asked for 2 PRs above - not sure whether you'd like me to split this one or see a separate PR for the alternative approach of setting up a socket FD for each DMA mapping?

@jlevon
Collaborator

jlevon commented Aug 15, 2023

@mnissler-rivos close_safely() is unrelated to these changes so should be a separate PR (with the other on top I guess).

@mnissler-rivos
Contributor

Ah, makes sense. Here is a separate pull request for close_safely: #763

mnissler-rivos added a commit to mnissler-rivos/libvfio-user that referenced this issue Aug 15, 2023
This change adds support for a separate socket to carry commands in the
server-to-client direction. It has proven problematic to send commands
in both directions over a single socket, since matching replies to
commands can become non-trivial when both sides send commands at the same
time and adds significant complexity. See issue nutanix#279 for details.

To set up the reverse communication channel, the client indicates
support for it via a new capability flag in the version message. The
server will then create a fresh pair of sockets and pass one end to the
client in its version reply. When the server wishes to send commands to
the client at a later point, it now uses its end of the new socket pair
rather than the main socket. Corresponding replies are also passed back
over the new socket pair.

Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
jlevon pushed a commit that referenced this issue Sep 15, 2023
Use separate socket for server->client commands

This change adds support for a separate socket to carry commands in the
server-to-client direction. It has proven problematic to send commands
in both directions over a single socket, since matching replies to
commands can become non-trivial when both sides send commands at the same
time and adds significant complexity. See issue #279 for details.

To set up the reverse communication channel, the client indicates
support for it via a new capability flag in the version message. The
server will then create a fresh pair of sockets and pass one end to the
client in its version reply. When the server wishes to send commands to
the client at a later point, it now uses its end of the new socket pair
rather than the main socket. Corresponding replies are also passed back
over the new socket pair.

Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
@jlevon
Collaborator

jlevon commented Sep 15, 2023

fixed via #762

@jlevon jlevon closed this as completed Sep 15, 2023
mnissler-rivos added a commit to mnissler-rivos/qemu that referenced this issue Dec 8, 2023
Wire up support for DMA for the case where the vfio-user client does not
provide mmap()-able file descriptors, but DMA requests must be performed
via the VFIO-user protocol. This installs an indirect memory region,
which already works for pci_dma_{read,write}, and pci_dma_map works
thanks to the existing DMA bounce buffering support.

Note that while simple scenarios work with this patch, there's a known
race condition in libvfio-user that will mess up the communication
channel. See nutanix/libvfio-user#279 for
details as well as a proposed fix.

Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
patchew-importer pushed a commit to patchew-project/qemu that referenced this issue Feb 12, 2024
Wire up support for DMA for the case where the vfio-user client does not
provide mmap()-able file descriptors, but DMA requests must be performed
via the VFIO-user protocol. This installs an indirect memory region,
which already works for pci_dma_{read,write}, and pci_dma_map works
thanks to the existing DMA bounce buffering support.

Note that while simple scenarios work with this patch, there's a known
race condition in libvfio-user that will mess up the communication
channel. See nutanix/libvfio-user#279 for
details as well as a proposed fix.

Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Message-Id: <20240212080617.2559498-5-mnissler@rivosinc.com>
patchew-importer pushed a commit to patchew-project/qemu that referenced this issue Mar 4, 2024
Wire up support for DMA for the case where the vfio-user client does not
provide mmap()-able file descriptors, but DMA requests must be performed
via the VFIO-user protocol. This installs an indirect memory region,
which already works for pci_dma_{read,write}, and pci_dma_map works
thanks to the existing DMA bounce buffering support.

Note that while simple scenarios work with this patch, there's a known
race condition in libvfio-user that will mess up the communication
channel. See nutanix/libvfio-user#279 for
details as well as a proposed fix.

Reviewed-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Message-Id: <20240304100554.1143763-5-mnissler@rivosinc.com>
rivos-eblot pushed a commit to rivos-eblot/qemu-ot that referenced this issue Apr 9, 2024
Wire up support for DMA for the case where the vfio-user client does not
provide mmap()-able file descriptors, but DMA requests must be performed
via the VFIO-user protocol. This installs an indirect memory region,
which already works for pci_dma_{read,write}, and pci_dma_map works
thanks to the existing DMA bounce buffering support.

Note that while simple scenarios work with this patch, there's a known
race condition in libvfio-user that will mess up the communication
channel. See nutanix/libvfio-user#279 for
details as well as a proposed fix.

Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
patchew-importer pushed a commit to patchew-project/qemu that referenced this issue May 7, 2024
Wire up support for DMA for the case where the vfio-user client does not
provide mmap()-able file descriptors, but DMA requests must be performed
via the VFIO-user protocol. This installs an indirect memory region,
which already works for pci_dma_{read,write}, and pci_dma_map works
thanks to the existing DMA bounce buffering support.

Note that while simple scenarios work with this patch, there's a known
race condition in libvfio-user that will mess up the communication
channel. See nutanix/libvfio-user#279 for
details as well as a proposed fix.

Reviewed-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Message-Id: <20240507094210.300566-5-mnissler@rivosinc.com>
patchew-importer pushed a commit to patchew-project/qemu that referenced this issue May 7, 2024
Wire up support for DMA for the case where the vfio-user client does not
provide mmap()-able file descriptors, but DMA requests must be performed
via the VFIO-user protocol. This installs an indirect memory region,
which already works for pci_dma_{read,write}, and pci_dma_map works
thanks to the existing DMA bounce buffering support.

Note that while simple scenarios work with this patch, there's a known
race condition in libvfio-user that will mess up the communication
channel. See nutanix/libvfio-user#279 for
details as well as a proposed fix.

Reviewed-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Message-Id: <20240507143431.464382-7-mnissler@rivosinc.com>