vfu_dma_read breaks if client sends message in between #279
Comments
There are currently only two paths that have this issue: vfu_dma_read() and vfu_dma_write(). vfu_irq_message() (what Thanos was referring to above) has been removed for now. |
is this issue still not resolved? |
Still not resolved. |
This issue has bitten me; here are some thoughts. For context, I've been using libvfio-user to hook up PCI device models to hardware PCIe root port implementations, basically by bridging the PCIe transaction layer to VFIO-user. You can think of this as plugging a software-emulated PCIe card into a PCIe hardware design, where the hardware side is augmented with a VFIO-user client that talks to qemu as the vfio-user server. In this scenario, I can't use memory-mapped DMA, because I need to inject DMA transactions back into PCIe hardware as memory read and write transaction layer packets. So I rely on VFIO-user's message-based DMA via VFIO_USER_DMA_READ/WRITE messages. Here are a couple avenues I can think of:
Assessing these options:
Overall, I like (3) best. I'm interested to hear other thoughts / opinions. Assuming we can reach consensus on how to move forward, I'll be willing to implement a solution. |
I do like idea (3) the most. I even think we've entertained this idea of having a separate socket for server -> client messages in the very early days, when we were implementing the protocol specification, but didn't have a solid use case at that point and decided not to bake it into the protocol. I don't think modifying the protocol is an issue; it's still experimental in QEMU (and we're still going to modify the protocol when we adopt migration v2). Let's hear from @jlevon and @jraman567. |
Thanks @tmakatos. If anyone sees issues with approach (3) please let me know - I hope to have some cycles next week to start working on this. |
Cool!
While that's true today, that wouldn't be true if we ever wanted to do other implementations, over TCP/IP to a remote host, for example. So I don't think it would make sense to build something specific to the DMA path. This problem is one of the reasons we didn't want to stabilize the library yet. I agree with the enumeration of the three possible approaches. I don't really like 1). I'd sort of be OK with 3), though I can't speak for the qemu or vfio-user client maintainers. It feels like a bit of a workaround of inadequate libvfio-user design though, to be honest. It would also mean that async consumers of the API still need to block during device->guest DMA transactions, which is an issue in itself. The separate socket of 3) doesn't help at all there. That is, the dma read/write APIs are fundamentally broken anyway.
|
I see what you're saying, but for clarity we should separate the discussion into two concerns:
The libvfio-user changes to go fully async are relatively straightforward, I agree. The challenge is for code that consumes the API, because consuming async APIs causes ripple effects. If you look at qemu for example, it relies on synchronous DMA accesses throughout its hardware models. If we assume that qemu hardware models will stick with a synchronous DMA API, then there are two paths:
Bottom line: I don't really see how a libvfio-user that doesn't provide a synchronous DMA API can be consumed by qemu. This arguably causes some design warts, but a middle ground design that allows both sync/async DMA doesn't seem such a bad way forward, and it also gives API consumers an upgrade path to async. |
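To illustrate the kind of consumer code in question, here's roughly what a synchronous DMA read looks like in a qemu device model today (a minimal sketch: MyDeviceState and the descriptor handling are hypothetical, pci_dma_read() is qemu's existing helper, and exact header locations vary across qemu versions):

```c
#include "qemu/osdep.h"
#include "hw/pci/pci.h"     /* pci_dma_read(); header layout varies by qemu version */

typedef struct MyDeviceState {
    PCIDevice parent_obj;   /* hypothetical device model state */
} MyDeviceState;

static void my_device_fetch_descriptor(MyDeviceState *s, dma_addr_t desc_addr)
{
    uint8_t desc[16];

    /* The device model blocks here until the DMA read completes; with
     * message-based vfio-user DMA, this is exactly where the synchronous
     * round trip to the client happens. */
    if (pci_dma_read(&s->parent_obj, desc_addr, desc, sizeof(desc)) != MEMTX_OK) {
        return;             /* a real device would latch an error instead */
    }

    /* ... parse the descriptor and continue processing synchronously ... */
}
```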
Sure, but one informs the other: right now we can't provide an async API, because the underlying implementation is sync, while it's easier to build a sync API on top of an async implementation.
Yes, I would expect there to be both sync and async APIs, precisely for the sort of reasons you are mentioning. It's possible the sync version could read then re-queue any intermediate messages, perhaps, but there are risks there in the cases of quiesce, migration, unmap, and so on. |
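For illustration, here is a minimal sketch of that read-then-re-queue idea, using a made-up message header and ignoring partial reads (an assumption-laden simplification, not actual libvfio-user code):

```c
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Simplified stand-in for the vfio-user message header (illustration only). */
struct vfio_msg {
    uint16_t msg_id;
    uint16_t is_reply;
    uint32_t payload;
};

struct msg_queue {
    struct vfio_msg *items;
    size_t len, cap;
};

static int queue_push(struct msg_queue *q, const struct vfio_msg *m)
{
    if (q->len == q->cap) {
        size_t cap = q->cap ? q->cap * 2 : 8;
        struct vfio_msg *p = realloc(q->items, cap * sizeof(*p));
        if (p == NULL) {
            return -1;
        }
        q->items = p;
        q->cap = cap;
    }
    q->items[q->len++] = *m;
    return 0;
}

/* Block until the reply to `want_id` arrives; park any intermediate
 * client-initiated commands so the main loop can process them later. */
static int wait_for_reply(int sock, uint16_t want_id,
                          struct msg_queue *deferred, struct vfio_msg *reply)
{
    for (;;) {
        struct vfio_msg m;
        if (read(sock, &m, sizeof(m)) != (ssize_t)sizeof(m)) {
            return -1;      /* error or short read; partial reads ignored here */
        }
        if (m.is_reply && m.msg_id == want_id) {
            *reply = m;     /* this is the reply we were blocking on */
            return 0;
        }
        if (queue_push(deferred, &m) < 0) {
            return -1;
        }
    }
}
```

The drawbacks of this pattern (unbounded queuing, interactions with quiesce, migration and unmap) are exactly what the surrounding comments point out.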
Plus, it's pretty heavy-weight, since at least in theory it needs to queue up an unbounded number of messages of arbitrary sizes. It works though; it's approach (1) in my post above, and I have a hack that I'm currently running with. It seems like we're all in agreement that we should have async DMA (in addition to sync) at the libvfio-user API level. Then the open question is whether we do the queuing approach, or we introduce separate socket(s) for the server->client direction. Trying to summarize pros/cons (please call out anything I have missed): queuing:
separate socket(s):
As is probably obvious by now, I'm leaning towards separate socket(s), mostly because I think it's simpler and makes vfio-user easier to implement, but I can totally relate to the single-socket solution feeling conceptually cleaner. In the end, I don't really feel strongly though, just hoping to reach closure so we can move forward. |
I'm not clear on how a separate socket helps at all with providing a sync API; since the reply arrives on the original socket still, you still need to somehow process other incoming messages. |
Ah, so we've been talking past each other; I'm not intending to send the DMA reply on the main socket. Let me recap the separate socket idea to clarify:
This enables the DMA operation to be handled "out of band" of the main socket. Thus, a synchronous write-DMA-command-then-read-back-reply sequence on the sockets is unproblematic (since commands on the main socket are only ever sent by the client, and commands on the DMA socket are only ever sent by the server, the condition that is the root cause of the bug never occurs). Furthermore, the server can choose to synchronously wait for the DMA operation to complete and ignore all other I/O on the main socket during that time (this is what I was referring to when I said this is simpler to implement). |
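To make the "out of band" handling concrete, here's a minimal sketch of a synchronous DMA read over a dedicated DMA socket, with a made-up command framing rather than the real vfio-user wire format:

```c
#include <stdint.h>
#include <stddef.h>
#include <unistd.h>

/* Made-up command framing for illustration only. */
struct dma_cmd {
    uint16_t op;            /* hypothetical opcode: 1 = DMA read */
    uint64_t addr;
    uint64_t len;
};

static int dma_read_sync(int dma_sock, uint64_t addr, void *buf, size_t len)
{
    struct dma_cmd cmd = { .op = 1, .addr = addr, .len = len };

    /* Only the server ever sends commands on dma_sock ... */
    if (write(dma_sock, &cmd, sizeof(cmd)) != (ssize_t)sizeof(cmd)) {
        return -1;
    }

    /* ... so whatever arrives next on dma_sock must be the reply; there is
     * no way to collide with a client-initiated command here. Client
     * commands keep flowing on the main socket, untouched by this call. */
    ssize_t n = read(dma_sock, buf, len);
    return n == (ssize_t)len ? 0 : -1;
}
```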
I think I read "messages" here instead of "commands", hence the confusion. It's bi-directional, in fact. So, yeah, that sounds reasonable. |
oops |
@mnissler-rivos I asked Jag about this on slack, if you'd like to join |
I've just started reading the rest of the discussion, but the following just occurred to me:
What's the point of the vfio-user server requiring a response from the client for a DMA command? If the client doesn't like the command it received, the server doesn't need to know about that specific fact; the client might deal with it any way it sees fit (e.g. reset the device). So if we remove this behavior there's no problem in the first place, right? |
DMA read requires a response with the data |
For a DMA read, it wants the data from the client. There is no conceptual reason why that couldn't be delivered asynchronously though. In the case of qemu, the device models often have code that performs DMA reads to get the next network packet and then processes it, for example, and that code is mostly synchronous right now. So, IMHO we should definitely provide an async API, but keep a synchronous version to avoid breaking more simplistic server designs. |
D'oh of course.
What if the client never responds to the DMA read? What if it decides to reset the device, so the server should actually be paying attention to the main socket? I guess the client will have to start making assumptions that the server won't be paying attention to the main socket so it first has to fail the DMA read? |
Yep - we'll need to clearly define exact behaviour in these sorts of cases |
I guess it boils down to what real HW would do: if it was in the process of doing DMA would it be handling MMIO in the first place? |
I don't think there's a universal answer on how hardware behaves. At the PCIe TLP level, all packets are tagged in order to match up completions, so there aren't any inherent restrictions. I'm also not aware of any requirements to that effect in the PCIe spec, but I'll ask a coworker who knows it much better than me to make sure. That said, the spec does include a completion timeout mechanism (section 2.8) to deal with situations where an expected reply never arrives. From a VFIO software perspective, robust clients and servers should ideally not make any assumptions about communication patterns, in my opinion. Having implemented a fully asynchronous client now, I am painfully aware though that this can be challenging in practice. Plus, you can't just mandate full async and ignore the practical constraints in existing software such as qemu. Note that separating the sockets offers increased flexibility in implementing a correct client: it can spawn a thread to handle DMA separately from the main socket, and unless there is problematic cross-thread synchronization, everything will be fully async safe even without switching to an event-based design. |
This change introduces a new mode for servers to access DMA regions in the client. It reuses the existing VFIO_USER_DMA_{READ,WRITE} message type and format, but messages are not passed across the main communication socket. Instead, communication happens via a separate socket that gets supplied by the client when invoking VFIO_USER_DMA_MAP. The benefit of the new mode is to avoid situations where both client and server send messages at the same time, then expect the response as the next message that is received. When socket-based DMA access mode is used, commands on sockets are only ever sent by one side, side-stepping the command collision problem. For more details and discussion, please see nutanix#279.
not sure I have an opinion on that |
Here's a fresh snapshot of what I have right now: mnissler-rivos/libvfio-user@master...mnissler-rivos:libvfio-user:socket_dma Includes test fixes and a pytest for vfu_sgl_{read,write}. Note that this is still passing the socket fd in DMA_MAP, since I had this almost ready when we started discussing this angle yesterday. I will investigate a solution that sets up a context-scoped server -> client socket next. Happy to hear any input you may have, but no pressure, I still got work to do :-) |
@mnissler-rivos I think it's mature enough to become a PR (so we don't have to re-review it, I've already reviewed part of it); passing the socket fd during negotiation can be fixed up by additional commits. |
@tmakatos I hear you - but please give me a day or so to investigate the context-scoped reverse socket approach. There's a chance that with the alternative implementation I can hide all the details of the reverse socket in the transport layer. If that succeeds, most of the changes I have on the branch won't be necessary. |
@mnissler-rivos Setting up the reverse communication channel during negotiation sounds good to me too. Thank you! |
This change adds support for a separate socket to carry commands in the server-to-client direction. It has proven problematic to send commands in both directions over a single socket, since matching replies to commands can become non-trivial when both sides send commands at the same time and adds significant complexity. See issue nutanix#279 for details. To set up the reverse communication channel, the client indicates support for it via a new capability flag in the version message. The server will then create a fresh pair of sockets and pass one end to the client in its version reply. When the server wishes to send commands to the client at a later point, it now uses its end of the new socket pair rather than the main socket.
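For readers unfamiliar with the mechanism the commit message describes, here is a minimal sketch of a server creating a socket pair and handing one end to the client over the main socket via SCM_RIGHTS (simplified; the actual libvfio-user code wraps this in its transport layer and ties it to the version reply):

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

static int send_reverse_socket(int main_sock, int *server_end_out)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }

    char payload = 0;       /* stand-in for the version reply payload */
    struct iovec iov = { .iov_base = &payload, .iov_len = sizeof(payload) };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } u = { 0 };
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = u.buf, .msg_controllen = sizeof(u.buf),
    };

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &sv[1], sizeof(int));   /* the client receives sv[1] */

    if (sendmsg(main_sock, &msg, 0) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    close(sv[1]);            /* our copy of the client end is no longer needed */
    *server_end_out = sv[0]; /* the server sends its commands on sv[0] */
    return 0;
}
```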
Here's a proof of concept for setting up a separate server->client command socket at negotiation time: mnissler-rivos/libvfio-user@master...mnissler-rivos:libvfio-user:cmd_socket It ended up cleaner than I had feared, and it's future-proof in the sense that it transparently moves all server->client commands over to the separate socket. I haven't finished a suitable pytest yet, but I have tested against my client and confirmed that it works. It'd be great if you could weigh in on whether this approach is indeed what you want to go with - I'll be happy to turn this into a proper PR then. |
That is indeed a lot less than I expected! I have some comments but looks broadly fine, please open 2 PRs for the separate changes, and don't forget to update the vfio-user spec too. Thanks! |
Sorry for the delay, I was on vacation. I've just created a pull request that sets up a separate socket pair at negotiation time, complete with tests and specification updates. @jlevon you asked for 2 PRs above - not sure whether you'd like me to split this one or see a separate PR for the alternative approach of setting up a socket FD for each DMA mapping? |
@mnissler-rivos close_safely() is unrelated to these changes so should be a separate PR (with the other on top I guess). |
Ah, makes sense. Here is a separate pull request for close_safely(). |
Use separate socket for server->client commands This change adds support for a separate socket to carry commands in the server-to-client direction. It has proven problematic to send commands in both directions over a single socket, since matching replies to commands can become non-trivial when both sides send commands at the same time and adds significant complexity. See issue #279 for details. To set up the reverse communication channel, the client indicates support for it via a new capability flag in the version message. The server will then create a fresh pair of sockets and pass one end to the client in its version reply. When the server wishes to send commands to the client at a later point, it now uses its end of the new socket pair rather than the main socket. Corresponding replies are also passed back over the new socket pair. Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
fixed via #762 |
Wire up support for DMA for the case where the vfio-user client does not provide mmap()-able file descriptors, but DMA requests must be performed via the VFIO-user protocol. This installs an indirect memory region, which already works for pci_dma_{read,write}, and pci_dma_map works thanks to the existing DMA bounce buffering support. Note that while simple scenarios work with this patch, there's a known race condition in libvfio-user that will mess up the communication channel. See nutanix/libvfio-user#279 for details as well as a proposed fix. Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
There are cases where a libvfio-user public function (e.g. `vfu_dma_read`) sends a message to the client and then synchronously waits for the response. A response from the client cannot be avoided in the DMA read case, as it contains the read data. The problem is that right after having sent the message to the client, the client might send its own message, which will be received by the blocking call of the server. However, the server is expecting a completely different message.

`vfu_dma_read` is one of the functions that has this problem; effectively every public function that uses `vfu_msg` has the same problem, except the version negotiation ones. `vfu_irq_trigger` has the same problem, since the server might send a trigger IRQ message and immediately receive a new command from the client while it's expecting to receive the response for the IRQ message it just sent.

A similar situation can occur when the client is waiting for a response from the server and the server happens to send it a `VFIO_USER_VM_INTERRUPT`. This is a valid use case though; the client should be expecting such messages.
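To make the failure mode concrete, here is a small self-contained demonstration of the collision over a plain socketpair, using a made-up message format rather than the real vfio-user headers:

```c
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Stand-in for the vfio-user message header, for illustration only. */
struct msg { char kind[16]; };

int main(void)
{
    int fds[2];             /* fds[0] = "server" end, fds[1] = "client" end */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) {
        return 1;
    }

    /* The client happens to send its own command (e.g. a region access)
     * just before it would service the server's DMA read. */
    struct msg client_cmd = { .kind = "CLIENT_CMD" };
    write(fds[1], &client_cmd, sizeof(client_cmd));

    /* Server side of a synchronous DMA read: send the command ... */
    struct msg dma_read = { .kind = "DMA_READ" };
    write(fds[0], &dma_read, sizeof(dma_read));

    /* ... then block and treat the next incoming message as the reply. */
    struct msg reply;
    read(fds[0], &reply, sizeof(reply));
    printf("server expected a DMA_READ reply, got: %s\n", reply.kind);
    return 0;
}
```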