Describe the twin-socket feature in the spec #775

Merged
merged 8 commits into from
Sep 15, 2023
76 changes: 68 additions & 8 deletions docs/vfio-user.rst
@@ -204,12 +204,32 @@ A server can serve:
1) one or more clients, and/or
2) one or more virtual devices, belonging to one or more clients.

The current protocol specification requires a dedicated socket per
client/server connection. It is a server-side implementation detail whether a
single server handles multiple virtual devices from the same or multiple
clients. The location of the socket is implementation-specific. Multiplexing
clients, devices, and servers over the same socket is not supported in this
version of the protocol.
The current protocol specification requires dedicated sockets per
client/server connection. Commands in the client-to-server direction are
handled on the main communication socket which the client connects to, and
replies to these commands are passed on the same socket. Commands sent in the
other direction from the server to the client as well as their corresponding
replies can optionally be passed across a separate socket, which is set up
during negotiation (AF_UNIX servers just pass the file descriptor).

Using separate sockets for each command channel avoids introducing an
artificial point of synchronization between the channels. This simplifies
implementations since it obviates the need to demultiplex incoming messages
into commands and replies and interleave command handling and reply processing.
Note that it is still illegal for implementations to stall command or reply
processing indefinitely while waiting for replies on the other channel, as this
may lead to deadlocks. However, since incoming commands and replies arrive on
different sockets, it's possible to meet this requirement e.g. by running two
independent request processing threads that can internally operate
synchronously. It is expected that this is simpler to implement than fully
asynchronous message handling code. Implementations may still choose a fully
asynchronous, event-based design for other reasons, and the protocol fully
supports it.
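
The two-thread approach described above can be sketched as follows. This is a minimal illustration, not code from any vfio-user implementation: the function names, the byte-string payloads (``b"region-write"``, ``b"dma-read"``, ``b"guest-data"``), and the use of ``socket.socketpair()`` in place of a negotiated connection are all assumptions, and real messages carry a vfio-user header that is omitted here. The server blocks on the twin socket while handling a client command, and the exchange still completes because the client answers server-to-client commands on a separate thread.

```python
import socket
import threading


def client_twin_loop(twin_client):
    """Client side: synchronously answer server-to-client commands."""
    with twin_client:
        while True:
            command = twin_client.recv(4096)
            if not command:  # server closed the twin channel
                break
            # Hypothetical reply payload standing in for guest memory contents.
            twin_client.sendall(b"guest-data")


def server_main_loop(main_server, twin_server):
    """Server side: synchronously handle client-to-server commands."""
    with main_server, twin_server:
        while True:
            command = main_server.recv(4096)
            if not command:  # client closed the main channel
                break
            # Blocking round trip on the twin socket while a client command
            # is still in flight on the main socket.
            twin_server.sendall(b"dma-read")
            dma_reply = twin_server.recv(4096)
            main_server.sendall(b"done:" + dma_reply)


def run_demo():
    main_client, main_server = socket.socketpair()
    twin_client, twin_server = socket.socketpair()
    threads = [
        threading.Thread(target=client_twin_loop, args=(twin_client,)),
        threading.Thread(target=server_main_loop, args=(main_server, twin_server)),
    ]
    for thread in threads:
        thread.start()

    # The client's main thread blocks here until the server replies.
    main_client.sendall(b"region-write")
    reply = main_client.recv(4096)

    # Closing the main socket lets both loops observe EOF and exit.
    main_client.close()
    for thread in threads:
        thread.join()
    return reply
```

With a single shared socket, the same blocking style would require the client to demultiplex the incoming ``dma-read`` command from the reply it is waiting for.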

It is a server-side implementation detail whether a single server handles
multiple virtual devices from the same or multiple clients. The location of the
socket is implementation-specific. Multiplexing clients, devices, and servers
over the same socket is not supported in this version of the protocol.

Authentication
--------------
@@ -503,6 +523,10 @@ Capabilities:
| migration          | object | Migration capability parameters. If missing    |
|                    |        | then migration is not supported by the sender. |
+--------------------+--------+------------------------------------------------+
| twin_socket        | object | Parameters for twin-socket mode, which handles |
|                    |        | server-to-client commands and their replies on |
|                    |        | a separate socket. Optional.                   |
+--------------------+--------+------------------------------------------------+

The migration capability contains the following name/value pairs:

@@ -513,12 +537,44 @@ The migration capability contains the following name/value pairs:
|        |        | between the client and the server is used.    |
+--------+--------+-----------------------------------------------+

The ``twin_socket`` capability object holds these name/value pairs:

+----------+---------+--------------------------------------------------------+
| Name     | Type    | Description                                            |
+==========+=========+========================================================+
| enable   | boolean | Indicates whether the client wants to enable           |
|          |         | twin-socket mode. Optional, defaults to false, only    |
|          |         | valid in the request message.                          |
+----------+---------+--------------------------------------------------------+
| fd_index | number  | Specifies an index in the file descriptor array        |
|          |         | included with the message. The designated file         |
|          |         | descriptor is a socket which is to be used for the     |
|          |         | server-to-client command channel. Optional, only valid |
|          |         | in the reply message.                                  |
+----------+---------+--------------------------------------------------------+
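
As an illustration of the two fields, the sketch below builds the capabilities object for a request and for a reply. The helper names are hypothetical, and the functions only model the capabilities object itself; how that object is wrapped in the version message is not shown here. The ``enable`` field appears only in the request and ``fd_index`` only in the reply, and a missing ``enable`` is treated as false.

```python
import json


def make_request_capabilities(enable_twin_socket):
    """Client side: capabilities object sent with the version request."""
    caps = {}
    if enable_twin_socket:
        # "enable" is only valid in the request message.
        caps["twin_socket"] = {"enable": True}
    return caps


def client_wants_twin_socket(request_caps):
    """Server side: "enable" is optional and defaults to false."""
    return request_caps.get("twin_socket", {}).get("enable", False) is True


def make_reply_capabilities(request_caps, fd_index=None):
    """Server side: capabilities object sent with the version reply."""
    caps = {}
    # "fd_index" is only valid in the reply, and only if the client asked.
    if fd_index is not None and client_wants_twin_socket(request_caps):
        caps["twin_socket"] = {"fd_index": fd_index}
    return caps


request = make_request_capabilities(True)
reply = make_reply_capabilities(request, fd_index=0)
print(json.dumps(request))  # {"twin_socket": {"enable": true}}
print(json.dumps(reply))    # {"twin_socket": {"fd_index": 0}}
```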

Reply
^^^^^

The same message format is used in the server's reply with the semantics
described above.

If and only if the client has requested to enable twin-socket mode by setting
``twin_socket.enable`` to true in its capabilities, the server may optionally
set up a separate command channel for server-to-client commands and their
replies. The server enables twin-socket mode as follows:

* Create a fresh socket pair.
* Keep the server end of the socket pair and pass the client end in the file
  descriptor array included with the reply message.
* Indicate the index in the file descriptor array by the
  ``twin_socket.fd_index`` capability field in the reply, so the client can
  identify the correct file descriptor to use.

The twin-socket feature is optional, so some servers may not support it.
However, for server implementations that do send server-to-client commands it is
strongly recommended to implement twin-socket support.
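
The setup steps above can be sketched over an AF_UNIX connection as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, the reply consists of the bare capabilities JSON without the vfio-user message header, and the client is assumed to have already sent ``twin_socket.enable`` set to true. It relies on ``socket.send_fds`` and ``socket.recv_fds`` (Python 3.9 or later, Unix only), which transfer descriptors as ``SCM_RIGHTS`` ancillary data.

```python
import json
import socket


def server_reply_with_twin_socket(main_server):
    """Server side: create the socket pair and hand the client end over."""
    server_end, client_end = socket.socketpair()
    fds = [client_end.fileno()]
    # fd_index names the position of the twin socket in the fd array.
    reply_caps = {"twin_socket": {"fd_index": 0}}
    socket.send_fds(main_server, [json.dumps(reply_caps).encode()], fds)
    client_end.close()  # the receiver now holds its own reference
    return server_end


def client_take_twin_socket(main_client):
    """Client side: pick the twin socket out of the received fd array."""
    data, fds, _flags, _addr = socket.recv_fds(main_client, 4096, 8)
    twin = json.loads(data).get("twin_socket", {})
    index = twin.get("fd_index")
    if index is None or not 0 <= index < len(fds):
        return None  # server did not enable twin-socket mode
    return socket.socket(fileno=fds[index])


def run_demo():
    main_client, main_server = socket.socketpair()
    server_end = server_reply_with_twin_socket(main_server)
    client_end = client_take_twin_socket(main_client)
    # Server-to-client commands now travel over the new channel.
    server_end.sendall(b"dma-read")
    received = client_end.recv(4096)
    for sock in (main_client, main_server, server_end, client_end):
        sock.close()
    return received
```

A client that receives a reply without ``twin_socket.fd_index`` continues to expect server-to-client commands on the main socket.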

``VFIO_USER_DMA_MAP``
---------------------

@@ -1399,7 +1455,9 @@ Reply
-----------------------

If the client has not shared mappable memory, the server can use this message to
read from guest memory.
read from guest memory. This message and its reply are passed over the separate
server-to-client socket if twin-socket mode has been negotiated during
connection setup.

Request
^^^^^^^
@@ -1437,7 +1495,9 @@ Reply
-----------------------

If the client has not shared mappable memory, the server can use this message to
write to guest memory.
write to guest memory. This message and its reply are passed over the separate
server-to-client socket if twin-socket mode has been negotiated during
connection setup.

Request
^^^^^^^