
Clipboard

Brent Baccala edited this page Sep 14, 2022 · 1 revision

--- X11 protocol === VNC protocol

 < X11 Clients >
 < X11 Clients >  ---- < Desktop X11 Server > === < ssvncviewer > ---- < Multiplex X11 Server > === < noVNC client >
 < X11 Clients >             < dummy client >                                  < dummy client >

Whenever the user copies text on his laptop, the React component calls rfb.clipboardPasteFrom, which sends either an extendedClipboardNotify or a clientCutText to the Multiplex X11 Server. We need to relay this immediately to the Desktop X11 Server. The Multiplex Server's dummy client grabs the selection. If ssvncviewer held the selection, it detects the loss, queries for the new selection, and sends it via VNC extendedClipboardNotify or clientCutText to the Desktop X11 Server, which grabs the selection and makes it available to the X11 Clients.

Whenever the user copies text on his remote desktop, the X11 client grabs the selection. The Desktop X11 Server probes it for formats. Once it gets an acceptable format (STRING or UTF8_STRING), it announces the selection to ssvncviewer, obtains the data from the X11 client (which continues to hold the selection), and transfers it to ssvncviewer. The ssvncviewer grabs the selection on the Multiplex X11 Server, and the same sequence repeats one hop further: the Multiplex X11 Server gets the clipboard data and transfers it to the noVNC client. At the end of this process, the X11 Client holds the selection on the Desktop X11 Server, and ssvncviewer holds the selection on the Multiplex X11 Server.

What if the X11 Client now changes the selection? It still holds the selection on the Desktop X11 Server, so it does nothing, and no notification reaches the server. This is unacceptable because the new text needs to be transferred to ssvncviewer.

To ensure that copies on the remote desktop get relayed through, the dummy client on the Desktop X11 Server needs to be greedy about getting the selection back, so that it can pick up a notification from the X11 Clients when they change their selection. The dummy client on the Multiplex X11 Server needs to be greedy, too. This is how a Clipboard Manager behaves.

There's a "Clipboard Manager Specification". Only a single clipboard manager client can be operating at any one time, because a clipboard manager is greedy about always wanting to claim ownership of the clipboard selection. The clipboard manager looks for a selection named CLIPBOARD_MANAGER and exits if that selection is already owned (this is how xclipboard behaves).
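The difference between a passive owner and a greedy one can be seen in a small model. The sketch below is a toy simulation in plain JavaScript, not Xlib: `Selection`, `AppClient`, and `GreedyManager` are hypothetical names, `setOwner` stands in for XSetSelectionOwner, and `convert` stands in for a selection request. It assumes, as in X11, that a client re-asserting a selection it already owns generates no SelectionClear for anyone.

```javascript
// Toy model of a single X11 selection (e.g. CLIPBOARD). No real Xlib calls.
class Selection {
    constructor() {
        this.owner = null;
    }

    // Stand-in for XSetSelectionOwner: only a *different* previous owner
    // is told that it lost the selection (SelectionClear).
    setOwner(client) {
        const previous = this.owner;
        this.owner = client;
        if (previous !== null && previous !== client) {
            previous.onSelectionClear(this);
        }
    }

    // Stand-in for a selection request: ask the current owner for its data.
    convert() {
        return this.owner === null ? null : this.owner.provide();
    }
}

// An ordinary application: grabs the selection when it copies.
class AppClient {
    constructor(selection) {
        this.selection = selection;
        this.text = "";
    }
    copy(text) {
        this.text = text;
        this.selection.setOwner(this);
    }
    provide() {
        return this.text;
    }
    onSelectionClear() {
        // A plain application does nothing when it loses the selection.
    }
}

// A greedy dummy client: whenever it loses the selection it reads the new
// contents, relays them, and immediately takes ownership back.
class GreedyManager {
    constructor(selection, relay) {
        this.relay = relay;
        const current = selection.convert();
        this.cached = current === null ? "" : current;
        selection.setOwner(this);
    }
    provide() {
        return this.cached;
    }
    onSelectionClear(selection) {
        this.cached = selection.convert(); // data from the new owner
        this.relay(this.cached);           // e.g. forward over VNC
        selection.setOwner(this);          // be greedy: grab it back
    }
}
```

With only an `AppClient`, a second `copy()` notifies no one, which is the problem described above. Once a `GreedyManager` is attached, every `copy()` takes the selection away from the manager, which relays the text and takes ownership back, so the next change is seen as well.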

To ensure that copies on the laptop get relayed through, the X11 servers can't cache the value of the selection. Instead, when the X11 client requests the value of the selection, the dummy client on the Desktop X11 Server needs to request the clipboard value via VNC. Extended clipboard lets us do this, as it supports a "request" message that is answered with a "provide" message.
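A minimal sketch of that "fetch on demand" behaviour follows. It is a model only: `LazyDummyClient`, `onNotify`, `onSelectionRequest`, and the `sendRequest` callback are hypothetical names, and `sendRequest(cb)` is assumed to send an extended clipboard "request" upstream and call `cb(text)` when the matching "provide" arrives.

```javascript
// Toy model: the dummy client owns the selection but stores no data.
class LazyDummyClient {
    constructor(sendRequest) {
        this.sendRequest = sendRequest; // hypothetical VNC "request" sender
        this.ownsSelection = false;
    }

    // A VNC notify arrived: grab the selection, but do not cache anything.
    onNotify() {
        this.ownsSelection = true;
    }

    // An X11 client asked for the selection contents. Fetch the value over
    // VNC at that moment and hand it to reply() when "provide" comes back.
    onSelectionRequest(reply) {
        if (!this.ownsSelection) {
            reply(null); // we are not the owner, nothing to offer
            return;
        }
        this.sendRequest((text) => reply(text));
    }
}
```

Because nothing is cached, a later change on the laptop is picked up by the next selection request even if no new notify was received in between.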

Standard clipboard only supports a ServerCutText announcement. This is adequate if there is no Multiplex layer:

 < X11 Clients >
 < X11 Clients >  ---- < Desktop X11 Server > === < noVNC client >
 < X11 Clients >             < dummy client >

In this situation, the Desktop X11 Server always gets a message from the noVNC client when the laptop's clipboard changes, which causes the dummy client to grab the selection. Any further changes get stored by the dummy client without any selection changes. Then, when an X11 client requests the selection, it gets the current value.

What happens if an X11 client owns a changing selection in the no-Multiplex case? Then the noVNC client had better support extended clipboard, so that it requests the clipboard, which in turn causes the dummy client to request the current value of the selection. noVNC does support Extended Clipboard. Every time it gets a Notify, it sends a Request, and when it gets a Provide, it dispatches a "clipboard" event.
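On the browser side, the "clipboard" event is the only part of this exchange the application sees. A minimal sketch of subscribing to it, assuming `rfb` is a connected noVNC RFB object; `watchRemoteClipboard` and `onRemoteText` are hypothetical names:

```javascript
// Forward text from noVNC "clipboard" events to a callback.
// noVNC delivers the received text in event.detail.text.
function watchRemoteClipboard(rfb, onRemoteText) {
    const handler = (event) => onRemoteText(event.detail.text);
    rfb.addEventListener("clipboard", handler);
    // Return a function that removes the subscription again.
    return () => rfb.removeEventListener("clipboard", handler);
}
```

In a page, `onRemoteText` could be something like `(text) => navigator.clipboard.writeText(text)`, subject to the browser's clipboard permissions.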

Whenever an X11 client grabs the selection, the server / dummy client probes for TARGETS and, if it gets a valid target, sends a Notify to the clients, which causes noVNC to send a VNC request, which triggers an X11 selection request. When the X11 client reports back its selection, the server sends the clipboard data to noVNC, and the X11 client maintains the selection.
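The "valid target" check can be sketched as a small function over the list returned by the TARGETS probe. The two acceptable targets are the text formats named earlier (STRING and the UTF8_STRING atom); preferring UTF8_STRING over STRING is an assumption of this sketch, and `pickTarget` is a hypothetical name.

```javascript
// Acceptable text targets, in an assumed order of preference.
const ACCEPTABLE_TARGETS = ["UTF8_STRING", "STRING"];

// Given the target names an owner offers, return the one to request,
// or null if no acceptable text target is offered (so no Notify is sent).
function pickTarget(offeredTargets) {
    for (const target of ACCEPTABLE_TARGETS) {
        if (offeredTargets.includes(target)) {
            return target;
        }
    }
    return null;
}
```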

Yet, without the Multiplex layer in the middle, somehow this works, because the remote desktop's enclosing element generates an onMouseEnter event every time the mouse is clicked in the remote desktop window, which I've set to read the laptop's clipboard and transfer it to the X11 Server. Effectively, every time you click on the remote desktop, the dummy client grabs the selection. I'm thinking that the mouseEnter event fires on every click because of the following code in noVNC/core/rfb.js:
    // Always grab focus on some kind of click event
    this._canvas.addEventListener("mousedown", this._eventHandlers.focusCanvas);

So, every time you click the mouse, focus() is called. But calling focus() from the console doesn't produce mouseEnter.
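For reference, the onMouseEnter handler described above might look roughly like the following. This is a sketch, not the actual component code: `relayLaptopClipboard` is a hypothetical helper, `rfb` is assumed to be a connected noVNC RFB object, and `readText` is assumed to return a promise for the laptop's clipboard text (in a browser, `() => navigator.clipboard.readText()`, which needs a secure context and clipboard permission).

```javascript
// Read the laptop's clipboard and send it to the VNC server.
// Returns a promise that resolves to the text that was sent.
function relayLaptopClipboard(rfb, readText) {
    return readText().then((text) => {
        rfb.clipboardPasteFrom(text); // noVNC sends this on to the server
        return text;
    });
}

// Possible wiring in the React component (illustrative only):
//   onMouseEnter={() =>
//       relayLaptopClipboard(rfb, () => navigator.clipboard.readText())}
```

Note that the helper sends the text on every call, even if it has not changed; that matches the behaviour described above, where every click causes the dummy client to grab the selection again.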

How does gnome-terminal make this work? It's a gtk app, and if we look in gtkclipboard.c, we find the following comment:

    /* This function makes a very good guess at what the correct
     * timestamp for a selection request should be. If there is
     * a currently processed event, it uses the timestamp for that
     * event, otherwise it uses the current server time. However,
     * if the time resulting from that is older than the time used
     * last time, it uses the time used last time instead.
     * In order implement this correctly, we never use CurrentTime,
     * but actually retrieve the actual timestamp from the server.
     * This is a little slower but allows us to make the guarantee
     * that the times used by this application will always ascend
     * and we won't get selections being rejected just because
     * we are using a correct timestamp from an event, but used
     * CurrentTime previously.
     */

So gtk never uses CurrentTime. Instead, it retrieves the timestamp from the server:

    timestamp = gdk_x11_get_server_time (gtk_widget_get_window (clipboard_widget));

and gdk_x11_get_server_time works like this:

    static Bool
    timestamp_predicate (Display *display,
                         XEvent  *xevent,
                         XPointer arg)
    {
      Window xwindow = GPOINTER_TO_UINT (arg);
      GdkDisplay *gdk_display = gdk_x11_lookup_xdisplay (display);

      if (xevent->type == PropertyNotify &&
          xevent->xproperty.window == xwindow &&
          xevent->xproperty.atom == gdk_x11_get_xatom_by_name_for_display (gdk_display,
                                                                           "GDK_TIMESTAMP_PROP"))
        return True;

      return False;
    }

    guint32
    gdk_x11_get_server_time (GdkWindow *window)
    {
      Display *xdisplay;
      Window   xwindow;
      guchar   c = 'a';
      XEvent   xevent;
      Atom     timestamp_prop_atom;

      g_return_val_if_fail (GDK_IS_WINDOW (window), 0);
      g_return_val_if_fail (!GDK_WINDOW_DESTROYED (window), 0);

      xdisplay = GDK_WINDOW_XDISPLAY (window);
      xwindow = GDK_WINDOW_XID (window);
      timestamp_prop_atom =
        gdk_x11_get_xatom_by_name_for_display (GDK_WINDOW_DISPLAY (window),
                                               "GDK_TIMESTAMP_PROP");

      XChangeProperty (xdisplay, xwindow, timestamp_prop_atom,
                       timestamp_prop_atom,
                       8, PropModeReplace, &c, 1);

      XIfEvent (xdisplay, &xevent,
                timestamp_predicate, GUINT_TO_POINTER(xwindow));

      return xevent.xproperty.time;
    }

This is the same basic approach suggested in https://stackoverflow.com/questions/61849695
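The timestamp rule from the gtkclipboard.c comment (use the current event's time if there is one, otherwise the server time, and never go backwards) can be sketched on its own. This is a simplified model rather than GTK's code: `makeTimestampChooser` and `getServerTime` are hypothetical names, `getServerTime` stands in for a round trip such as gdk_x11_get_server_time, and the wraparound of 32-bit X11 timestamps is deliberately ignored.

```javascript
// Returns a function that picks the timestamp for the next selection
// request. Pass the current event's timestamp, or null if no event is
// being processed.
function makeTimestampChooser(getServerTime) {
    let lastUsed = 0;
    return function chooseTimestamp(eventTime) {
        // Prefer the event's timestamp; otherwise ask the server.
        let timestamp = eventTime == null ? getServerTime() : eventTime;
        // Never use a time older than the one used last time.
        if (timestamp < lastUsed) {
            timestamp = lastUsed;
        }
        lastUsed = timestamp;
        return timestamp;
    };
}
```

The point of the design is that the chosen times always ascend, so a selection request is not rejected merely because an earlier request used a later time.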

Now, when the user copies text on his laptop, a VNC exchange transfers it to the Multiplex X11 Server, whose dummy client then grabs the selection. This causes the vncviewer to lose the selection. When the vncviewer loses the selection, it should notify (via VNC) the Desktop X11 Server that it has selection data available via announceClipboard. Then the desktop server will grab local ownership of the selection (this is already in the code for enhanced clipboard clients).
