Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bridge: Improve handling of control messages #19637

Merged

Conversation

allisonkarlitskaya
Copy link
Member

@allisonkarlitskaya allisonkarlitskaya commented Nov 21, 2023

Remove the general purpose mechanism for sending JSON-formatted messages from the protocol level, leaving only the mechanism for sending control messages. Non-control JSON-formatted messages (such as those used for D-Bus replies) are now handled at the channel level and treated as normal data, subject to flow control. This removes one of the main interaction points between endpoints and the router.

Add a pair of functions to jsonutil which define the exact way in which we intend to handle keyword args when building control messages. We make a clear distinction between data which is meant to be handled "verbatim" and data which is meant to be subject to processing rules
(our '_' to '-' replacements, etc).

Start using the new jsonutil functions for all control message handling. That means that all control-message-creating paths now offer the opportunity to pass a verbatim dictionary as well as a set of kwargs. Propagate this change upwards, throughout. For forwarding messages from peers this is substantially cleaner because it means we don't rewrite their messages. The improved typing results in some extra mypy errors which requires some hinting at call sites.

One note: ideally, we'd use / in the argument list of all of these functions to ensure that the "verbatim object" field that we add everywhere can only be passed positionally. This is unfortunately only in Python 3.8. To work around that issue, we add a _msg field everywhere with its name chosen never to clash with an actual kwarg we might use (such as message). Unfortunately, mypy isn't yet convinced that this is a purely positional argument, and in places that we write **kwargs it has no way to know that one of the kwargs won't in fact be _msg, so it complains that the types don't match. In the cases where that happens, we can manually specify None to avoid that problem.

Copy link
Member

@martinpitt martinpitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I (think I) understand the purpose and direction of this PR, but I still have some questions about details.

"request changes" for the unported exc.kwargs bits.

src/cockpit/protocol.py Outdated Show resolved Hide resolved
src/cockpit/protocol.py Show resolved Hide resolved
src/cockpit/protocol.py Outdated Show resolved Hide resolved
def print_object(message: 'JsonObject | None', kwargs: JsonObject) -> bytes:
"""Construct and pretty-print a JSON object based on message and kwargs

If only message is given, it is pretty-printed, unmodified. If message is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, "unmodified" -- systemd_ctype's encoder encodes Variants to a t/v object and bytes to base64. AFAICS this is only ever used for channel data (in Channel.send_message()), and control messages are left unmodified. This feels upside down to me -- isn't it the data that we want to pass through unmodified, and we may want to prettify/interpret the control messages?

Can it ever happen that we get GVariant objects in data messages ? I don't think there's any way to create them in JS, and in Python you'd need gi.Glib. I figure binary data is a thing, though. So this is using systemd_ctypes' encoder for re-using the binary base64 encoder?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually a very interesting point, and it's an unintended positive side-effect of all of this work. Now that we no longer shove D-Bus data through the same channels as control messages, the control message formatting code no longer needs to be able to handle D-Bus types. We can implement that at the channel level or (better) at the D-Bus level itself, I suppose. Then we could completely decouple this low-level code from systemd_ctypes. That would be a very nice cleanup, indeed.

I'll look into this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My question above was too vague and I suppose I was misled. The one below is probably a better starting point.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I suppose what's left here is to clarify the "unmodified" in the comment. Also, is it actually a thing to get binary data anywhere in our wire protocol API which is part of a key/value object? (I.e. not .send_data())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your idea here has piqued my curiosity. I'm kinda interesting in pursuing it.

By and large, the overwhelming user of Channel.send_message() is the dbus channel. It uses it ~20 times, in many cases needing to send D-Bus data.

There are other users of JSON-as-payload (filesystem events, metrics updates, packages webserver) which are using plain data.

I have an idea to add a json encoder class attribute to the Channel class and use it for formatting the JSON-as-payload messages. The DBus channel could then override this.

I notice just now that the packages HTTP backend sends its status message as payload, but the HTTP backend sends a control message. I'm going to look into what's going on there. I'm surprised that they're different.

Copy link
Member Author

@allisonkarlitskaya allisonkarlitskaya Nov 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting — this is a protocol difference in http-stream1 (which -ws sends) and http-stream2 (which is used from cockpit.js).

src/cockpit/channel.py Outdated Show resolved Hide resolved
@allisonkarlitskaya allisonkarlitskaya force-pushed the message-cleanup branch 4 times, most recently from 8fd01b7 to 6314638 Compare November 22, 2023 15:46
Copy link
Member

@martinpitt martinpitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love what you did to the HttpStream channel, thanks for porting this!

I have one question about the "raw" thing (which is probably/hopefully just me having forgotten our previous conclusion), and the remaining print_object() docstring thread, but you already said that you want to massage this a bit anyway.

That said, I'm happy enough with this to approve if you want to land it as-is, and do a follow-up instead. Cheers!

# Never send these headers
remove = {'Connection', 'Transfer-Encoding'}

if binary != 'raw':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, while we are at the subject of making things compatible with the C bridge: that never does any string comparison to the "binary" field, in particular not "raw". It just checks its presence:

self->binary = json_object_has_member (options, "binary");

The documentation says "set to true", i.e. considering it a boolean field. I have a very strong déjà vu (or rather, déjà débattu), but I don't remember the outcome any more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to debate:

docs: https://github.com/cockpit-project/cockpit/blob/main/doc/protocol.md#command-open

  • "binary": If present set to "raw"

cockpit.js: https://github.com/cockpit-project/cockpit/blob/main/pkg/lib/cockpit.js#L849

        if (binary)
            command.binary = "raw";

I'm happy enough to change it to a presence check if you'd prefer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's what I was missing. So the C bridge is just too sloppy about this then. TBH I don't even know why we have to drop Content-Length for raw streams, so I don't have a preference.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to drop it for non-raw streams because Content-Length is binary data, but for text-based streams, we don't know in advance how many characters we'll pull out of the utf-8 data.

src/cockpit/channels/http.py Outdated Show resolved Hide resolved
Remove the general purpose mechanism for sending JSON-formatted
messages from the protocol level, leaving only the mechanism for sending
control messages.  Non-control JSON-formatted messages (such as those
used for D-Bus replies) are now handled at the channel level and treated
as normal data, subject to flow control.  The function for doing so has
been renamed from send_message() to send_json() for clarity, and returns
boolean (with the same meaning as send_data()), although nobody
currently pays any attention to this. This removes one of the main
interaction points between endpoints and the router.

send_json() uses a class attribute defined on Channel to encode the JSON
data.  By default, this is the default json.JSONEncoder, but channel
subclasses can provide their own encoder (D-Bus needs the encoder
provided by systemd_ctypes).

This is all driven by a new function in jsonutil which defines the exact
way in which we intend to handle keyword args when building control
messages.  We make a clear distinction between data which is meant to be
handled "verbatim" and data which is meant to be subject to processing
rules (our '_' to '-' replacements, etc).

Start using the new jsonutil function for all control message handling.
That means that all control-message-creating paths now offer the
opportunity to pass a verbatim dictionary as well as a set of kwargs.
Propagate this change upwards, throughout. For forwarding messages from
peers this is substantially cleaner because it means we don't rewrite
their messages.  The improved typing results in some extra mypy errors
which requires some hinting at call sites.

One note: ideally, we'd use `/` in the argument list of all of these
functions to ensure that the "verbatim object" field that we add
everywhere can only be passed positionally.  This is unfortunately only
in Python 3.8.  To work around that issue, we add a `_msg` field
everywhere with its name chosen never to clash with an actual kwarg we
might use (such as `message`).  Unfortunately, `mypy` isn't yet
convinced that this is a purely positional argument, and in places that
we write `**kwargs` it has no way to know that one of the kwargs won't
in fact be `_msg`, so it complains that the types don't match.  In the
cases where that happens, we can manually specify `None` to avoid that
problem.
This fixes many issues, including:

- Based on AsyncChannel: The new code is based on AsyncChannel and
handles blocking operations by using .run_in_executor() instead of
creating a working thread.  All control flow now happens in a
straight line in the run() function.

- Implements flow control: The previous code would read many small
blocks in a thread and flood the main thread with requests to write
them through immediately, without any attempt at flow control.  The
new code will block if the AsyncChannel send queue backs up.

- Better type safety: we use our new JSON helpers to do this.  The
argument parsing is also less redundant in general since we don't
have to do up-front checking and then look everything up again later
in the worker thread.

- Fewer manual exit paths: use ChannelError instead of manual closes.

- Fix header handling: The header handling was incorrect before,
compared with the C bridge: we were supposed to always remove
'Connection' and 'Transfer-Encoding', and to remove 'Content-Length'
and 'Range' for non-binary channels, but we ended up getting the
logic wrong, removing 'Content-Length' and 'Range' for binary
channels and 'Conntection' and 'Transfer-Encoding' for text.  The
behaviour now matches the C bridge.

- Safety fixes: The previous code had at least one case of unsafely
sending data from the worker thread directly to the channel.  This is
no longer possible under the new design.
These two exception types are extremely similar and would benefit from
being unified.  In order to do that, we need to make `Channel.close()`
operate on a dictionary (as is found in CockpitProblem.attrs) rather
than a set of kwargs (as was previously stored on ChannelError).

Add a couple of type signatures around to help us make sure we get this
right.
@martinpitt
Copy link
Member

That rawhide failure is entirely unrelated, apparently the CLI tool output just changed a bit:

'cockpit; type web' not found in '220 - alice (1003)\n Since: Wed 2023-11-22 17:16:22 UTC; 408ms ago\n State: active\n Leader: 131668 (cockpit-session)\n Seat: \n TTY: web console\n Remote: ::1\nService: cockpit\n Type: web\n Class: user\n Idle: no\n Unit: session-220.scope\n ├─131668 /usr/libexec/cockpit-session localhost\n └─131688 /usr/bin/python3 /usr/bin/cockpit-bridge\n\nNov 22 17:16:22 ip-172-31-26-180.us-east-2.compute.internal systemd[1]: Started session-220.scope - Session 220 of User alice.\n'

I'll send a fix tomorrow.

Copy link
Member

@martinpitt martinpitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lovely! 😻

@allisonkarlitskaya allisonkarlitskaya merged commit 8295258 into cockpit-project:main Nov 22, 2023
92 of 93 checks passed
@allisonkarlitskaya allisonkarlitskaya deleted the message-cleanup branch November 22, 2023 18:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants