-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bridge: Improve handling of control messages #19637
bridge: Improve handling of control messages #19637
Conversation
fcb6fa4
to
3ec3155
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I (think I) understand the purpose and direction of this PR, but I still have some questions about details.
"request changes" for the unported exc.kwargs
bits.
src/cockpit/jsonutil.py
Outdated
def print_object(message: 'JsonObject | None', kwargs: JsonObject) -> bytes: | ||
"""Construct and pretty-print a JSON object based on message and kwargs | ||
|
||
If only message is given, it is pretty-printed, unmodified. If message is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, "unmodified" -- systemd_ctype's encoder encodes Variants to a t/v object and bytes to base64. AFAICS this is only ever used for channel data (in Channel.send_message()
), and control messages are left unmodified. This feels upside down to me -- isn't it the data that we want to pass through unmodified, and we may want to prettify/interpret the control messages?
Can it ever happen that we get GVariant objects in data messages ? I don't think there's any way to create them in JS, and in Python you'd need gi.Glib. I figure binary data is a thing, though. So this is using systemd_ctypes' encoder for re-using the binary base64 encoder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually a very interesting point, and it's an unintended positive side-effect of all of this work. Now that we no longer shove D-Bus data through the same channels as control messages, the control message formatting code no longer needs to be able to handle D-Bus types. We can implement that at the channel level or (better) at the D-Bus level itself, I suppose. Then we could completely decouple this low-level code from systemd_ctypes. That would be a very nice cleanup, indeed.
I'll look into this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My question above was too vague and I suppose I was misled. The one below is probably a better starting point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I suppose what's left here is to clarify the "unmodified" in the comment. Also, is it actually a thing to get binary data anywhere in our wire protocol API which is part of a key/value object? (I.e. not .send_data()
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your idea here has piqued my curiosity. I'm kinda interesting in pursuing it.
By and large, the overwhelming user of Channel.send_message()
is the dbus channel. It uses it ~20 times, in many cases needing to send D-Bus data.
There are other users of JSON-as-payload (filesystem events, metrics updates, packages webserver) which are using plain data.
I have an idea to add a json encoder class attribute to the Channel class and use it for formatting the JSON-as-payload messages. The DBus channel could then override this.
I notice just now that the packages HTTP backend sends its status message as payload, but the HTTP backend sends a control message. I'm going to look into what's going on there. I'm surprised that they're different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting — this is a protocol difference in http-stream1
(which -ws
sends) and http-stream2
(which is used from cockpit.js
).
8fd01b7
to
6314638
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love what you did to the HttpStream channel, thanks for porting this!
I have one question about the "raw" thing (which is probably/hopefully just me having forgotten our previous conclusion), and the remaining print_object()
docstring thread, but you already said that you want to massage this a bit anyway.
That said, I'm happy enough with this to approve if you want to land it as-is, and do a follow-up instead. Cheers!
# Never send these headers | ||
remove = {'Connection', 'Transfer-Encoding'} | ||
|
||
if binary != 'raw': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, while we are at the subject of making things compatible with the C bridge: that never does any string comparison to the "binary" field, in particular not "raw"
. It just checks its presence:
self->binary = json_object_has_member (options, "binary");
The documentation says "set to true", i.e. considering it a boolean field. I have a very strong déjà vu (or rather, déjà débattu), but I don't remember the outcome any more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to debate:
docs: https://github.com/cockpit-project/cockpit/blob/main/doc/protocol.md#command-open
- "binary": If present set to "raw"
cockpit.js: https://github.com/cockpit-project/cockpit/blob/main/pkg/lib/cockpit.js#L849
if (binary)
command.binary = "raw";
I'm happy enough to change it to a presence check if you'd prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that's what I was missing. So the C bridge is just too sloppy about this then. TBH I don't even know why we have to drop Content-Length for raw streams, so I don't have a preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have to drop it for non-raw streams because Content-Length is binary data, but for text-based streams, we don't know in advance how many characters we'll pull out of the utf-8 data.
Remove the general purpose mechanism for sending JSON-formatted messages from the protocol level, leaving only the mechanism for sending control messages. Non-control JSON-formatted messages (such as those used for D-Bus replies) are now handled at the channel level and treated as normal data, subject to flow control. The function for doing so has been renamed from send_message() to send_json() for clarity, and returns boolean (with the same meaning as send_data()), although nobody currently pays any attention to this. This removes one of the main interaction points between endpoints and the router. send_json() uses a class attribute defined on Channel to encode the JSON data. By default, this is the default json.JSONEncoder, but channel subclasses can provide their own encoder (D-Bus needs the encoder provided by systemd_ctypes). This is all driven by a new function in jsonutil which defines the exact way in which we intend to handle keyword args when building control messages. We make a clear distinction between data which is meant to be handled "verbatim" and data which is meant to be subject to processing rules (our '_' to '-' replacements, etc). Start using the new jsonutil function for all control message handling. That means that all control-message-creating paths now offer the opportunity to pass a verbatim dictionary as well as a set of kwargs. Propagate this change upwards, throughout. For forwarding messages from peers this is substantially cleaner because it means we don't rewrite their messages. The improved typing results in some extra mypy errors which requires some hinting at call sites. One note: ideally, we'd use `/` in the argument list of all of these functions to ensure that the "verbatim object" field that we add everywhere can only be passed positionally. This is unfortunately only in Python 3.8. To work around that issue, we add a `_msg` field everywhere with its name chosen never to clash with an actual kwarg we might use (such as `message`). Unfortunately, `mypy` isn't yet convinced that this is a purely positional argument, and in places that we write `**kwargs` it has no way to know that one of the kwargs won't in fact be `_msg`, so it complains that the types don't match. In the cases where that happens, we can manually specify `None` to avoid that problem.
This fixes many issues, including: - Based on AsyncChannel: The new code is based on AsyncChannel and handles blocking operations by using .run_in_executor() instead of creating a working thread. All control flow now happens in a straight line in the run() function. - Implements flow control: The previous code would read many small blocks in a thread and flood the main thread with requests to write them through immediately, without any attempt at flow control. The new code will block if the AsyncChannel send queue backs up. - Better type safety: we use our new JSON helpers to do this. The argument parsing is also less redundant in general since we don't have to do up-front checking and then look everything up again later in the worker thread. - Fewer manual exit paths: use ChannelError instead of manual closes. - Fix header handling: The header handling was incorrect before, compared with the C bridge: we were supposed to always remove 'Connection' and 'Transfer-Encoding', and to remove 'Content-Length' and 'Range' for non-binary channels, but we ended up getting the logic wrong, removing 'Content-Length' and 'Range' for binary channels and 'Conntection' and 'Transfer-Encoding' for text. The behaviour now matches the C bridge. - Safety fixes: The previous code had at least one case of unsafely sending data from the worker thread directly to the channel. This is no longer possible under the new design.
These two exception types are extremely similar and would benefit from being unified. In order to do that, we need to make `Channel.close()` operate on a dictionary (as is found in CockpitProblem.attrs) rather than a set of kwargs (as was previously stored on ChannelError). Add a couple of type signatures around to help us make sure we get this right.
6314638
to
53a1cdb
Compare
That rawhide failure is entirely unrelated, apparently the CLI tool output just changed a bit:
I'll send a fix tomorrow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lovely! 😻
8295258
into
cockpit-project:main
Remove the general purpose mechanism for sending JSON-formatted messages from the protocol level, leaving only the mechanism for sending control messages. Non-control JSON-formatted messages (such as those used for D-Bus replies) are now handled at the channel level and treated as normal data, subject to flow control. This removes one of the main interaction points between endpoints and the router.
Add a pair of functions to jsonutil which define the exact way in which we intend to handle keyword args when building control messages. We make a clear distinction between data which is meant to be handled "verbatim" and data which is meant to be subject to processing rules
(our '_' to '-' replacements, etc).
Start using the new jsonutil functions for all control message handling. That means that all control-message-creating paths now offer the opportunity to pass a verbatim dictionary as well as a set of kwargs. Propagate this change upwards, throughout. For forwarding messages from peers this is substantially cleaner because it means we don't rewrite their messages. The improved typing results in some extra mypy errors which requires some hinting at call sites.
One note: ideally, we'd use
/
in the argument list of all of these functions to ensure that the "verbatim object" field that we add everywhere can only be passed positionally. This is unfortunately only in Python 3.8. To work around that issue, we add a_msg
field everywhere with its name chosen never to clash with an actual kwarg we might use (such asmessage
). Unfortunately,mypy
isn't yet convinced that this is a purely positional argument, and in places that we write**kwargs
it has no way to know that one of the kwargs won't in fact be_msg
, so it complains that the types don't match. In the cases where that happens, we can manually specifyNone
to avoid that problem.