-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(client): upload dataset items to data service via http using client #642
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few items that need addressing, and a couple additional, more minor comments/questions.
async def data_service_action(self, action: str, **kwargs) -> ResultIndicator: | ||
async def data_service_action( | ||
self, | ||
action: Literal["create", "delete", "upload", "download", "list", "items"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First and foremost, the relatively new state
option is missing.
More generally, I don't particularly like the use of Literal
here, for exactly the type of thing that's happened, though I may not have full appreciation for the benefits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch. I don't know how I missed state
.
More generally, I don't particularly like the use of
Literal
here, for exactly the type of thing that's happened, though I may not have full appreciation for the benefits.
The goal of using Literal
here is to improve the developer experience when using this api. I think this is especially important when developing client libraries. By using Literal
here a language server will populate the available options for the action
parameter in the developer's IDE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just addressed this.
@@ -225,7 +229,17 @@ def data_service_client(self) -> DataServiceClient: | |||
if self.client_config.data_service is not None and self.client_config.data_service.active: | |||
t_client_type = determine_transport_client_type(self.client_config.data_service.endpoint_protocol) | |||
t_client = t_client_type(**self.client_config.data_service.dict()) | |||
self._data_service_client = DataServiceClient(t_client, self._auth_client) | |||
# use http client where applicable when bypassing request service | |||
if self._bypass_request_service: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably going to need some modifications. As is, we won't get the extra HTTP client without the user manually specifying an extra CLI arg, and configuring the data service details will have the side effect of bypassing the request service for all data service communication. I expected that we wouldn't (necessarily) want to bypass the request server except for the actual data transfer.
This _bypass_request_service
attribute (and related CLI option) is a legacy thing that just never got taken out. It's False
unless the user specifically does something extra at runtime.
What replaced that legacy behavior was the CLI client moving to using a configuration class, and instead bypassing the request service (for data service actions) based on whether a self.client_config.data_service
value was available. So if the ClientConfig has a data_service
value, use that ConnectionConfig when lazily creating the data_service_client
property. Otherwise, create data_service_client
using the "regular" transport client, which was created from ClientConfig.request_service
.
Depending on how we want to go about things, we'll at least need to change where and how the conditional is used, and possibly more than that if we want to continue to support full bypassing of the request service (though maybe we don't).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably going to need some modifications. As is, we won't get the extra HTTP client without the user manually specifying an extra CLI arg, and configuring the data service details will have the side effect of bypassing the request service for all data service communication. I expected that we wouldn't (necessarily) want to bypass the request server except for the actual data transfer.
I should have left a note in the code here why I chose to go this way. My thinking was this was the most sensible default for the moment given where we are in the development lifecycle. Namely, in our deployment architecture with the request
service acting as the main entry point to the deployment, at this point in time, we don't have a way to reach the http
data service endpoints through the request
service or via proxy. So, the only way to reach the http
data service endpoints are by directly reaching out to the data service.
Once this is no longer the case, we will need to modify this section of the client code to remove this condition. Thoughts on that?
#338 is also related.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The part that I am having trouble with is the de facto bypassing for any data service communication. A necessary (but not sufficient) condition for HTTP uploads produces side effects for more than just uploads. To work around them, we'd have to go back and edit the config (or manually toggle between configs).
I don't think we want this, and I think it should be fairly easy to avoid. Any reason not to also make the outer condition dependent on self._bypass_request_service
? Or, perhaps more generally, why not (at least for now) change the request service bypassing functionality to once again require an explicit CLI arg, and also make use of HTTP uploads dependent on having bypassed the request service?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you are saying. I will just remove the check conditional check on self._bypass_request_service
and use the http client if the configuration file is configured to bypass the request service. Sorry, not enough coffee on a Friday, I guess 😅!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just addressed this
transport_client: TransportLayerClient, | ||
auth_client: Optional[AuthClient] = None, | ||
*args, | ||
http_client: Optional[aiohttp.ClientSession] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any particular reason this was ordered this way with respect to *args
and **kwargs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, great question! This forces http_client
to be provided as a keyword argument. If there were to be a subclass of DataServiceClient
that added arguments to it's __init__
method, adding the http_client
as a keyword or positional argument is a breaking change. Subclasses that add a positional / keyword argument to their __init__
method could be now inadvertently mis-initialized.
90cf341
to
1620e66
Compare
dmod.client
client will upload dataset items to data service via http (instead of websocket connection) if the client is configured to bypass the request service. In the future this option will become more explicit when it is possible to upload to the data service without needing to bypass the request service.closes #599
Changes
dmod.client
client will upload dataset items to data service via http (instead of websocket connection) if the client is configured to bypass the request service.Note
I decided to take a very direct approach in implementing this feature (e.g. existing abstractions were not expanded / implemented). In discussing how we might implement this feature with @robertbartel, it became clear that the
TransportLayerClient
abstraction is not well suited to handle the request-response paradigm. Additionally, bloating that base class with methods that would enable a request-response paradigm is not desirable. It seems clear that careful thought is required to support communication via multiple paradigms. For the time being, a concrete approach will suffice until a more general approach is required.