Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support Multijoin in the Python client #6020

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jmao-denver
Copy link
Contributor

Fixes #5884

Comment on lines 780 to 782
table: Table
on: Union[str, Sequence[str]]
joins: Union[str, Sequence[str]] = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not present on the server MultiJoinInput. Should they be made private? Should they be added to the server side?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Server side is a wrapper around a Java MJI. I'm ambivalent about whether these are user-facing, but they should not be mutable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still an incompatibility in the API.

@@ -734,3 +737,60 @@ def plugin_client(self, exportable_obj: ticket_pb2.TypedTicket) -> PluginClient:

Part of the experimental plugin API."""
return PluginClient(self, exportable_obj)

def multi_join(self, input: Union[Table, Sequence[Table], MultiJoinInput, Sequence[MultiJoinInput]],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users have very very strongly wanted the server and client APIs to be the same. This has a number of deviations compared to the server that I can see at a glance.

  1. multi_join is a method on session instead of just being a function.
  2. multi_join is in a different submodule (session instead of table).
  3. The return type is Table instead of MultiJoinTable.

How similar can the APIs be?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other clients have taken the approach of returning Table for now, instead of MultiJoinTable. MJT is more of a future-looking API, and given that its potential (adding more joins to an existing MJ and thus getting out a new Table) hasn't been realized yet there's no real benefit to building an object type plugin for MJT.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved the new method to table.py as a function and have also relocated the MultiJoinInput class there as well.

@chipkent I totally understand the desire to have the interface be the same for the two APIs, but the client isn't limited to working with just one server and making a 'session' level function a method on the Session object has the benefit of clear isolation/readability etc. So this change has a trade-off which I am quite ambivalent about.

py/client/pydeephaven/session.py Outdated Show resolved Hide resolved
Copy link
Member

@rcaudy rcaudy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've looked at the current state of the PR at a high level.

Comment on lines +833 to +834
def multi_join(input: Union[Table, Sequence[Table], MultiJoinInput, Sequence[MultiJoinInput]],
on: Union[str, Sequence[str]] = None) -> Table:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type here is Table where the deephaven method returns MultiJoinTable. MultiJoinTable does not inherit from Table and has a different API. Users of MultiJoinTable must call table() to get a table to use. I think I complained about this during that review. Without adding a MultiJoinTable on the client side, the APIs are incompatible.
image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 780 to 782
table: Table
on: Union[str, Sequence[str]]
joins: Union[str, Sequence[str]] = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still an incompatibility in the API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Python Client MultiJoin
3 participants