compute: simplify mz_join_core result computation #19667
Agreed from my side, thanks!
On request from @antiguru I opened a backport PR for DD as well: TimelyDataflow/differential-dataflow#392
Previously, the `mz_join_core` code wrapped the `result` closure in another `work_result` closure to deal with the fact that matches from the `todo1` list had their value and diff fields swapped. Instead of using a closure to unswap the fields, we can simply pass them to the `Deferred` constructor in the correct order. That is, the `Deferred` constructor should always receive the cursor/storage from input 1 first and from input 2 second. With this change, no `work_result` closure is necessary and the code becomes easier to reason about.
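To illustrate the idea, here is a minimal sketch and not the actual `mz_join_core` code. The `Deferred` struct below is a toy stand-in: its fields, the `Vec<(String, i64)>` representation of values and diffs, and the `join_example` helper are all assumptions made for illustration. It only shows that when the constructor always receives input 1 first and input 2 second, the same `result` closure can be used for both work lists without an unswapping wrapper.

```rust
/// Toy stand-in for deferred join work (hypothetical, not the real type).
/// Each side is a list of (value, diff) pairs instead of a cursor/storage.
struct Deferred {
    input1: Vec<(String, i64)>,
    input2: Vec<(String, i64)>,
}

impl Deferred {
    /// Always takes the data from input 1 first and from input 2 second,
    /// regardless of which side is the new batch and which is the trace.
    fn new(input1: Vec<(String, i64)>, input2: Vec<(String, i64)>) -> Self {
        Deferred { input1, input2 }
    }

    /// Applies `result` to every pair of values, in (input 1, input 2)
    /// order, and multiplies the diffs. No unswapping wrapper is needed.
    fn work<R>(&self, mut result: impl FnMut(&str, &str) -> R) -> Vec<(R, i64)> {
        let mut out = Vec::new();
        for (v1, d1) in &self.input1 {
            for (v2, d2) in &self.input2 {
                out.push((result(v1, v2), d1 * d2));
            }
        }
        out
    }
}

/// Hypothetical driver: one unit of work per direction, same `result` closure.
fn join_example() -> Vec<(String, i64)> {
    let result = |v1: &str, v2: &str| format!("{}-{}", v1, v2);

    // New batch from input 1, matched against the trace of input 2.
    let batch1 = vec![("a".to_string(), 1)];
    let trace2 = vec![("x".to_string(), 2)];
    let todo1 = Deferred::new(batch1, trace2);

    // New batch from input 2, matched against the trace of input 1.
    let trace1 = vec![("b".to_string(), 1)];
    let batch2 = vec![("y".to_string(), -1)];
    let todo2 = Deferred::new(trace1, batch2);

    let mut out = todo1.work(result);
    out.extend(todo2.work(result));
    out
}
```

In this sketch, both `todo1` and `todo2` hand their values to `result` in (input 1, input 2) order, so the output of `join_example()` always has the input 1 value on the left.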
Thanks, I think this mostly looks good. One thing I'm not 100% sure about is renaming trace/batch to cursor 1/2. Can you confirm that they're used symmetrically, i.e., we could swap the arguments and produce the same result?
Here are the things that change for new batches received from input 1 (nothing changes for batches from input 2):
Previously, the `mz_join_core` code wrapped the `result` closure in another `work_result` closure to deal with the fact that matches from the `todo1` list had their value and diff fields swapped.
Instead of using a closure to unswap the fields, we can simply pass them to the `Deferred` constructor in the correct order. That is, the `Deferred` constructor should always receive the cursor/storage from input 1 first and from input 2 second. With this change, no `work_result` closure is necessary and the code becomes easier to reason about.
Motivation
This is something I stumbled over while working on support for a fueled merge join strategy.
Tips for reviewer
Part of this PR is renaming the fields of `Deferred` to reflect that (first, second) shouldn't be (trace, batch) but (input 1, input 2). The rest of the PR removes the `work_result` wrapper closure by inlining its logic into `Deferred::work`.
Checklist
`$T ⇔ Proto$T` mapping (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label.