Simplify dataflow to SQL logic for JoinOverTimeRangeNode
#1540
base: court/simp1
There should be no functional changes in this commit, only cleanup and readability improvements. Mostly involves moving complex logic to helper functions.
@dataclass(frozen=True)
class AnnotatedSqlDataSet:
This class should be unchanged. I needed to move it to resolve circular imports.
@@ -1390,6 +1361,21 @@ def visit_semi_additive_join_node(self, node: SemiAdditiveJoinNode) -> SqlDataSe
),
)

def _choose_instance_for_time_spine_join(
This helper function is only used once at this point, but it will be used again for the JoinToTimeSpineNode farther up the stack.
@@ -803,6 +803,7 @@ def transform(self, instance_set: InstanceSet) -> SelectColumnSet:  # noqa: D102
)


# TODO: delete this class & all uses. It doesn't do anything.
This class is supposed to change the column names, but if the specs didn't change, then the column names shouldn't either, so it seems like it doesn't do anything. LMK if I'm overlooking something here!
This used to be needed, but maybe things have changed. There were cases where nodes output data sets in which the column name in the generated SQL did not match the defined format. I'm trying to remember what those cases were, but there was a bug fix that required this transform.
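To make the question in this thread concrete, here is a toy illustration. It is not MetricFlow's actual code: `Spec`, `column_name_for`, `rename_to_match_specs`, and the `__`-joined naming format are all hypothetical stand-ins. It assumes column names are a pure function of the spec, and shows that re-deriving names is then a no-op unless some upstream node emitted a column whose name did not follow the format.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Spec:
    """Hypothetical, simplified spec; not the real MetricFlow class."""

    element_name: str
    entity_links: Tuple[str, ...] = ()


def column_name_for(spec: Spec) -> str:
    # Assumed naming format: entity links and element name joined by "__".
    return "__".join((*spec.entity_links, spec.element_name))


def rename_to_match_specs(columns: Dict[Spec, str]) -> Dict[Spec, str]:
    """Re-derive every column name from its spec, ignoring the current name."""
    return {spec: column_name_for(spec) for spec in columns}


booking_ds = Spec("ds", ("booking",))

# Names already follow the format, so the transform changes nothing.
consistent = {booking_ds: "booking__ds"}
unchanged = rename_to_match_specs(consistent)

# An upstream node emitted a name that does not follow the format,
# so the transform has something to repair.
mismatched = {booking_ds: "ds"}
repaired = rename_to_match_specs(mismatched)
```

Under these assumptions, whether the class can be deleted comes down to whether any node can still produce the second case.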
Left some comments, but LGTM.
"No appropriate agg_time_dimension was found to join to the time spine. "
"This indicates that the dataflow plan was configured incorrectly."
)
agg_time_dimension_instances.sort(key=lambda instance: instance.spec.time_granularity.base_granularity.to_int())
Not related to this PR, but we probably should just make the TimeGranularity enum orderable in DSI at some point since this operation has come up a few times.
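As a rough sketch of what an orderable enum could look like: `OrderableGranularity` and `Instance` below are hypothetical stand-ins, not the DSI `TimeGranularity` definition, and the member list and ranking by definition order are assumptions. With `functools.total_ordering`, defining `__lt__` is enough for `sort`, `min`, and the other comparisons to work without a `to_int()` key.

```python
from dataclasses import dataclass
from enum import Enum
from functools import total_ordering


@total_ordering
class OrderableGranularity(Enum):
    """Hypothetical stand-in for TimeGranularity; members are illustrative."""

    DAY = "day"
    WEEK = "week"
    MONTH = "month"
    QUARTER = "quarter"
    YEAR = "year"

    def to_int(self) -> int:
        # Assumption: definition order doubles as the sort rank.
        return list(type(self)).index(self)

    def __lt__(self, other: object) -> bool:
        if not isinstance(other, OrderableGranularity):
            return NotImplemented
        return self.to_int() < other.to_int()


@dataclass(frozen=True)
class Instance:
    """Hypothetical simplified instance carrying only a name and a granularity."""

    name: str
    granularity: OrderableGranularity


instances = [
    Instance("metric_time__month", OrderableGranularity.MONTH),
    Instance("metric_time__day", OrderableGranularity.DAY),
    Instance("metric_time__year", OrderableGranularity.YEAR),
]

# The enum is compared directly; no to_int() call is needed in the key.
by_granularity = sorted(instances, key=lambda instance: instance.granularity)
smallest = min(instances, key=lambda instance: instance.granularity)
```

Call sites like the sort above could then use the granularity itself as the key, if DSI adopted something along these lines.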
Forgot to approve earlier.