Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify JoinToTimeSpineNode #1542

Open
wants to merge 5 commits into
base: court/simp3
Choose a base branch
from
Open

Conversation

courtneyholcomb
Copy link
Contributor

@courtneyholcomb courtneyholcomb commented Nov 21, 2024

A bunch of cleanup in the dataflow to SQL logic for the JoinToTimeSpineNode. There was a lot of duplication and hard to read code here.

Notes:

  • Please review by commit. The more complex commits have extended commit descriptions to help explain the changes.
  • There are many SQL changes here but none of them functional changes. The changes to optimized snapshots might be best to focus on.

@@ -322,54 +323,54 @@ def _make_time_spine_data_set(
"""
time_spine_table_alias = self._next_unique_table_alias()

agg_time_dimension_specs = {instance.spec for instance in agg_time_dimension_instances}
queried_specs = {instance.spec for instance in agg_time_dimension_instances}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to differentiate more clearly from required_specs.

@courtneyholcomb courtneyholcomb force-pushed the court/simp4 branch 2 times, most recently from d0996f3 to b4e4e9c Compare November 21, 2024 15:28
@@ -19,8 +19,8 @@ FROM (
FROM (
-- Join to Time Spine Dataset
SELECT
DATE_TRUNC('month', subq_12.ds) AS metric_time__month
, subq_10.metric_time__day AS metric_time__day
subq_12.ds AS metric_time__day
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you can see we are now pulling the join column from the time spine, where in the last PR we were not.

The biggest change in this commit is to remove all logic related to building select columns from visit_join_to_time_spine_node(). This logic was duplicated in _make_time_spine_data_set().
That logic is now consolidated to _make_time_spine_data_set(). This is laying the foundation for a change coming up the stack, which will replace _make_time_spine_data_set() with its own
node in the DataflowPlan. There is also some other cleanup in this commit because to make visit_join_to_time_spine_node() simpler and more readable.
This sorting was not commprehensive and resulted in inconsistent snapshots. Removed the custom sorting here and retained the original order, instead.
@courtneyholcomb courtneyholcomb marked this pull request as ready for review November 21, 2024 15:40
@courtneyholcomb courtneyholcomb added the Run Tests With Other SQL Engines Runs the test suite against the SQL engines in our target environment label Nov 21, 2024
@github-actions github-actions bot removed the Run Tests With Other SQL Engines Runs the test suite against the SQL engines in our target environment label Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant