-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support CTEs in sub-query reducer #1504
Conversation
@@ -196,7 +199,7 @@ def _is_simple_source(node: SqlSelectStatementNode) -> bool: | |||
if select_column.expr.lineage.contains_aggregate_exprs: | |||
return False | |||
return ( | |||
len(node.parent_nodes) <= 1 | |||
len(node.join_descs) == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a behavior change - were there changes to the snapshots from this change alone?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were no snapshot changes. This is not a behavior change as without CTEs, the number of parent nodes is 1 + the number of joins.
@@ -612,10 +615,11 @@ def visit_select_statement_node(self, node: SqlSelectStatementNode) -> SqlQueryP | |||
# JOIN dim_listings c | |||
# ON a.listing_id = b.listing_id | |||
|
|||
assert len(node_with_reduced_parents.parent_nodes) == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there a reason to remove this assertion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - since CTEs are modeled as a parent node of a SELECT node, this assertion does not hold.
@@ -579,6 +578,13 @@ def _rewrite_node_with_join(node: SqlSelectStatementNode) -> SqlSelectStatementN | |||
select_columns=tuple(clauses_to_rewrite.select_columns), | |||
from_source=from_source, | |||
from_source_alias=from_source_alias, | |||
cte_sources=tuple( | |||
SqlCteNode.create( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this use cte_source.with_new_select
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. with_new_select
got added as a convenience method late in the stack.
Previously, "parent nodes" was used for either nodes in the FROM clause or the JOIN clause. With the addition of CTEs that are also considered parent nodes, this renames a few variables / updates some conditionals for clarity.
This updates the sub-query reducer to support CTEs. The
SELECT
statement in each CTE is treated like aSELECT
, and all CTEs are retained. i.e. a CTE is not reduced with another CTE. The CTEs are retained to make mapping to dataflow plan nodes easier in later PRs.