Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support n-ary monotonic functions in ordering equivalence #13841

Merged
merged 22 commits into from
Dec 20, 2024

Conversation

gokselk
Copy link
Contributor

@gokselk gokselk commented Dec 19, 2024

Which issue does this PR close?

Closes #13839.

Rationale for this change

DataFusion's ordering equivalence system currently only handles single-parameter monotonic functions. This PR extends support to N-ary functions that preserve lexicographical ordering, like concat(). This enables more optimization opportunities when working with ordered data.

What changes are included in this PR?

  • Added preserves_lex_ordering field to ExprProperties
  • Implemented output_preserves_lex_ordering for concat function
  • Updated discover_new_orderings() to handle multiple parameters
  • Added tests for concat and non-monotonic multiply cases

Are these changes tested?

Yes, added three new test cases:

  • test_ordering_equivalence_with_monotonic_concat
  • test_ordering_equivalence_with_non_monotonic_multiply
  • test_ordering_equivalence_with_concat_equality

Are there any user-facing changes?

No user-facing changes. This is an internal optimization improvement.

@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Physical Expressions functions labels Dec 19, 2024
@gokselk
Copy link
Contributor Author

gokselk commented Dec 19, 2024

cc: @berkaysynnada @ozankabak

@gokselk
Copy link
Contributor Author

gokselk commented Dec 19, 2024

Note about Cargo.toml changes

I added datafusion-functions as a dependency to datafusion/physical-expr/Cargo.toml to use the concat function in tests.

@gokselk gokselk changed the title Support N-Ary Monotonic Functions in Ordering Equivalence Support n-ary monotonic functions in ordering equivalence Dec 19, 2024
@@ -44,6 +44,7 @@ arrow-schema = { workspace = true }
datafusion-common = { workspace = true, default-features = true }
datafusion-expr = { workspace = true }
datafusion-expr-common = { workspace = true }
datafusion-functions = { workspace = true }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since only test requires this, maybe move to dev-dependencies

Copy link
Contributor

@jayzhan211 jayzhan211 Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or maybe move the test in datafusion/core/tests 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved it to dev-dependencies for now

Copy link
Contributor

@ozankabak ozankabak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a naming comment

datafusion/expr/src/udf.rs Outdated Show resolved Hide resolved
datafusion/expr/src/udf.rs Outdated Show resolved Hide resolved
@berkaysynnada
Copy link
Contributor

will be merged once CI passes one more time

@berkaysynnada berkaysynnada merged commit 95d296c into apache:main Dec 20, 2024
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
functions logical-expr Logical plan and expressions physical-expr Physical Expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support n-ary monotonic functions in ordering equivalence
4 participants