-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC-0005. Implementation Scalar function stats propagation, Phase 1 #23545
base: master
Are you sure you want to change the base?
Conversation
1504ced
to
c358d95
Compare
|
Thank you for the comment @steveburnett , do you have a suggestion on documenting this feature in full detail e.g. developer docs. |
de5982e
to
49a20e0
Compare
49a20e0
to
495734f
Compare
Hi @ZacBlanco , can you please take a look! |
Yes, thanks! If these properties are relevant to Presto, please add entries for them to https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/admin/properties.rst. If these are Presto C++ (Prestissimo), please add the entries to https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/presto_cpp/features.rst#session-properties and https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/presto_cpp/properties.rst as appropriate. |
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsCalculator.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsCalculator.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsCalculator.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsCalculator.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsCalculator.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsCalculator.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/operator/scalar/ScalarHeader.java
Outdated
Show resolved
Hide resolved
...c/main/java/com/facebook/presto/operator/scalar/annotations/ScalarFromAnnotationsParser.java
Outdated
Show resolved
Hide resolved
...c/main/java/com/facebook/presto/operator/scalar/annotations/ScalarFromAnnotationsParser.java
Outdated
Show resolved
Hide resolved
presto-spi/src/main/java/com/facebook/presto/spi/function/Signature.java
Outdated
Show resolved
Hide resolved
babde3b
to
2ee5cd4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for adding the documentation! A few nits of formatting, and a couple of revisions for consistency with the existing documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick response! Two final nits that I should have caught in my first pass, and I think this should be everything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! (docs)
Pull updated branch, reviewed new local doc build, looks good. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One nit of formatting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! (docs)
Pull updated branch, new local doc build, looks good. Thanks!
9e1a403
to
ad6e77d
Compare
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsAnnotationProcessor.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsAnnotationProcessor.java
Outdated
Show resolved
Hide resolved
ad6e77d
to
a9f04a8
Compare
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsAnnotationProcessor.java
Outdated
Show resolved
Hide resolved
a9f04a8
to
99302d3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few minor things, otherwise lgtm
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsAnnotationProcessor.java
Outdated
Show resolved
Hide resolved
...c/main/java/com/facebook/presto/operator/scalar/annotations/ScalarFromAnnotationsParser.java
Show resolved
Hide resolved
99302d3
to
9d77026
Compare
Hi @elharo is it good to go? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not sold on using doubles for row counts and the like. Doubles are not just a bigger long and have round off and representation problems integers don't. If these things are really going to exceed the size of a long (which I'm not sure they do) then a BigInteger would be preferred.
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsAnnotationProcessor.java
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsAnnotationProcessor.java
Show resolved
Hide resolved
BigInteger does not exist in java natively. May be this is a decision that presto project has made much before this PR came in. Will it be a good idea, if you can start a mailing group thread with broader audience? |
BigInteger isn't a primitive type, but java.math.BigInteger is available if you need it. Again, I'm not convinced you do. long should work here. |
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsCalculator.java
Outdated
Show resolved
Hide resolved
presto-main/src/test/java/com/facebook/presto/cost/TestScalarStatsCalculator.java
Show resolved
Hide resolved
e248c01
to
40b6f47
Compare
Rebased with master |
9551c16
to
7693879
Compare
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsCalculator.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsCalculator.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsAnnotationProcessor.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/cost/ScalarStatsAnnotationProcessor.java
Outdated
Show resolved
Hide resolved
presto-spi/src/main/java/com/facebook/presto/spi/function/ScalarStatsHeader.java
Outdated
Show resolved
Hide resolved
presto-spi/src/main/java/com/facebook/presto/spi/function/ScalarStatsHeader.java
Outdated
Show resolved
Hide resolved
presto-spi/src/main/java/com/facebook/presto/spi/function/ScalarFunctionConstantStats.java
Outdated
Show resolved
Hide resolved
presto-spi/src/main/java/com/facebook/presto/spi/function/ScalarFunctionConstantStats.java
Outdated
Show resolved
Hide resolved
...-main/src/main/java/com/facebook/presto/metadata/BuiltInTypeAndFunctionNamespaceManager.java
Outdated
Show resolved
Hide resolved
1. Support for annotating functions with both constant stats and propagating source stats. 2. Added tests for the same. 3. Added Scalar stats calculation based on annotation and tests for the same. Not added SQLInvokedScalarFunctions. Not annotated builtin functions, as that is covered in next implementation phase. Not added C++ changes as this phase only covers Java side of changes. Added documentation for the new properties and ... 1. Previously, if any of the source stats were missing, we would still compute the max/min/sum of argument stats etc.. now we propagate NaNs if any one of the arguments' stats are missing. 2. For distinct values count, upper bounding it to row count is as good as unknown. Therefore, the approach here is, when distinctValuesCount is greater than row count and is provided via annotation we set it to unknown. A function developer has full control here, for example developer can choose to upper bound or not by selecting the appropriate StatsPropagationBehavior value. 3. For average row size, a) If average row size is provided via ScalarFunctionConstantStats annotation, then we allow even if the size is greater than functions return type width. b) If average row size is provided via one of the StatsPropagationBehavior values, then we upper bound it to functions return type width - if available. If both (a) and (b) is unknown, then we default it to functions return type width if available. This way the function developer has greater control. Added new behaviour SUM_ARGUMENTS_UPPER_BOUND_ROW_COUNT which would upper bound the values to row count, so that summing distinct values count not exceed row counts.
c334fdb
to
07d2d0c
Compare
Description
Not added SQLInvokedScalarFunctions.
Not annotated builtin functions, as that is covered in next implementation phase. Not added C++ changes as this phase only covers Java side of changes.
Motivation and Context
https://github.com/prestodb/rfcs/blob/main/RFC-0005-functions-stats.md
Impact
None unless the user chooses to enable the feature via setting the session/feature flag.
A new session flag,
scalar_function_stats_propagation_enabled
and a new feature config will be introduced i.e.optimizer.scalar-function-stats-propagation-enabled
, by setting this session flag or feature flag, this feature can be turned on or off.Test Plan
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.