Fusing partial aggregation with repartition #12596
Comments
take
I will push this forward after the tasks listed in #11680 (comment) are finished.
@waynexia may also be interested in this?
I'm not clear on how this proposal works. Could you please explain why it provides performance benefits compared to partial aggregation, exchange, and final aggregation? Is the proposal aimed explicitly at accelerating high-cardinality aggregation, or is it intended to enhance aggregation performance in general?
I think it enhances aggregation performance in general?
After using the partitioned approach in
I think our goal is to combine partial + repartition + final into a single operator, and fusing partial + repartition is the first step towards that. After that, we could try fusing the final aggregation step as well.
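For reference, the three steps being combined (partial aggregation, repartition, final aggregation) can be sketched as separate stages. This is only a minimal illustration for a `SUM` over string keys, not DataFusion's implementation: the function names `partition_of`, `partial`, `repartition`, and `final_agg` are hypothetical, and plain `HashMap`s stand in for the real group-by hash tables.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: map a group key to one of `n` output partitions (`n > 0`).
fn partition_of(key: &str, n: usize) -> usize {
    let mut h = DefaultHasher::new();
    key.hash(&mut h);
    (h.finish() as usize) % n
}

/// Step 1, partial aggregation: one hash table per input stream (SUM per key).
fn partial(input: &[(&str, i64)]) -> HashMap<String, i64> {
    let mut table = HashMap::new();
    for (k, v) in input {
        *table.entry(k.to_string()).or_insert(0) += v;
    }
    table
}

/// Step 2, repartition: split every partial result again by key hash.
fn repartition(partials: Vec<HashMap<String, i64>>, n: usize) -> Vec<Vec<(String, i64)>> {
    let mut out = vec![Vec::new(); n];
    for table in partials {
        for (k, v) in table {
            out[partition_of(&k, n)].push((k, v));
        }
    }
    out
}

/// Step 3, final aggregation: merge the partial states within one partition.
fn final_agg(partition: Vec<(String, i64)>) -> HashMap<String, i64> {
    let mut table = HashMap::new();
    for (k, v) in partition {
        *table.entry(k).or_insert(0) += v;
    }
    table
}
```

In this staged form every group is hashed once to build the partial table and then hashed and moved again during the repartition step, which is the extra work a fused operator would try to avoid.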
Yes, it would be attractive to combine them in some way; that seems to give us a chance to do more optimizations.
It seems to be expensive for bytes and strings? Maybe we can pass the internal states directly to
Is your feature request related to a problem or challenge?
I implemented a PoC in #12526 and found that this idea can actually improve performance.
However, for the reasons stated in #11680 (comment), I don't think this improvement is suitable to push forward right now.
I'm filing this issue just to track it.
Describe the solution you'd like
In `partial aggregation`, we partition the data before inserting it into the hash table. We can then run `final aggregation` partition by partition afterwards, rather than splitting the data again in `repartition` and merging it again in `coalesce`.
Describe alternatives you've considered
No response
Additional context
No response
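The proposed solution (partition the data before inserting it into the hash table, then run final aggregation partition by partition) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the PoC in #12526 and not DataFusion's API: `PartitionedPartialAgg` and `final_for_partition` are hypothetical names, the aggregate is a plain `SUM` over string keys, and every operator is assumed to use the same hash function and partition count.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical fused operator state: one hash table per output partition.
struct PartitionedPartialAgg {
    tables: Vec<HashMap<String, i64>>,
}

impl PartitionedPartialAgg {
    /// `num_partitions` must be greater than zero.
    fn new(num_partitions: usize) -> Self {
        Self { tables: vec![HashMap::new(); num_partitions] }
    }

    /// Pick the output partition from the group key's hash.
    /// `DefaultHasher::new()` uses fixed keys, so all operators agree.
    fn partition_of(&self, key: &str) -> usize {
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        (h.finish() as usize) % self.tables.len()
    }

    /// Partition first, then insert into that partition's hash table (SUM).
    fn update(&mut self, key: &str, value: i64) {
        let p = self.partition_of(key);
        *self.tables[p].entry(key.to_string()).or_insert(0) += value;
    }

    /// Emit the partial states, already grouped by output partition.
    fn finish(self) -> Vec<HashMap<String, i64>> {
        self.tables
    }
}

/// Final aggregation for partition `p`: merge table `p` from every fused
/// partial operator, with no separate repartition or coalesce step in between.
fn final_for_partition(partials: &[Vec<HashMap<String, i64>>], p: usize) -> HashMap<String, i64> {
    let mut merged = HashMap::new();
    for tables in partials {
        for (k, v) in &tables[p] {
            *merged.entry(k.clone()).or_insert(0) += *v;
        }
    }
    merged
}
```

Because each key always lands in the same partition index, the final step for partition `p` only needs to read table `p` from each partial operator; the partial output never has to be split by key again.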