Fix PartialAggregationOptimizer for multi-partition plans#20719
Fix PartialAggregationOptimizer for multi-partition plans#20719bharath-techie merged 3 commits intoopensearch-project:feature/datafusionfrom
Conversation
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
PR Reviewer Guide 🔍(Review updated until commit 8251612)Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Latest suggestions up to 8251612
Previous suggestionsSuggestions up to commit 06e8379
|
|
❌ Gradle check result for 06e8379: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
PR Code Analyzer ❗AI-powered 'Code-Diff-Analyzer' found issues on commit 306796f.
The table above displays the top 10 most important findings. Pull Requests Author(s): Please update your Pull Request according to the report above. Repository Maintainer(s): You can Thanks. |
|
Persistent review updated to latest commit 8251612 |
47e7fa2
into
opensearch-project:feature/datafusion
|
❌ Gradle check result for 8251612: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Description
Use partial for all aggs - not just approx distinct to make the plan more simple and consistent for all aggs(Keeping back dc check for now)Problem:
With multi-partition setups, DataFusion generates a two-level aggregation plan:
The old optimizer converted the
FinalPartitionedaggregate toPartialmode, creating two stackedPartialaggregates. The top one received intermediate state (e.g. HLL binary registers) from the bottomPartialand tried to interpret it as raw input →Cast error: Casting from Binary to Int64 not supported.Single-partition plans worked because DataFusion skips the repartition/Final layer entirely.
Fix:
Instead of converting
FinalPartitionedtoPartial, simply remove theFinalPartitionednode and return its child. This preserves the repartition/coalesce layers for correct group-by locations while keeping the output in intermediate (partial) form. The Java side already handles merging partial results for all aggs.Resulting plan:
Also generalized the optimizer to apply to all aggregation types (not justapprox_distinct), since partial and single mode produce equivalent intermediate state for all aggregates, and the Java side merges all of them.Existing plan:
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.