[multistage] Multiple Window Group Support#16109
Merged
ankitsultana merged 5 commits intoapache:masterfrom Jul 8, 2025
Merged
Conversation
53efb75 to
f9bb276
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #16109 +/- ##
============================================
+ Coverage 62.90% 63.29% +0.39%
+ Complexity 1386 1365 -21
============================================
Files 2867 2960 +93
Lines 163354 170635 +7281
Branches 24952 26103 +1151
============================================
+ Hits 102755 108008 +5253
- Misses 52847 54431 +1584
- Partials 7752 8196 +444
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
f9bb276 to
f882733
Compare
f882733 to
0301b38
Compare
| .map(operand -> operand.accept(this)) | ||
| .collect(Collectors.toList()); | ||
| return new Window.RexWinAggCall( | ||
| (SqlAggFunction) winCall.getOperator(), |
Collaborator
There was a problem hiding this comment.
nit: Can we have a Preconditions.checkState for clear error handling if the cast is not correct?
Contributor
Author
There was a problem hiding this comment.
The RexWinAggCall currently can only be initialized with SqlAggFunction and the condition to step into this block is that the if (call instanceof Window.RexWinAggCall)
public static class RexWinAggCall extends RexCall {
public final int ordinal;
public final boolean distinct;
public final boolean ignoreNulls;
/** @deprecated */
@Deprecated
public RexWinAggCall(SqlAggFunction aggFun, RelDataType type, List<RexNode> operands, int ordinal, boolean distinct) {
this(aggFun, type, operands, ordinal, distinct, false);
}
public RexWinAggCall(SqlAggFunction aggFun, RelDataType type, List<RexNode> operands, int ordinal, boolean distinct, boolean ignoreNulls) {
super(type, aggFun, operands);
this.ordinal = ordinal;
this.distinct = distinct;
this.ignoreNulls = ignoreNulls;
}
}
23ec998 to
f44a2e9
Compare
pinot-query-planner/src/main/java/org/apache/pinot/calcite/rel/rules/PinotWindowSplitRule.java
Show resolved
Hide resolved
pinot-query-planner/src/test/resources/queries/PhysicalOptimizerPlans.json
Show resolved
Hide resolved
pinot-query-planner/src/test/resources/queries/PhysicalOptimizerPlans.json
Outdated
Show resolved
Hide resolved
ankitsultana
approved these changes
Jul 8, 2025
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Resolve the issue #15985
Background
The Core Problem: Shifting Input Field Counts. A LogicalWindow's expressions (
RexNodes) can refer to two things via RexInputRef: (1) Columns from its input relation (index < inputFieldCount); (2) Literals from its constants list (index >= inputFieldCount).When we chain the windows, the inputFieldCount for each subsequent window increases.
k-th constant is RexInputRef(N + k).N. It addsagg1_countfields. Its output hasN + agg1_countfields.N' = N + agg1_count.If a group in LW2 keep using the original RexInputRef(
N + k), LW2 will interpret it as a reference to the (N + k)-th column of its input. This is now incorrect. It's pointing to one of LW1's aggregate results instead of the intended constant. The reference needs to be shifted to RexInputRef(N' + k).Solution
Correct Row Type construction
cumulativeAggFieldCountvariable correctly tracks our position in the original aggregate field list.Correct RexInputRef pointing to the constant variable
Use a RexShuttle to traverse the expressions in each Window.Group and adjust any RexInputRef that points to a constant. Specially,
visitInputRefis overridden. It checks if an input reference's index points to a constant (index >=originalInputFieldCount). If it does, it creates a new RexInputRef with the index increased by theshiftamount. Otherwise, it returns the original reference. Theshiftamount denotes the number of new fields added by previous windows in the chain.Why groupSet and orderKeys Don't Need Shifting
groupSetandorderKeysRefer To: The integer indices in groupSet (PARTITION BY) and orderKeys (ORDER BY) refer to the columns of the window's direct input relation. Calcite provides mechanisms to fail early or eliminate the constant variables appearing in the window functions'sPartition ByorOrder By. Therefore, these indices will always be < inputFieldCount and will never point to the constants list.Outcome
Dataset: the Quick Start of COLOCATED_JOIN
Test SQL:
Removing
SET usePhysicalOptimizer=true;get the same correct result.Result