Skip to content

[multistage] Enable Tests for New Optimizer + Bug Fixes#15958

Merged
itschrispeck merged 3 commits intoapache:masterfrom
ankitsultana:mse-physical-tests
Jun 16, 2025
Merged

[multistage] Enable Tests for New Optimizer + Bug Fixes#15958
itschrispeck merged 3 commits intoapache:masterfrom
ankitsultana:mse-physical-tests

Conversation

@ankitsultana
Copy link
Contributor

@ankitsultana ankitsultana commented May 31, 2025

Summary

Adds the required fixes to make all tests in ResourceBasedQueriesTest pass with the new optimizer (both with and without lite mode).

Bugs Fixed / Gaps Addressed

  • Ensures Window groups are converted to RexLiteral. Logic is same as PinotWindowExchangeNodeInsertRule, and I have abstracted out the common code to a utils class. It's not a super clean approach code-style wise and I am open to feedback here.
  • Added support for As Of Join.
  • Fixed some trait assignment code.
  • Fixes AggregatePushdownRule to allow it to find Projects under Exchange. This was causing failure of some complex agg functions due to incorrect agg call generation.

Test Plan

Added E2E tests in ResourceBasedQueriesTest.

@ankitsultana ankitsultana added mse-physical-optimizer multi-stage Related to the multi-stage query engine labels May 31, 2025
@ankitsultana ankitsultana changed the title [do-not-merge] [multistage] Enable Tests for New Optimizer + Bug Fixes [multistage] Enable Tests for New Optimizer + Bug Fixes Jun 10, 2025
return funcSqlKind == SqlKind.OTHER_FUNCTION ? function.getOperator().getName() : funcSqlKind.name();
}

public static class WindowUtils {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: lifted as is from the window exchange rule

@ankitsultana ankitsultana marked this pull request as ready for review June 10, 2025 20:11
* NOTE: {@link Window} has a field called "constants" which contains the literal values. If the input reference is
* beyond the window input size, it is a reference to the constants.
*/
public static Window.Group updateLiteralArgumentsInWindowGroup(Window window) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add more comments about how does RexInputRef(x) find the element in constant list with more details? That's what I get by using copilot.

1. Understanding the Inputs

When Calcite constructs a Window relational node, it conceptually sees the input as:

  • All columns from the input relation (e.g., your table columns)
  • Followed by any literal constants needed by window functions (these are placed in the constants list in the Window node)

So, if your input row has N columns, and your window function needs K constants, the total "input" to the Window node is N + K.


2. How RexInputRef Works in This Context

  • RexInputRef(i) refers to the i-th field in the Window node's input.
  • If i < N, it points to the i-th field of the original input row (e.g., a table column).
  • If i >= N, it points to the (i - N)-th entry in the constants list.

Example:

Suppose your table has 2 columns: employee_id, salary.
Suppose your window function is:

SQL
LAG(salary, 2, 0) OVER (PARTITION BY dept ORDER BY hire_date)
  • The constants needed are 2 and 0, so constants = [2, 0].

Input to the Window node:

employee_id | salary | 2 | 0 -- | -- | -- | -- ... | ... | 2 | 0

So:

  • RexInputRef(0)  employee_id
  • RexInputRef(1)  salary
  • RexInputRef(2)  constants[0] = 2
  • RexInputRef(3)  constants[1] = 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is lifted as is from PinotWindowExchangeNodeInsertRule. GPT response is largely right: rex input ref is an integer reference to the input, but in window group it could exceed the field count of the input, in which case it is actually a reference to window.constants.

}
RexWindowBound upperBound = oldWindowGroup.upperBound;
offset = upperBound.getOffset();
if (offset != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment on when the offset is null; unbounded preceding / unbounded following / current row

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lifted as is from PinotWindowExchangeNodeInsertRule. I suppose offset is not null when functions like Lead and Lag are used.

&& !upperBound.isUnbounded());

if (!windowGroup.isRows) {
Preconditions.checkState(!hasOffset, "RANGE window frame with offset PRECEDING / FOLLOWING is not supported");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://docs.pinot.apache.org/windows-functions The range frame clause with offset is claimed to be supported in the doc. Do I miss the content?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my bad.

@codecov-commenter
Copy link

codecov-commenter commented Jun 16, 2025

Codecov Report

❌ Patch coverage is 89.06250% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.18%. Comparing base (1a476de) to head (1a2ae28).
⚠️ Report is 1291 commits behind head on master.

Files with missing lines Patch % Lines
...apache/pinot/calcite/rel/rules/PinotRuleUtils.java 87.83% 5 Missing and 4 partials ⚠️
...y/planner/physical/v2/PRelToPlanNodeConverter.java 77.77% 0 Missing and 4 partials ⚠️
.../query/planner/physical/v2/RelToPRelConverter.java 90.90% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #15958      +/-   ##
============================================
+ Coverage     62.90%   63.18%   +0.28%     
+ Complexity     1386     1355      -31     
============================================
  Files          2867     2950      +83     
  Lines        163354   169370    +6016     
  Branches      24952    25896     +944     
============================================
+ Hits         102755   107016    +4261     
- Misses        52847    54223    +1376     
- Partials       7752     8131     +379     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 ?
java-11 63.16% <89.06%> (+0.29%) ⬆️
java-21 63.15% <89.06%> (+0.33%) ⬆️
skip-bytebuffers-false ?
skip-bytebuffers-true ?
temurin 63.18% <89.06%> (+0.28%) ⬆️
unittests 63.18% <89.06%> (+0.28%) ⬆️
unittests1 56.33% <89.06%> (+0.51%) ⬆️
unittests2 33.29% <0.00%> (-0.29%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@itschrispeck itschrispeck merged commit 7dbbc09 into apache:master Jun 16, 2025
32 of 36 checks passed
songwdfu pushed a commit to songwdfu/pinot that referenced this pull request Jun 18, 2025
* [multistage] Enable Tests for New Optimizer + Bug Fixes

* all tests pass

* fix checkstyle
@Jackie-Jiang
Copy link
Contributor

Seems the new added tests are flaky. @ankitsultana Can you please take a look?
Example run: https://github.com/apache/pinot/actions/runs/15924989688/job/44920115015?pr=16216

@ankitsultana
Copy link
Contributor Author

Looking into it and created #16223

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

mse-physical-optimizer multi-stage Related to the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants