Skip to content

refactor: introduce pinot.broker.multistage.sort.exchange.copy.threshold for sort-exchange copy transformation#17328

Merged
gortiz merged 10 commits intoapache:masterfrom
gortiz:feature/partial-12237-revert
Dec 23, 2025
Merged

refactor: introduce pinot.broker.multistage.sort.exchange.copy.threshold for sort-exchange copy transformation#17328
gortiz merged 10 commits intoapache:masterfrom
gortiz:feature/partial-12237-revert

Conversation

@gortiz
Copy link
Contributor

@gortiz gortiz commented Dec 5, 2025

This PR adds a query option and a broker config to customize the threshold used in the optimization introduced in #12237.

This optimization affects queries like:

SELECT whatever
FROM table
ORDER BY col
LIMIT x

MSE blindly adds an exchange here (I think sometimes it is not needed, but that is not the point of this PR). We need to add a sort-with-limit in the receiver stage to preserve semantics, and we can also add one in the sender stage to avoid sending extra blocks and, eventually, enable k-merge sort.

The PR #12237 modifies the PinotSortExchangeCopyRule rule only to add the sort-with-limit when:

  • There is a limit
  • The limit is smaller than 10k rows

The first is justified by the fact that we don't use k-merge sort, and therefore, the sending side sort is useless.
I don't understand the reason why the <10k rows condition was added. We have found cases where huge limits are used by not adding this filter, we end up sending too many rows that we don't actually need.

Initially, this PR removed the second condition, but in order to be backward compatible, I changed the code to include:

  • A new query option called pinot.broker.multistage.sort.exchange.copy.threshold, which can be used to change the default threshold for all queries in the broker.
  • The query option sortExchangeCopyThreshold, which can be used to change this threshold for specific queries.

The change includes two tests that verify the query option can be used to increase the threshold and also to reduce it. When I created these tests I found that we are not using the default QueryEnvironment in the tests that derive from QueryEnvironmentTestBase.java. This is because they called the default constructor of QueryEnvironment instead of using the one created in MultiStageBrokerRequestHandler. In order to fix that, this PR creates a QueryEnvironmentFactory that contains the code of MultiStageBrokerRequestHandler.getQueryEnvConf.

Notice that this PR only modifies the tests that derive from ResourceBasedQueryPlansTest.testQueryExplainPlansAndQueryPlanConversion. There are other tests that should be upgraded as well, but they are not related to this PR

@codecov-commenter
Copy link

codecov-commenter commented Dec 5, 2025

Codecov Report

❌ Patch coverage is 86.66667% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.27%. Comparing base (227de91) to head (2df1f79).
⚠️ Report is 5 commits behind head on master.

Files with missing lines Patch % Lines
...requesthandler/MultiStageBrokerRequestHandler.java 0.00% 2 Missing ⚠️
...e/pinot/common/utils/config/QueryOptionsUtils.java 66.66% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17328      +/-   ##
============================================
+ Coverage     63.23%   63.27%   +0.04%     
  Complexity     1474     1474              
============================================
  Files          3152     3153       +1     
  Lines        187870   187914      +44     
  Branches      28762    28762              
============================================
+ Hits         118793   118896     +103     
+ Misses        59861    59805      -56     
+ Partials       9216     9213       -3     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 63.22% <86.66%> (+0.01%) ⬆️
java-21 63.24% <86.66%> (+0.04%) ⬆️
temurin 63.27% <86.66%> (+0.04%) ⬆️
unittests 63.26% <86.66%> (+0.04%) ⬆️
unittests1 55.69% <92.85%> (+0.02%) ⬆️
unittests2 33.91% <56.66%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@yashmayya yashmayya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you meant to link to #12237 in the PR description?

The first is justified on the fact that we don't use k-merge sort, and therefore, the sending side sort is useless.
I don't understand the reason why the <10k rows condition was added.

I don't know the history here either, but maybe the reasoning was that beyond some arbitrary threshold (determined heuristically probably), the benefit of reducing the number of rows being sent and globally sorted is outweighed by the cost of duplicated sorting?

@gortiz
Copy link
Contributor Author

gortiz commented Dec 9, 2025

I guess you meant to link to #12237 in the PR description?

You are right. Updated

@gortiz gortiz force-pushed the feature/partial-12237-revert branch from 60983e4 to 3b9062b Compare December 10, 2025 15:13
@gortiz
Copy link
Contributor Author

gortiz commented Dec 10, 2025

@yashmayya I modified this PR. Can you review it again?

@gortiz gortiz force-pushed the feature/partial-12237-revert branch from 3b9062b to 929a4ac Compare December 10, 2025 16:16
Comment on lines -32 to -34
* NOTE: this file was generated using Calcite's code generator, but instead of pulling in all
* the dependencies for codegen we just manually generate it and check it in. If active development
* on this needs to happen, re-generate it using Calcite's generator.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rule is not need it anymore. I don't know why it was decided to not use the generator, but we are actually using in other rules, so doesn't make much sense to have this generated class here

@gortiz gortiz force-pushed the feature/partial-12237-revert branch from 929a4ac to 69b04f2 Compare December 10, 2025 16:19
private ImmutableQueryEnvironment.Config getQueryEnvConf(HttpHeaders httpHeaders, Map<String, String> queryOptions,
long requestId) {
String database = DatabaseUtils.extractDatabaseFromQueryRequest(queryOptions, httpHeaders);
boolean inferPartitionHint = _config.getProperty(CommonConstants.Broker.CONFIG_OF_INFER_PARTITION_HINT,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is moved to pinot-query-planner/src/main/java/org/apache/pinot/query/QueryEnvironmentConfigFactory.java

@gortiz gortiz force-pushed the feature/partial-12237-revert branch from 69b04f2 to cecedaf Compare December 10, 2025 16:25
@yashmayya yashmayya added enhancement multi-stage Related to the multi-stage query engine labels Dec 10, 2025
Copy link
Contributor

@yashmayya yashmayya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there's a compilation issue in here.

  • A new query option called pinot.broker.multistage.sort.exchange.copy.threshold, which can be used to change the default threshold for all queries in the broker.
  • The query option sortExchangeCopyThreshold, which can be used to change this threshold for specific queries.

What is our official recommendation going to be for tuning this threshold?

@gortiz
Copy link
Contributor Author

gortiz commented Dec 15, 2025

What is our official recommendation going to be for tuning this threshold?

I think until we have k-merge implemented, we should disable this optimization. Whether we want to disable it by default or not is something we should discuss

@gortiz
Copy link
Contributor Author

gortiz commented Dec 15, 2025

The compilation error should be fixed now


/// A factory class to create QueryEnvironment.Config instances.
///
/// This includes the ability to cus
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: incomplete

public static ImmutableQueryEnvironment.Config create(
String database, Map<String, String> queryOptions,
long requestId, PinotConfiguration config, TableCache tableCache, WorkerManager workerManager,
@Nullable Set<String> defaultDisabledPlannerRules) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be nullable?

@gortiz gortiz force-pushed the feature/partial-12237-revert branch from f3b87f6 to cb951ed Compare December 22, 2025 13:38
return uncheckedParseInt(QueryOptionKey.REGEX_DICT_SIZE_THRESHOLD, regexDictSizeThreshold);
}

public static int getSortExchangeCopyThreshold(Map<String, String> options, int i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i -> defaultSortExchangeCopyThreshold?

@gortiz gortiz merged commit 6d4df46 into apache:master Dec 23, 2025
17 of 18 checks passed
@gortiz gortiz deleted the feature/partial-12237-revert branch December 23, 2025 10:13
@gortiz gortiz changed the title refactor: remove threshold for sort-exchange copy transformation refactor: introduce pinot.broker.multistage.sort.exchange.copy.threshold for sort-exchange copy transformation Feb 17, 2026
@gortiz gortiz added the release-notes Referenced by PRs that need attention when compiling the next release notes label Feb 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement multi-stage Related to the multi-stage query engine release-notes Referenced by PRs that need attention when compiling the next release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants