Skip to content

Conversation

qianheng-aws
Copy link
Collaborator

@qianheng-aws qianheng-aws commented Sep 30, 2025

Description

Fallback to sub-aggregation if composite aggregation doesn't support(e.g. auto_span).

As sub-aggregation includes term agg which doesn't include null bucket, so we only fallback to push the aggregation when bucket_nullable=false. Otherwise, the aggregation will fail to push down and executes as a Calcite's Enumerable aggregate operator.

This PR also refact and scale the optimization #4337 for bucket aggregation including single bucket and nested bucket(i.e. sub-agg).

Related Issues

Resolves #4338

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Heng Qian <qianheng@amazon.com>
yuancu
yuancu previously approved these changes Sep 30, 2025
# Conflicts:
#	integ-test/src/test/resources/expectedOutput/calcite/explain_agg_sort_on_metrics2.json
#	integ-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown_bucket_nullable1.json
#	integ-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown_bucket_nullable2.json
Signed-off-by: Heng Qian <qianheng@amazon.com>
yuancu
yuancu previously approved these changes Oct 9, 2025
Comment on lines +44 to +47
final LogicalAggregate aggregate = call.rel(0);
final LogicalFilter filter = call.rel(1);
final LogicalProject project = call.rel(2);
final CalciteLogicalIndexScan scan = call.rel(3);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

        LogicalAggregate
          LogicalProject
            LogicalFilter(condition=[IS NOT NULL($7)])

Shouldn't be following?

      final LogicalAggregate aggregate = call.rel(0);
      final LogicalProject project = call.rel(1);
      final LogicalFilter filter = call.rel(2);
      final CalciteLogicalIndexScan scan = call.rel(3);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original plan is

LogicalAggregate
      LogicalProject
            LogicalFilter(condition=[IS NOT NULL($7)])
                LogicalProject
                  LogicalIndexScan

ProjectFilterTransposeRule and ProjectMergeRule will produce agg-filter-project-scan pattern.

Maybe FilterProjectTransposeRule + ProjectMergeRule may produce agg-project-filter-scan

Copy link
Collaborator Author

@qianheng-aws qianheng-aws Oct 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pattern agg-filter-project-scan will make it easier to detect whether the filter is derived from the agg. The other pattern could work as well while we need to do additional transformation since the ref of field has passed through a project operator.

Config BUCKET_NON_NULL_AGG =
ImmutableOpenSearchAggregateIndexScanRule.Config.builder()
.build()
.withDescription("Agg-Filter-Project-TableScan")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not "Agg-Project-Filter-TableScan"? I think Agg-Project should be treated as a single unit

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But filter(isnotnull) here is also derived from the agg.

return Pair.of(
ImmutableList.copyOf(newMetricBuilder.getAggregatorFactories()),
new CountAsTotalHitsParser(countAggNameList));
List.of(), new CountAsTotalHitsParser(countAggNameAndBuilderPair.getLeft()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you create a named var for countAggNameAndBuilderPair.getLeft()?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

*
* @param metricBuilder metrics builder
* @param countAllOnly remove count() only, or count(FIELD) will be removed.
* @return a pair of removed ValueCountAggregationBuilder and updated metric builder
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

javadoc need to update

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
}

public static class CompositeUnSupportedException extends RuntimeException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move it to package org.opensearch.sql.exception?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This exception only happens here and will be caught internally. It doesn't need to be accessed outside of this file

Signed-off-by: Heng Qian <qianheng@amazon.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
@yuancu yuancu merged commit abd5bd3 into opensearch-project:main Oct 10, 2025
33 checks passed
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.19-dev failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4413-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 abd5bd3337af63892ff8ee762ad3c5902fa87b2a
# Push it to GitHub
git push --set-upstream origin backport/backport-4413-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-dev

Then, create a pull request where the base branch is 2.19-dev and the compare/head branch is backport/backport-4413-to-2.19-dev.

qianheng-aws added a commit to qianheng-aws/sql that referenced this pull request Oct 10, 2025
…pensearch-project#4413)

* Fallback to sub-aggregation if composite aggregation doesn't support

Signed-off-by: Heng Qian <qianheng@amazon.com>

* merging main

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Address comments

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Address comments

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Heng Qian <qianheng@amazon.com>

(cherry picked from commit abd5bd3)
Signed-off-by: Heng Qian <qianheng@amazon.com>
LantaoJin pushed a commit that referenced this pull request Oct 10, 2025
…on doesn't support #4413 (#4498)

* Fallback to sub-aggregation if composite aggregation doesn't support (#4413)

* Fallback to sub-aggregation if composite aggregation doesn't support

Signed-off-by: Heng Qian <qianheng@amazon.com>

* merging main

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Address comments

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Address comments

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Heng Qian <qianheng@amazon.com>

(cherry picked from commit abd5bd3)
Signed-off-by: Heng Qian <qianheng@amazon.com>

* fix compiling

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Heng Qian <qianheng@amazon.com>
@LantaoJin LantaoJin added the backport-manually Filed a PR to backport manually. label Oct 10, 2025
asifabashar pushed a commit to asifabashar/sql that referenced this pull request Oct 16, 2025
…pensearch-project#4413)

* Fallback to sub-aggregation if composite aggregation doesn't support

Signed-off-by: Heng Qian <qianheng@amazon.com>

* merging main

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Address comments

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Address comments

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Heng Qian <qianheng@amazon.com>
Signed-off-by: Asif Bashar <asif.bashar@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

aggregation backport 2.19-dev backport-failed backport-manually Filed a PR to backport manually. calcite calcite migration releated enhancement New feature or request PPL Piped processing language pushdown pushdown related issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Support nested sub-aggregation parser

3 participants