Conversation

songkant-aws
Contributor

@songkant-aws songkant-aws commented Aug 25, 2025

Description

Implement the Append command in the V3 Calcite engine. The Append command is quite similar to the standard SQL UNION ALL operator.
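
To illustrate the UNION ALL comparison, here is a minimal sketch in plain Java, not the PR's Calcite code. `AppendUnionAllSketch` and its `append` method are hypothetical names, and rows are modeled as simple strings: the subsearch rows are concatenated after the main rows and duplicates are kept. The row ordering is a choice of this sketch; SQL UNION ALL itself does not guarantee one.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: models Append as UNION ALL over in-memory rows. */
final class AppendUnionAllSketch {

  /** Returns the main rows followed by the subsearch rows; duplicates are kept. */
  static <T> List<T> append(List<T> mainRows, List<T> subsearchRows) {
    List<T> result = new ArrayList<>(mainRows);
    result.addAll(subsearchRows);
    return result;
  }

  public static void main(String[] args) {
    List<String> main = List.of("alice", "bob");
    List<String> sub = List.of("bob", "carol");
    // "bob" appears twice because nothing is de-duplicated.
    System.out.println(append(main, sub)); // prints [alice, bob, bob, carol]
  }
}
```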

Related Issues

Resolves #4078

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Songkan Tang <songkant@amazon.com>
Member

@LantaoJin LantaoJin left a comment

Can you add a test in ExplainIT and CrossClusterSearchIT?


private final UnresolvedPlan subSearch;

private UnresolvedPlan child;
Member

What's the difference between searchPlan and child?

Contributor Author

@songkant-aws songkant-aws Aug 29, 2025

The child is the main query, whereas searchPlan is the optional leading searchCommand node of the subsearch.

It exists to match an optional searchCommand at the head of a subsearch for the append or appendcols command.

For example, in the future this could be a valid query: search | append [ | inputlookup myexcel | fields myfield ]
In that query, the searchCommand between the square brackets is optional because inputlookup is a command similar to a TableFunction. Users are free to choose whether the data comes from a search or from local input. When the searchCommand is empty, searchPlan is parsed as a 0-row, 0-column LogicalValues[[]] RelNode.

Another reason is that searchPlan handles the edge case of appending empty subresults. Some other pipeline languages have similar functionality that allows a subsearch to output an empty result. For example, search | append [ ] and search | append [ | fields a, b, c ] are the same because the subsearch starts with a 0-row, 0-column input. The syntax for appending an empty result is legitimate, and the result is equivalent to the main query.
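
The intended behavior for an empty subsearch can be sketched in plain Java as follows. This is only an in-memory model, not the PR's code: `Table`, `EMPTY_SOURCE`, `row`, and `append` are hypothetical names, and the requirement that both sides share identical columns is a simplification of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: hypothetical in-memory model of appending an empty subsearch. */
final class EmptySubsearchSketch {

  /** A tiny table: column names plus rows of values. */
  static final class Table {
    final List<String> columns;
    final List<List<Object>> rows;

    Table(List<String> columns, List<List<Object>> rows) {
      this.columns = columns;
      this.rows = rows;
    }

    /** True for the 0-row, 0-column input that an empty subsearch starts from. */
    boolean isEmptySource() {
      return columns.isEmpty() && rows.isEmpty();
    }
  }

  /** Stands in for the 0-row, 0-column input of `append [ ]`. */
  static final Table EMPTY_SOURCE = new Table(List.of(), List.of());

  /** Convenience helper for building one row. */
  static List<Object> row(Object... values) {
    return Arrays.asList(values);
  }

  /** Appends subsearch rows to the main rows; an empty source leaves main unchanged. */
  static Table append(Table main, Table subsearch) {
    if (subsearch.isEmptySource()) {
      return main;
    }
    if (!main.columns.equals(subsearch.columns)) {
      throw new IllegalArgumentException("this sketch only appends tables with identical columns");
    }
    List<List<Object>> rows = new ArrayList<>(main.rows);
    rows.addAll(subsearch.rows);
    return new Table(main.columns, rows);
  }

  public static void main(String[] args) {
    Table main = new Table(List.of("id", "name"), List.of(row(1, "alice"), row(2, "bob")));
    Table result = append(main, EMPTY_SOURCE);
    System.out.println(result == main); // the main query result is returned as-is
  }
}
```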

Calcite enforces a strict schema, so parsing a RelNode like

Project(a = [$0], b = [$1], ..)
    LogicalValues[[]]

will throw an exception either when building the RelNode or at runtime, because Project cannot find a column 'a' in its input.
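
As a rough analogy in plain Java (this does not call Calcite, and the names `StrictProjectSketch` and `resolve` are hypothetical), a projection that resolves field names against its input schema has nothing to bind to when the input has zero columns:

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a schema-checked projection that fails fast on unknown fields. */
final class StrictProjectSketch {

  /** Maps each projected field name to its index in the input schema, or throws. */
  static List<Integer> resolve(List<String> inputColumns, List<String> projectedFields) {
    List<Integer> indexes = new ArrayList<>();
    for (String field : projectedFields) {
      int index = inputColumns.indexOf(field);
      if (index < 0) {
        throw new IllegalArgumentException(
            "field '" + field + "' not found in input columns " + inputColumns);
      }
      indexes.add(index);
    }
    return indexes;
  }

  public static void main(String[] args) {
    // Resolves fine against a real schema.
    System.out.println(resolve(List.of("a", "b"), List.of("b", "a")));
    try {
      // A 0-column input cannot satisfy a projection of 'a'.
      resolve(List.of(), List.of("a"));
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```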

I'm open to discussion. We have two options here:

  1. Since we're using a strict-schema engine, we simply throw an exception. This option is simple but may behave inconsistently with other pipeline languages.
  2. We add logic that uses empty values such as LogicalValues[[]] to handle the edge cases. A better approach is probably to determine whether the leaf of the subsearch RelNode is an empty values node, but that still leaves the problem of building the RelNode successfully.
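
The leaf check mentioned in option 2 could look roughly like the sketch below. It is an assumption-laden illustration rather than the PR's code: `Node` and `EmptyValues` are hypothetical stand-ins for plan nodes, and each node is assumed to have at most one child, which would not hold for joins.

```java
/** Illustrative only: walks a single-child plan chain down to its leaf. */
final class EmptyLeafSketch {

  /** Hypothetical plan node with at most one child. */
  static class Node {
    final String name;
    final Node child; // null for a leaf

    Node(String name, Node child) {
      this.name = name;
      this.child = child;
    }
  }

  /** Hypothetical marker for a 0-row, 0-column source. */
  static final class EmptyValues extends Node {
    EmptyValues() {
      super("EmptyValues", null);
    }
  }

  /** Returns true when the leaf of the plan is an empty values node. */
  static boolean leafIsEmptyValues(Node plan) {
    Node current = plan;
    while (current.child != null) {
      current = current.child;
    }
    return current instanceof EmptyValues;
  }

  public static void main(String[] args) {
    Node emptySubsearch = new Node("fields a, b, c", new EmptyValues());
    Node normalSubsearch = new Node("where DEPTNO = 20", new Node("source=EMP", null));
    System.out.println(leafIsEmptyValues(emptySubsearch));  // true
    System.out.println(leafIsEmptyValues(normalSubsearch)); // false
  }
}
```

With such a check, the caller could skip building the subsearch plan entirely when its leaf is empty, which avoids asking the planner to resolve fields against a 0-column input.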

Contributor Author

I refactored the code a bit to avoid the confusing name searchPlan.

Contributor Author

I chose to throw an exception for empty subsearch input for now, because supporting it requires some dirty work to preprocess the AST for a use case that is not very useful. Rejecting it with an exception is cleaner.

Member

> I chose to throw an exception for empty subsearch input for now, because supporting it requires some dirty work to preprocess the AST for a use case that is not very useful. Rejecting it with an exception is cleaner.

@songkant-aws I think the previous implementation looks good to me, except for the confusing name searchPlan. Maybe we should bring it back under another name, such as emptySource.

Contributor Author

I restored the empty source support, including for the nested case.

return selectedColumns;
}

static boolean isEmptyValuesPlan(UnresolvedPlan plan) {
Member

This PlanUtils is actually for Calcite RelNode.

Contributor Author

It's used to handle the edge case of determining the RelNode in a subsearch, as described in #4123 (comment). But I think it could be moved to AstBuilder as well.

Comment on lines 1187 to 1188
if (PlanUtils.isEmptyValuesPlan(node.getSearchPlan())) {
  node.getSearchPlan().accept(this, context);
Member

Can you explain what empty values are? I didn't see any IT for this.
IMO, the searchPlan is a child of the node, and you have already visited its children in step 1.

Contributor Author

See my explanation in: #4123 (comment)

@Test
public void testAppendEmptySearchCommand() {
  List<String> testPPLs =
      Arrays.asList("source=EMP | append [ | where DEPTNO = 20 ]", "source=EMP | append [ ]");
Member

Is supporting append [ ] on purpose?

Contributor Author

Yes. It's on purpose, to test the edge case of appending a 0-row, 0-column subresult. See my explanation in: #4123 (comment)

Member

Got it. Please support the corner cases append [ ] and append [ | fields a,b,c ].

Contributor Author

Done

Signed-off-by: Songkan Tang <songkant@amazon.com>

// TODO: Revisit lookup logic here but for now we don't see use case yet
@Override
public UnresolvedPlan visitLookup(Lookup node, Void context) {
Member

None of the tests/ITs cover the visitLookup and visitJoin you added here. Please add tests/ITs for an empty source in lookup/join.

Contributor Author

@songkant-aws songkant-aws Sep 4, 2025

Added tests. For now, our join and lookup command syntax requires a searchCommand in the right child, so I can't cover the case of an empty source in the right child. Test cases for an empty source in the left child are added.

"user/ppl/cmd/showdatasources.rst",
"user/ppl/cmd/information_schema.rst",
"user/ppl/cmd/eval.rst",
"user/ppl/cmd/fillnull.rst",
Member

Duplicate of L63.

Contributor Author

Removed the duplicate line.

…subsearch

Signed-off-by: Songkan Tang <songkant@amazon.com>
@LantaoJin
Member

@songkant-aws please fix the conflicts. @qianheng-aws @yuancu could one of you take another look once the conflicts are resolved?

@yuancu yuancu merged commit 9cd1f96 into opensearch-project:main Sep 5, 2025
31 of 32 checks passed
songkant-aws added a commit to songkant-aws/sql that referenced this pull request Sep 8, 2025
* Implement Append Command

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Fix spotless check

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Rephrase append.rst

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Support subsearch different index for append command

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Fix some tests and add cross cluster IT

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Not support empty subsearch input for now

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Fix doctest

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Support empty source edge case

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Fix anonymizer tests

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Add missing test cases for nested join or lookup command in appended subsearch

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Fix compile issue

Signed-off-by: Songkan Tang <songkant@amazon.com>

---------

Signed-off-by: Songkan Tang <songkant@amazon.com>
LantaoJin pushed a commit that referenced this pull request Sep 10, 2025
joshuali925 pushed a commit that referenced this pull request Sep 16, 2025
joshuali925 pushed a commit that referenced this pull request Sep 24, 2025
* Doc enhancement for eventstats and bin command (#4117)

* distinct_count doc for eventstats

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* doc enhancement

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* add fields for consistency between different Java versions

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* remove changes

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* add bin to index.rst

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* add link

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* fix

Signed-off-by: Kai Huang <ahkcs@amazon.com>

---------

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* Implement `Append` command with Calcite (#4123)

* Implement Append Command

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Fix spotless check

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Rephrase append.rst

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Support subsearch different index for append command

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Fix some tests and add cross cluster IT

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Not support empty subsearch input for now

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Fix doctest

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Support empty source edge case

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Fix anonymizer tests

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Add missing test cases for nested join or lookup command in appended subsearch

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Fix compile issue

Signed-off-by: Songkan Tang <songkant@amazon.com>

---------

Signed-off-by: Songkan Tang <songkant@amazon.com>

* `Bin` command big5 queries (#4163)

* Bin command big5 queries

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* update IT

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* fix

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* remove tests

Signed-off-by: Kai Huang <ahkcs@amazon.com>

---------

Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <105710027+ahkcs@users.noreply.github.com>

* Don't recreate indices on every test (#4222)

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>

* Enable pushdown optimization for filtered aggregation (#4213)

* Enable filtered aggregation pushdown

Signed-off-by: Chen Dai <daichen@amazon.com>

* Add basic UT and ignore IT for now

Signed-off-by: Chen Dai <daichen@amazon.com>

* Enable aggregate case to filter rule and fix UT and IT

Signed-off-by: Chen Dai <daichen@amazon.com>

* Add expected json file for no pushdown test

Signed-off-by: Chen Dai <daichen@amazon.com>

* Remove unnecessary aggregate case to filter rule

Signed-off-by: Chen Dai <daichen@amazon.com>

* Add UT for IS_TRUE support

Signed-off-by: Chen Dai <daichen@amazon.com>

* Add more explain IT

Signed-off-by: Chen Dai <daichen@amazon.com>

* Refactor UT

Signed-off-by: Chen Dai <daichen@amazon.com>

* Extract aggregate filter analyzer abstraction

Signed-off-by: Chen Dai <daichen@amazon.com>

* Add more UT

Signed-off-by: Chen Dai <daichen@amazon.com>

* Refactor UT with fluent API

Signed-off-by: Chen Dai <daichen@amazon.com>

* Add UT for distinct count

Signed-off-by: Chen Dai <daichen@amazon.com>

* Address comment by adding UT for script filter pushdown

Signed-off-by: Chen Dai <daichen@amazon.com>

* Fix spotless

Signed-off-by: Chen Dai <daichen@amazon.com>

---------

Signed-off-by: Chen Dai <daichen@amazon.com>

* Split up our test actions into unit, integ, and doctest. (#4193)

* Run unit test suites in parallel

Signed-off-by: Simeon Widdis <sawiddis@gmail.com>

* Split out our test actions

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>

* Make unit test step run in parallel

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>

* Fix removed bwc tests

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>

* Add another missing parallel flag

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>

---------

Signed-off-by: Simeon Widdis <sawiddis@gmail.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>

* [Feature] Core Implementation of `rex` Command In PPL (#4109)

* rex - initial implementation

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* stop using utils

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* fix spotless check

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* offset_field - initial implementation

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* max_match - initial implementation

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* sed - initial implementation

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* fix name capture group for extraction

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* add rex rst doc

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* IT - initial setup

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* add a analyzer test for legacy engine

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* Add UT for rex

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* sed - add pushdown for sed and explain IT and IT with fix

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* anonymizer - add rex for anonymizer and test

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* Add cross cluster IT for rex

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* peng - resolve comments for rst doc 0

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* peng - address some comments 1

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* peng - resolve comment in rst doc to add a java doc link

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* kai - modify the bin ast builder test

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* peng - fix the extraction behavior without filter even when there is zero match

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* fix rex explain no pushdown

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* change the offset val output format

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* fix rst file

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* peng - SWITCH TO USE CALCITE NATIVE OPERATORS

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* Peng - fix tests after operator change

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* support mode=extract and update doc

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* fix the issue after rebase

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* peng - enforce specifying field in antlr for now

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* relocate rex cmd IT

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* peng - simplify vistFunciton

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* peng - add UT for RexExtractMultiFunction

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* peng - add UT RexOffsetFunction

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* fix some tests

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* DECOUPLE SED + OFFSET FIELD

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* Improve error handling for extract

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* add this rex rst into index

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* fix return type in extract multi

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* add rex doc into doc test

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* fix doc test

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* Fix linting

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* fix rebase issue

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* fix regex anonymizer tests

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* fix analyzer test and setup to use util function

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* lint fix

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* fix doc test

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* Add max match limit implementation

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* fix anonymizer test

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* peng - simplify if

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* peng - make extract multi to only handle the case of max_match > 1

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

---------

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* Add wildcard support for rename command (#4019)

* add wildcard support for rename

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* fix calcite wildcard support and add tests

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* fix formatting

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* add check to analyzer

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* update doc formatting

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* remove v2 engine wildcard support

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* update doc

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* fix formatting

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* support cascading rename

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* update formatting

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* add cross cluster test

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* add test for cascading rename

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* fix formatting

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* add test for cascading rename

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* change behavior for renaming existing fields

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* add tests and update docs

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* update docs

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* update docs

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* fix renaming to same name

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* fix behavior for consecutive wildcards/address comments

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* add back import

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* fix doc

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* fix doc

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* fix formatting

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

---------

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
Signed-off-by: ritvibhatt <53196324+ritvibhatt@users.noreply.github.com>

* Add support for `median(<value>)` (#4234)

* First revision

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

* Fixing documentation

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

* Removing unnecessary comments

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

* Fixinf stats.rst documentation

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

* Fixing documentation

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

* Addressing comments

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

---------

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <900908alvarezaaron@gmail.com>
Co-authored-by: Aaron Alvarez <aaarone@amazon.com>

* Dynamic source selector (#4116)

Signed-off-by: Vamsi Manohar <reddyvam@amazon.com>

* Add gitignore (#4258)

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Support join field list and join options (#3803)

* Support join field list and join options

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Add SPL-compatible syntax setting

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Revert SPL settings

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Fix IT

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Fix IT

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Support max=n option

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* support max=n in sql-like join syntax

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Add Explain IT for new join syntax

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Refactor the user doc

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Fix conflicts

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Fix conflicts

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Disable the collapse pushdown

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* refactor

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Fix IT

Signed-off-by: Lantao Jin <ltjin@amazon.com>

---------

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Support first/last aggregate functions for PPL (#4223)

* Support first/last aggregation functions for PPL

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* Support null

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* remove legacy

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* update doc

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* fix doctest

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* fix stats.rst file

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* fixes

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* move pushdown logic to AggregateAnalyzer

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* fix IT and update null handling

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* add test cases for null handling

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* handle parallelism

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* Simplify CalciteExplainIT and add UT for AggregateAnalyzer

Signed-off-by: Kai Huang <ahkcs@amazon.com>

# Conflicts:
#	opensearch/src/test/java/org/opensearch/sql/opensearch/request/AggregateAnalyzerTest.java

* fixes

Signed-off-by: Kai Huang <ahkcs@amazon.com>

---------

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* Fix gitignore to ignore symbolic link (#4263)

add comment

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Push down limit operator into aggregation bucket size (#4228)

* Push down limit operator into aggregation bucket size

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Fix IT

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Fix robust issue in OpenSearchLimitIndexScanRule

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Refine comments

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Fix the IT issue caused by merging conflict (#4270)

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Print links to test logs after integTest (#4273)

* Print links to test logs after integTest

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* print even when tets failed

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

---------

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* [Feature] Implementation of mode `sed` and `offset_field` in rex PPL command (#4241)

* [Feature] Implementation of mode sed and offset_field in rex PPL command

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* update rex rst doc

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* chen - address comment and merge grammar in parser

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* chen - limit offset field only in extraction mode

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* chen - specify exception type of o_f UDF

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* chen - add exception type of o_f UDF - 2

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* chen - add exception type of o_f UDF - also fix the test

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* chen - alphabetical order of o_f return

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

---------

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* Add earliest/latest aggregate function for eventstats PPL command (#4212)

* Add earliest/latest aggregate function for eventstats command

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* update docs

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Minor refactoring

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Fix doctest

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Simplify logics

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Revert visitWindowFunction

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Add sort to some examples

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Refactor tests

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Fix argument validation error (WIP)

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Add argument validation for window functions

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Fix validation

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Fix tests

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Fix tests and refactor

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Fix test

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Fix merge issue

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

---------

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Speed up aggregation pushdown for single group-by expression (#3550)

* Speed up aggregation pushdown for single group-by expression

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Add configs nullable_bucket

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Fix IT

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* revert typo

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Fix conflicts error

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* fix unit tests

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Fix order

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Fix UT

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Fix UT in windows

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* fix compile error of conflicts

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Add more ITs after merging push down limit to agg buckets

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* fix IT

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* address comments

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Clear sorts in source builder for aggregation pushdown

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Delete the TODO of v2, it's resolved now

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* fix doctest

Signed-off-by: Lantao Jin <ltjin@amazon.com>

---------

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Introduce YAML formatter for better testing/debugging (#4274)

* Implement YamlFormatter

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Enable YAML based plan comparison in tests

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Fix line break issue in Windows

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Minor fix in test case

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Fix line break issue

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* Fix comment

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

---------

Signed-off-by: Tomoyuki Morita <moritato@amazon.com>

* doctest: Use 1.0 branch instead of main (#4219)

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>

* Fix doctest (#4292)

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>

* Search Command Revamp (#4152)

Signed-off-by: Vamsi Manohar <reddyvam@amazon.com>

* `mvjoin` support in PPL Caclite (#4217)

* mvjoin support in PPL Caclite

Signed-off-by: ps48 <pshenoy36@gmail.com>

* fix texts

Signed-off-by: ps48 <pshenoy36@gmail.com>

* update docs

Signed-off-by: ps48 <pshenoy36@gmail.com>

* update doc examples

Signed-off-by: ps48 <pshenoy36@gmail.com>

* rebase main, update test

Signed-off-by: ps48 <pshenoy36@gmail.com>

* update test with real array fields

Signed-off-by: ps48 <pshenoy36@gmail.com>

* use verifyQueryThrowsException in CalcitePPLFunctionTypeTest

Signed-off-by: ps48 <pshenoy36@gmail.com>

* spotless check fix

Signed-off-by: ps48 <pshenoy36@gmail.com>

* remove string,string registration for mvjoin

Signed-off-by: ps48 <pshenoy36@gmail.com>

* remove string,string test

Signed-off-by: ps48 <pshenoy36@gmail.com>

---------

Signed-off-by: ps48 <pshenoy36@gmail.com>

* strftime function implementation (#4106)

Signed-off-by: Vamsi Manohar <reddyvam@amazon.com>

* Add non-numeric field support for max/min functions (#4281)

* add non-numeric support for max/min

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* fix mixed field behavior

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* update doc

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* update doc

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* update formatting

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* add tests

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* fix formatting

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* empty

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* fix formatting

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* fix

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* support ip type max/min

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* fix formatting

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* use tophitsparser

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* remove v2 explain

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* check for numeric fields for native max/min

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* change names

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

* fix type checking

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

---------

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
Signed-off-by: ritvibhatt <53196324+ritvibhatt@users.noreply.github.com>

* Add `values` stats function with UDAF (#4276)

* Add `values` stats function

Signed-off-by: ps48 <pshenoy36@gmail.com>

* add settings for max values

Signed-off-by: ps48 <pshenoy36@gmail.com>

* update functiontypetest IT

Signed-off-by: ps48 <pshenoy36@gmail.com>

* update documentation for values settings

Signed-off-by: ps48 <pshenoy36@gmail.com>

* update the rst docs, remove settingsholder

Signed-off-by: ps48 <pshenoy36@gmail.com>

* update AST additions

Signed-off-by: ps48 <pshenoy36@gmail.com>

* updated the IT testValuesFunctionGroupBy

Signed-off-by: ps48 <pshenoy36@gmail.com>

---------

Signed-off-by: ps48 <pshenoy36@gmail.com>

* Support ISO8601-formatted string in PPL (#4246)

* Support parsing ISO 8601 datetime format for timestamp value

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Modify tests for ISO 8601 timestamp input

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Add support of iso 8601 date string to date and time

- add an IT for date time comparison with iso 8601 formatted literal

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

---------

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Push down project operator with non-identity projections into scan (#4279)

* Support project push down after aggregation

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Push down project operator with non-identity projections into scan

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Fix IT

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Also changing plan from merging main

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Fix IT

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Fix 4296

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Add spotless precommit hook + license check (#4306)

* Add spotless precommit hook

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>

* Decouple plugin spotless versions + upgrade spotless

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>

* Enable license headers everywhere

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>

* Remove a redundant comment

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>

* Fix removed additional licenses

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>

---------

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>

* Add Ryan as a maintainer (#4257)

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>

* Spotless precommit: apply instead of check (#4320)

* Add merge_group trigger to test workflows (#4216)

* Update grammar files and developer guide (#4301)

* Update grammar files and developer guide

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* fix

Signed-off-by: Kai Huang <ahkcs@amazon.com>

---------

Signed-off-by: Kai Huang <ahkcs@amazon.com>

* Fix geopoint issue in complex data types (#4325)

Signed-off-by: Vamsi Manohar <reddyvam@amazon.com>

* [Doc] Correct the comparison table for rex doc (#4321)

* [Doc] Correct the comparison table for rex doc

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* peng - remove non support feature from comparison table

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

---------

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

* Add splunk to ppl cheat sheet (#3726)

* update with latest ppl commands and function improvement

Signed-off-by: Peng Huo <penghuo@gmail.com>

* Address comments

Signed-off-by: Peng Huo <penghuo@gmail.com>

---------

Signed-off-by: Peng Huo <penghuo@gmail.com>

* Date/Time based Span aggregation should never present a null bucket (#4327)

* Updating coalesce documentation (#4305)

Co-authored-by: Aaron Alvarez <aaarone@amazon.com>

* Support serializing & deserializing UDTs when pushing down scripts (#4245)

* Support serializing & deserializing UDTs

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Update explain ITs

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Push down UDT types as string types for comparison operators

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Separate test cases and add an ignored IT

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Correct the handling of UDT in CalciteScriptEngine by substituting calcite's type factory with OpenSearchTypeFactory

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Fix deserialization for IP

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Remove testExplainPushDownScriptsContainingUDT in v2

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Enable testLimitAfterAggregation in CalcitePPLAggregationIT

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Unit test serialize map and array types

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Fix deeper level deserialization of UDTs

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Add a yaml test for issue 4322

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Add a test case for issue 4340

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Remove redundant classes

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

---------

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* change Anonymizer to mask PPL (#4352)

* change Anonymizer

Signed-off-by: xinyual <xinyual@amazon.com>

* fix case

Signed-off-by: xinyual <xinyual@amazon.com>

---------

Signed-off-by: xinyual <xinyual@amazon.com>

* [Feature][Enhancement] Enhance patterns command with additional sample_logs output field (#4155)

* Enhance patterns command with additional sample_logs output field

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Reorder agg fields for simple_pattern

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Test fix after previous fix to not drop group by list

Signed-off-by: Songkan Tang <songkant@amazon.com>

---------

Signed-off-by: Songkan Tang <songkant@amazon.com>

* Optimize count aggregation performance by utilizing native doc_count in v3 (#4337)

* Optimize bucket aggregation performance by utilizing native doc_count in v3

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* fix UT

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Fix issue of count(FIELD)

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* fix comments

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Fix typo

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* revert the doc_count pushdown for count(FIELD) by EXPR

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Support pushdown count aggregation in no bucket aggregation to hits.total.value

Signed-off-by: Lantao Jin <ltjin@amazon.com>

---------

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* No index found with given index pattern should throw IndexNotFoundException (#4369)

* No index found with given index pattern should throw IndexNotFoundException

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Add UT

Signed-off-by: Lantao Jin <ltjin@amazon.com>

---------

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Push down stats with bins on time field into auto_date_histogram (#4329)

* Push down stats with bins on time field into auto_date_histogram

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Prevent pushing down multiple group-by with bins in advance.

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Remove useless code

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Fix IT after merging main

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Kai Huang <105710027+ahkcs@users.noreply.github.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Chen Dai <daichen@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@gmail.com>
Signed-off-by: Jialiang Liang <jiallian@amazon.com>
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
Signed-off-by: ritvibhatt <53196324+ritvibhatt@users.noreply.github.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <900908alvarezaaron@gmail.com>
Signed-off-by: Vamsi Manohar <reddyvam@amazon.com>
Signed-off-by: Tomoyuki Morita <moritato@amazon.com>
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
Signed-off-by: ps48 <pshenoy36@gmail.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Peng Huo <penghuo@gmail.com>
Signed-off-by: xinyual <xinyual@amazon.com>
Co-authored-by: Kai Huang <105710027+ahkcs@users.noreply.github.com>
Co-authored-by: Songkan Tang <songkant@amazon.com>
Co-authored-by: Simeon Widdis <sawiddis@gmail.com>
Co-authored-by: Chen Dai <daichen@amazon.com>
Co-authored-by: Jialiang Liang <jiallian@amazon.com>
Co-authored-by: ritvibhatt <53196324+ritvibhatt@users.noreply.github.com>
Co-authored-by: Aaron Alvarez <900908alvarezaaron@gmail.com>
Co-authored-by: Aaron Alvarez <aaarone@amazon.com>
Co-authored-by: Vamsi Manohar <reddyvam@amazon.com>
Co-authored-by: Tomoyuki MORITA <moritato@amazon.com>
Co-authored-by: Lantao Jin <ltjin@amazon.com>
Co-authored-by: qianheng <qianheng@amazon.com>
Co-authored-by: Shenoy Pratik <sgguruda@amazon.com>
Co-authored-by: Yuanchun Shen <yuanchu@amazon.com>
Co-authored-by: Peng Huo <penghuo@gmail.com>
Co-authored-by: Xinyuan Lu <xinyual@amazon.com>
@songkant-aws songkant-aws deleted the append-command branch October 9, 2025 02:44

Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

[RFC] Support Append command in PPL
