Skip to content

Conversation

@yuancu
Copy link
Owner

@yuancu yuancu commented Jul 17, 2025

Description

Note: this PR is copied from that of opensearch sql repo. It is to let copilot review the code first.

Implement the reverse command in PPL to flip the result order of records.
Syntax

| [commands] | reverse | [commands]
reverse takes no arguments.
It can appear anywhere in the pipeline including the end, and will reverse the order of rows produced up to that point.
Unlike sort, which allows field-based sorting, reverse simply inverts the row order as is.

Semantics

Expected Behavior:

Action: Reverses the display order of all rows in the current result set
Scope: Operates on the entire result set at the point where reverse appears in the pipeline
Data Preservation: Does not modify field values or schema

Implementation Approach:
Assigns sequential numbers to each row, then sorts by those numbers in descending order to achieve proper reversal regardless of existing sort operations.
Example Queries

-- Reverse original ingestion order
source=accounts | reverse

-- Reverse after sorting by age (youngest to oldest → oldest to youngest)
source=accounts | sort age | reverse

-- Get the last 5 records in the original order
source=accounts | reverse | head 5

-- Reverse after filtering and projecting fields
source=accounts | fields name, age | reverse

Restrictions / Limitations

Calcite Engine Only: Requires plugins.query.calcite.enabled=true
Legacy Engine: Throws UnsupportedOperationException when Calcite is disabled
Memory Overhead: Uses row numbering and sort which may impact performance on very large result sets

Selina Song added 5 commits July 10, 2025 09:51
Signed-off-by: Selina Song <selsong@amazon.com>
Signed-off-by: Selina Song <selsong@amazon.com>
Signed-off-by: Selina Song <selsong@amazon.com>
Signed-off-by: Selina Song <selsong@amazon.com>
@yuancu yuancu requested a review from Copilot July 17, 2025 02:59
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds support for the PPL reverse command by extending the parser, AST, planning, and execution layers, and provides end-to-end tests and documentation.

  • Define new Reverse AST node and wire it through the ANTLR grammar and visitor infrastructure
  • Implement Calcite planning logic to assign row numbers, sort descending, and strip the helper column
  • Update optimizer rules to avoid pushing down limits when reversing and add unit/integration tests plus docs

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
CalcitePPLReverseTest.java Unit tests covering reverse behavior under various pipelines
PPLQueryDataAnonymizer.java Anonymizer formatting for reverse in EXPLAIN output
AstBuilder.java Parser rule and AST creation for reverse
OpenSearchPPLParser.g4 Grammar addition for reverseCommand
OpenSearchLimitIndexScanRule.java Skip limit pushdown when reverse is present
CalciteReverseCommandIT.java Integration tests against a real index
reverse.rst User documentation for the reverse command
CalciteRelNodeVisitor.java Calcite plan generation for reverse
Reverse.java Definition of the Reverse AST node
AbstractNodeVisitor.java Visitor interface updated with visitReverse
Analyzer.java Semantic analysis error when Calcite is disabled
Comments suppressed due to low confidence (1)

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLReverseTest.java:210

  • [nitpick] Indentation for the text block in testReverseMultipleSortsSplEquivalent differs from other tests. Align the content indentation consistently so that the expected string does not include unintended leading whitespace.
        String expectedResult = """

Comment on lines +54 to +56
// Check if the sort has a row_number function in its digest
String digest = sort.getDigest();
return digest != null && digest.contains("row_number");
Copy link

Copilot AI Jul 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current implementation of hasRowNumberFunction relies on searching for 'row_number' in the sort digest, which is brittle. Consider inspecting the sort node's collation fields or checking for a preceding Project that introduces the 'reverse_row_num' column instead of parsing the digest string.

Suggested change
// Check if the sort has a row_number function in its digest
String digest = sort.getDigest();
return digest != null && digest.contains("row_number");
// Check collation fields for row_number function
if (sort.getCollation() != null) {
return sort.getCollation().getFieldCollations().stream()
.anyMatch(fieldCollation -> fieldCollation.getFieldName().equalsIgnoreCase("row_number"));
}
// Check for preceding Project node introducing __reverse_row_num__ column
if (sort.getInput() instanceof LogicalProject) {
LogicalProject project = (LogicalProject) sort.getInput();
return project.getProjects().stream()
.anyMatch(rexNode -> rexNode.toString().contains("__reverse_row_num__"));
}
return false;

Copilot uses AI. Check for mistakes.
Comment on lines +364 to +369
.over().rowsTo(RexWindowBounds.CURRENT_ROW).as("__reverse_row_num__");
context.relBuilder.projectPlus(rowNumber);
// Sort by row number descending
context.relBuilder.sort(context.relBuilder.desc(context.relBuilder.field("__reverse_row_num__")));
// Remove row number column
context.relBuilder.projectExcept(context.relBuilder.field("__reverse_row_num__"));
Copy link

Copilot AI Jul 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alias 'reverse_row_num' is used as a magic string. Extract this into a shared constant to improve maintainability and avoid typos when referencing it in multiple places.

Suggested change
.over().rowsTo(RexWindowBounds.CURRENT_ROW).as("__reverse_row_num__");
context.relBuilder.projectPlus(rowNumber);
// Sort by row number descending
context.relBuilder.sort(context.relBuilder.desc(context.relBuilder.field("__reverse_row_num__")));
// Remove row number column
context.relBuilder.projectExcept(context.relBuilder.field("__reverse_row_num__"));
.over().rowsTo(RexWindowBounds.CURRENT_ROW).as(REVERSE_ROW_NUM_ALIAS);
context.relBuilder.projectPlus(rowNumber);
// Sort by row number descending
context.relBuilder.sort(context.relBuilder.desc(context.relBuilder.field(REVERSE_ROW_NUM_ALIAS)));
// Remove row number column
context.relBuilder.projectExcept(context.relBuilder.field(REVERSE_ROW_NUM_ALIAS));

Copilot uses AI. Check for mistakes.
@yuancu yuancu closed this Jul 31, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants