[RFC] Add `reverse` Command to PPL (Calcite Engine)

**Problem Statement**
In many analytical and observability queries, users want to inspect data in reverse order, for example:
- Viewing the most recent log lines last instead of first.
- Extracting the last N rows in the dataset.
- Reversing the effect of a sort without re-specifying fields.
Currently, PPL does not support a way to reverse the order of results in a query pipeline. This limits query flexibility and forces users to use workarounds or rely on external sorting.

**Current State**
- PPL supports `sort`, which orders records by specified fields, but does not offer a simple way to reverse record order without explicitly specifying sort criteria again
- There is no current native support for a reverse operation in either Calcite or Spark engines

**Long-Term Goals**
- Provide a simple, intuitive mechanism (reverse) for users to invert row order in a PPL pipeline.
- Establish clear guidelines and patterns for lightweight, non-parametric commands in PPL.
- Avoid introducing performance bottlenecks by enforcing pushdown optimization or limiting use cases on large datasets.

**Proposal**
#### Syntax
```
<source> | [commands] | reverse | [commands]
```
- `reverse` takes no arguments.
- It can appear anywhere in the pipeline including the end, and will reverse the order of rows produced up to that point.
- Unlike `sort`, which allows field-based sorting, `reverse` simply inverts the row order as is.

#### Semantics
**Expected Behavior:**
- **Action**: Reverses the display order of all rows in the current result set
- **Scope**: Operates on the entire result set at the point where `reverse` appears in the pipeline
- **Data Preservation**: Does not modify field values or schema

**Alternative**
- **`sort -` requires field**: helpful but doesn't support reversing arbitrary pipeline outputs, and would require re-specifying field names.
- `reverse` works even when no sort is specified, or when you're working with filtered, projected, or unsorted data.

**Implementation Discussion**
**_Default Implementation:_** To implement reverse, we insert a logical sort with a synthetic column based on `ROW_NUMBER()`:
```
LogicalSort(reverse_row_num DESC)
  Project(..., reverse_row_num = ROW_NUMBER() OVER ())
    [upstream operator]
```
_**Pushdown Implementation: (TBD, will be in 2nd followup PR)**_ 
The optimization is triggered when the logical plan has this exact shape:
```
LogicalSort (reverse sort)
  CalciteLogicalIndexScan
```
Conditions for pushdown [not implemented yet]:
- Reverse should be able to be pushed down when preceded by sort by flipping the direction of the sort from asc to desc or vice versa.
- However, pushdown is disabled if a filter or limit appears before reverse, or if the reverse requires a local fallback implementation using ROW_NUMBER().

**Example Pushdown Optimization:** 
```
curl -X POST "localhost:9200/_plugins/_ppl/_explain" \
-H "Content-Type: application/json" \
-d '{
  "query": "source = employees | reverse | head 5"
}'
```
If reverse appears directly after the source, it matches this optimized pattern:
```
Sort DESC (fetch=5)
  IndexScan
```
This enables pushdown of the `reverse` and `head` commands into the scan phase.
```
   LogicalProject(skills=[$0], performance_score=[$1], level=[$2], employee_id=[$3], name=[$4], hire_date=[$5], department=[$6], remote=[$7], salary=[$8], age=[$9], email=[$10])
  LogicalSort(sort0=[$11], dir0=[DESC], fetch=[5])
    CalciteLogicalIndexScan(table=[[OpenSearch, employees]])
```                        

#### Engine Support
- Currently implemented only in Calcite engine, not supported in Spark engine
- Once [FEATURE] Export PPL-Calcite Engine as Reusable Library ([#3734](https://github.com/opensearch-project/sql/issues/3734)) is completed, `reverse` can be reused in the Spark engine.

#### Performance
- When reverse follows a sort on large datasets (ex: 20K) the sort is applied to all 20K values, then the reverse is applied to all 20K values properly, then on that final result OpenSearch returns a limited 10K values due to the automatically imposed limit, as if head 10000 was applied after the reverse. 

#### Open Question: Semantics When `sort` Precedes `reverse` 
**Query:**
```
curl -X POST "localhost:9200/_plugins/_ppl/_explain" \
-H "Content-Type: application/json" \
-d '{
  "query": "source = employees | sort salary | reverse"
}'
```
**Logical Plan:**
```
LogicalProject(...)
  LogicalSort(sort0=[$17], dir0=[DESC])
    LogicalProject(..., reverse_row_num=[ROW_NUMBER() OVER ()])
      LogicalSort(sort0=[$8], dir0=[ASC-nulls-first])
        CalciteLogicalIndexScan(table=[[OpenSearch, employees]])
```
**Answer:** The first result returned is the highest salary as expected, which is 20000 in this case. 
source = employees | sort salary | reverse | head 1 will return the item with salary = 20000. We delay head or limit pushdown until after reverse, ensuring the entire result set is available to reverse when appropriate, avoiding truncated or misleading results. 

**Open Question**
- Should `reverse` pushdown be disabled or restricted when preceded by `sort` or multiple sorts to avoid partial results and incorrect output?
- Should usage of `reverse` be limited to queries bounded by `head` or similar limits?
- Should a configuration flag exist to disable or warn about `reverse` usage on large unbounded result sets?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RFC] Add `reverse` Command to PPL (Calcite Engine) #3873

Syntax

Semantics

Engine Support

Performance

Open Question: Semantics When `sort` Precedes `reverse`

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[RFC] Add reverse Command to PPL (Calcite Engine) #3873

Description

Syntax

Semantics

Engine Support

Performance

Open Question: Semantics When sort Precedes reverse

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

[RFC] Add `reverse` Command to PPL (Calcite Engine) #3873

Open Question: Semantics When `sort` Precedes `reverse`