Skip to content

[BUG] stats command fails when PERCENTILE() or MEDIAN() is combined with MIN() or MAX() #4460

@alexey-temnikov

Description

@alexey-temnikov

Query Information

PPL Command/Query:

source=opensearch_dashboards_sample_data_logs 
| eval RamMB=CEIL(machine.ram/1000000.0) 
| stats bucket_nullable=false MIN(RamMB) as Min_RAM_MB, percentile(RamMB, 80) as P80_RAM_MB

Expected Result:
The query should return both the minimum value and the 80th percentile of the RamMB field in a single result row, similar to how MIN and MAX work together.

Actual Result:

{
  "error": {
    "reason": "There was internal problem at backend",
    "details": "java.sql.SQLException: exception while executing query: Index 1 out of bounds for length 0",
    "type": "RuntimeException"
  },
  "status": 500
}

Dataset Information

Dataset/Schema Type

  • OpenTelemetry (OTEL)
  • Simple Schema for Observability (SS4O)
  • Open Cybersecurity Schema Framework (OCSF)
  • Custom (OpenSearch Dashboards sample data)

Index Mapping

{
  "mappings": {
    "properties": {
      "machine": {
        "properties": {
          "ram": { "type": "long" },
          "os": { "type": "text" }
        }
      }
    }
  }
}

Sample Data

{
  "machine": {
    "ram": 8589934592,
    "os": "win 8"
  }
}

Bug Description

Issue Summary:
The stats command fails with an "Index out of bounds" error when combining PERCENTILE() or MEDIAN() functions with standard aggregation functions like MIN() or MAX() in the same stats command.

Steps to Reproduce:

  1. Execute: source=opensearch_dashboards_sample_data_logs | eval RamMB=CEIL(machine.ram/1000000.0) | stats MIN(RamMB), percentile(RamMB, 80)
  2. Observe error: "Index 1 out of bounds for length 0"
  3. Execute with MAX instead: source=opensearch_dashboards_sample_data_logs | eval RamMB=CEIL(machine.ram/1000000.0) | stats MAX(RamMB), percentile(RamMB, 80)
  4. Same error occurs
  5. Execute with MEDIAN: source=opensearch_dashboards_sample_data_logs | eval RamMB=CEIL(machine.ram/1000000.0) | stats MIN(RamMB), median(RamMB)
  6. Same error occurs

Verified Working Scenarios:

  • stats MIN(RamMB), MAX(RamMB) - ✅ Works
  • stats percentile(RamMB, 80) - ✅ Works
  • stats median(RamMB) - ✅ Works

Impact:
Users cannot combine approximate aggregation functions (percentile, median) with standard aggregations (MIN, MAX) in a single stats command, requiring multiple separate queries and manual result merging.

Environment Information

OpenSearch Version:
OpenSearch 3.3.0-SNAPSHOT

Additional Details:

  • Bug affects both PPL percentile() and median() functions
  • Bug occurs regardless of aggregation order
  • No test coverage exists for combining percentile/median with other aggregations

Tentative Root Cause Analysis

Very preliminary, the issue appears to be in the aggregation argument processing pipeline:

File: ./opensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.java

Line 486-487:

case PERCENTILE_APPROX -> {
  PercentilesAggregationBuilder aggBuilder =
      helper
          .build(args.getFirst(), AggregationBuilders.percentiles(aggFieldName))
          .percentiles(helper.inferValue(args.get(1), Double.class));  // Line 487

The code assumes args.get(1) exists, but when percentile is combined with other aggregations, the Calcite query optimizer may restructure the arguments differently, resulting in an empty or incorrectly sized args list for the percentile aggregation.

Related File: ./opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/SinglePercentileParser.java

Line 30:

Streams.stream(((Percentiles) agg).iterator()).findFirst().get().getValue());

The .get() call without checking if a value is present could also contribute to failures if the percentile aggregation returns no results.

Tentative Proposed Fix

This is a preliminary analysis and requires further investigation:

  1. Add defensive checks in AggregateAnalyzer.java to validate args list size before accessing indices
  2. Investigate Calcite optimization to understand why args are restructured when multiple aggregations are present
  3. Add null-safety in SinglePercentileParser.java using .orElse() or .orElseThrow() with a descriptive error
  4. Add integration tests in ./integ-test/src/test/java/org/opensearch/sql/ppl/StatsCommandIT.java to cover combinations of percentile/median with other aggregations

Metadata

Metadata

Assignees

Labels

PPLPiped processing languagebugSomething isn't workingcalcitecalcite migration releatedpushdownpushdown related issues

Type

No type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions