Skip to content

[BUG] PPL date comparisons fail on epoch_millis fields - ISO 8601 string sent instead of numeric value #4490

@alexey-temnikov

Description

@alexey-temnikov

Query Information

PPL Command/Query:

source=jaeger-span-2024-05-08 | eval example_time = STR_TO_DATE('1987-02-03 12:34:56', '%Y-%m-%d %H:%i:%S') | where startTimeMillis > example_time

Expected Result:
The query should return all documents where startTimeMillis (epoch milliseconds) is greater than the timestamp.

Actual Result:

{
  "error": {
    "reason": "Error occurred in OpenSearch engine: all shards failed",
    "details": "Shard[0]: OpenSearchParseException[failed to parse date field [1987-02-03T12:34:56.000Z] with format [epoch_millis]: [failed to parse date field [1987-02-03T12:34:56.000Z] with format [epoch_millis]]]; nested: IllegalArgumentException[failed to parse date field [1987-02-03T12:34:56.000Z] with format [epoch_millis]]; nested: DateTimeParseException[Failed to parse with all enclosed parsers]",
    "type": "SearchPhaseExecutionException"
  },
  "status": 400
}

Full Stack Trace:

Caused by: org.opensearch.OpenSearchParseException: failed to parse date field [1987-02-03T12:34:56.000Z] with format [epoch_millis]
	at org.opensearch.common.time.JavaDateMathParser.parseDateTime(JavaDateMathParser.java:247)
	at org.opensearch.common.time.JavaDateMathParser.parse(JavaDateMathParser.java:89)
	at org.opensearch.index.mapper.DateFieldMapper$DateFieldType.parseToLong(DateFieldMapper.java:621)
	at org.opensearch.index.mapper.DateFieldMapper$DateFieldType.isFieldWithinQuery(DateFieldMapper.java:677)
	at org.opensearch.index.query.RangeQueryBuilder.getRelation(RangeQueryBuilder.java:462)
	at org.opensearch.index.query.RangeQueryBuilder.doRewrite(RangeQueryBuilder.java:487)
	at org.opensearch.index.query.AbstractQueryBuilder.rewrite(AbstractQueryBuilder.java:306)
	at org.opensearch.search.builder.SearchSourceBuilder.rewrite(SearchSourceBuilder.java:1187)

Workaround:
Adding head command before the filter prevents query pushdown and executes the comparison in-memory:

source=jaeger-span-2024-05-08 | head 999999 | eval example_time = STR_TO_DATE('1987-02-03 12:34:56', '%Y-%m-%d %H:%i:%S') | where startTimeMillis > example_time

Dataset Information

Dataset/Schema Type

  • OpenTelemetry (OTEL)
  • Simple Schema for Observability (SS4O)
  • Open Cybersecurity Schema Framework (OCSF)
  • Custom (details below)

Index Mapping

{
  "mappings": {
    "properties": {
      "startTimeMillis": {
        "type": "date",
        "format": "epoch_millis"
      },
      "startTime": {
        "type": "long"
      },
      "operationName": {
        "type": "keyword",
        "ignore_above": 256
      }
    }
  }
}

Sample Data

{
  "startTimeMillis": 1715126504378,
  "startTime": 1715126504378733,
  "operationName": "oteldemo.ProductCatalogService/GetProduct"
}

Bug Description

Issue Summary:
PPL queries that compare date fields with epoch_millis format against STR_TO_DATE() results fail when the filter is pushed down to OpenSearch. The query optimizer incorrectly converts the timestamp comparison value to an ISO 8601 string format ("1987-02-03T12:34:56.000Z") instead of epoch milliseconds, causing OpenSearch to reject the query since the field expects numeric epoch_millis values.

Steps to Reproduce:

  1. Create an index with a date field using epoch_millis format:
PUT /test-date-comparison
{
  "mappings": {
    "properties": {
      "timestamp_field": {
        "type": "date",
        "format": "epoch_millis"
      }
    }
  }
}
  1. Insert sample data:
POST /test-date-comparison/_doc
{
  "timestamp_field": 1715126504378
}
  1. Execute PPL query with date comparison:
POST /_plugins/_ppl
{
  "query": "source=test-date-comparison | eval example_time = STR_TO_DATE('1987-02-03 12:34:56', '%Y-%m-%d %H:%i:%S') | where timestamp_field > example_time"
}
  1. Observe the error about failing to parse ISO 8601 date string with epoch_millis format

Impact:

  • Users cannot perform date comparisons on fields with epoch_millis, epoch_second, or epoch_micros formats when using STR_TO_DATE() or other date functions in PPL
  • Workaround requires adding head command which forces in-memory execution, negating query optimization benefits and potentially causing performance issues with large datasets

Environment Information

OpenSearch Version:
OpenSearch 3.3.0-SNAPSHOT (also affects earlier versions)

Additional Details:

Root Cause Analysis

This is a preliminary analysis and requires further investigation.

The stack trace and code analysis reveal the exact failure point:

Failure Flow:

  1. SQL plugin's LuceneQuery.value() method formats the timestamp as ISO 8601 string: "1987-02-03T12:34:56.000Z"
  2. This string is passed to RangeQueryBuilder as the comparison value
  3. During query rewrite, OpenSearch's DateFieldMapper.parseToLong() attempts to parse this string
  4. JavaDateMathParser tries to parse the ISO 8601 string using the field's format (epoch_millis)
  5. Parsing fails because epoch_millis expects a numeric string (e.g., "539179496000"), not an ISO 8601 formatted string

Affected Code:

  • File: opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/filter/lucene/LuceneQuery.java
  • Method: value(ExprValue literal, ExprType fieldType)

Current Logic (Incorrect):

protected Object value(ExprValue literal, ExprType fieldType) {
  if (fieldType instanceof OpenSearchDateType) {
    OpenSearchDateType openSearchDateType = (OpenSearchDateType) fieldType;
    if (literal.type().equals(ExprCoreType.TIMESTAMP)) {
      return openSearchDateType.hasNoFormatter()
          ? literal.timestampValue().toEpochMilli()
          : openSearchDateType.getFormattedDate(literal.timestampValue());  // ← Returns ISO 8601 string
    }
  }
  return literal.value();
}

Problem:
When a field has format epoch_millis, hasNoFormatter() returns false (because formats list is not empty), so the code calls getFormattedDate() which returns an ISO 8601 string. However, OpenSearch expects a numeric value for epoch_millis format.

Explain Plan Evidence:

"query":{"range":{"startTimeMillis":{"from":"1987-02-03T12:34:56.000Z",...}}}

Proposed Solution

This is a preliminary analysis and requires further investigation.

The possible fix could be modify the LuceneQuery.value() method to detect numeric date formats and return numeric values instead of formatted strings.

Proposed Code Change:

protected Object value(ExprValue literal, ExprType fieldType) {
  if (fieldType instanceof OpenSearchDateType) {
    OpenSearchDateType openSearchDateType = (OpenSearchDateType) fieldType;
    if (literal.type().equals(ExprCoreType.TIMESTAMP)) {
      // Return numeric value if field has no formatter OR uses numeric format
      if (openSearchDateType.hasNoFormatter() 
          || openSearchDateType.getNumericNamedFormatters().size() > 0) {
        return literal.timestampValue().toEpochMilli();
      } else {
        return openSearchDateType.getFormattedDate(literal.timestampValue());
      }
    }
    // Similar changes for DATE and TIME types
  }
  return literal.value();
}

Key Changes:

  1. Check if the field uses numeric formats via getNumericNamedFormatters().size() > 0
  2. Return numeric epoch milliseconds for numeric formats
  3. Only use getFormattedDate() for non-numeric date formats

Additional Considerations:

  • For epoch_second format, may need to convert: literal.timestampValue().toEpochMilli() / 1000
  • For epoch_micros format, may need to convert: literal.timestampValue().toEpochMilli() * 1000
  • Need to handle mixed format strings (e.g., "epoch_millis||strict_date_optional_time")

Related Issues

Screenshots

N/A - Error message and stack trace are text-based and included above.

Metadata

Metadata

Assignees

Labels

bugSomething isn't workingpushdownpushdown related issues

Type

No type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions