Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] SQL query doesn't honor date format in OpenSearch index mapping #794

Closed
dai-chen opened this issue Aug 30, 2022 · 9 comments · Fixed by #1314
Closed

[BUG] SQL query doesn't honor date format in OpenSearch index mapping #794

dai-chen opened this issue Aug 30, 2022 · 9 comments · Fixed by #1314
Labels
bug Something isn't working legacy Issues related to legacy query engine to be deprecated Priority-High SQL v2.9.0 v2.9.0

Comments

@dai-chen
Copy link
Collaborator

What is the bug?
It seems SQL query engine doesn't honor what's configured in OpenSearch index mapping for date field. This causes problems in different queries with datetime field involved. See examples below.

How can one reproduce the bug?

As documented, "strict_date_optional_time||epoch_millis" is the default format if not specified in index mapping. The issue happens when custom date format is configured as below. Note that this is mostly due to the gaps between engine v2 and the legacy (which may not have these issues at all).

Issue 1: Datetime literal parsing problem

With epoch_millis format removed in mapping, the previous work query throws exception now. From the error message, it seems caused by epoch timestamp used in DSL translated rather than the only strict_date_optional_time configured. Note that OpenSearch doesn't complain this in any syntax/semantic check (probably due to missing semantic check), but throw exception at execution time instead.

PUT my-index-000002
{
  "mappings": {
    "properties": {
      "date": {
        "type": "date",
        "format": "strict_date_optional_time"
      }
    }
  }
}

PUT my-index-000002/_doc/3
{ "date": "2015-01-01T12:10:30Z" }

POST _plugins/_sql
{
  "query": "SELECT * FROM my-index-000002 WHERE `date` < '2022-08-20 23:59:59.999' "
}

{
  "error": {
    "type": "SearchPhaseExecutionException",
    "reason": "Error occurred in OpenSearch engine: all shards failed",
    "details": "Shard[0]: OpenSearchParseException[failed to parse date field [1661039999999] with format [strict_date_optional_time]: [failed to parse date field [1661039999999] with format [strict_date_optional_time]]]; nested: IllegalArgumentException[failed to parse date field [1661039999999] with format [strict_date_optional_time]]; nested: NotSerializableExceptionWrapper[date_time_parse_exception: Text '1661039999999' could not be parsed at index 0];\n\nFor more details, please send request for Json format to see the raw response from OpenSearch engine."
  },
  "status": 503
}

Issue 2: Datetime value parsing problem

Related: #126, opendistro-for-elasticsearch/sql#1062

No matter what date format configured, OpenSearchExprValueFactory always uses the hardcoding formatter in

. This causes date value parsing error or wrong timezone.

What is the expected behavior?
OpenSearch SQL/PPL should honor the date format in index mapping and parse date value from OpenSearch or date literals in query accordingly.

Currently only data type is returned and associated with field. One approach to improve this is reading datetime format from OpenSearch along with basic field type info. Code:

What is your host/environment?

  • OpenSearch 2.2
  • Plugins: SQL

Do you have any screenshots?
N/A

Do you have any additional context?
Similar issues may apply to PPL as well due to single core engine shared across languages.

@dai-chen dai-chen added bug Something isn't working Severity-Critical SQL legacy Issues related to legacy query engine to be deprecated labels Aug 30, 2022
@Yury-Fridlyand
Copy link
Collaborator

Isn't a duplicate/related to #126?

@dai-chen
Copy link
Collaborator Author

Isn't a duplicate/related to #126?

I added it to Issue 2. Issue 1 is slightly different. I think both has same root cause.

@Yury-Fridlyand
Copy link
Collaborator

Mapping:

{
    "mappings" : {
        "properties" : {
            "key" : {
                "type" : "keyword"
            },
            "val" : {
                "type" : "date",
                "format": "time_no_millis"
            }
        }
    }
}

Data

{ "index" : { "_id" : "1" } }
{"key": "null", "val": null}
{ "index" : { "_id" : "2" } }
{"key": "001: 00:00:00", "val": "08:00:00Z"}
{ "index" : { "_id" : "3" } }
{"key": "002: 00:00:01", "val": "08:00:01Z"}
{ "index" : { "_id" : "4" } }
{"key": "003: 01:00:00", "val": "09:00:00Z"}

Exception stack:

Error happened during query handling
java.lang.IllegalStateException: Construct ExprTimestampValue from "08:00:00Z" failed, unsupported date format.
        at org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory.constructTimestamp(OpenSearchExprValueFactory.java:185) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory.parseTimestamp(OpenSearchExprValueFactory.java:195) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory.parse(OpenSearchExprValueFactory.java:154) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory.lambda$parseStruct$16(OpenSearchExprValueFactory.java:204) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
        at java.util.Iterator.forEachRemaining(Iterator.java:133) ~[?:?]
        at org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory.parseStruct(OpenSearchExprValueFactory.java:203) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory.parse(OpenSearchExprValueFactory.java:149) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory.construct(OpenSearchExprValueFactory.java:122) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.opensearch.response.OpenSearchResponse.lambda$iterator$2(OpenSearchResponse.java:97) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195) ~[?:?]
        at java.util.Spliterators$ArraySpliterator.tryAdvance(Spliterators.java:958) ~[?:?]
        at java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:294) ~[?:?]
        at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206) ~[?:?]
        at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:169) ~[?:?]
        at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300) ~[?:?]
        at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681) ~[?:?]
        at org.opensearch.sql.opensearch.storage.OpenSearchIndexScan.hasNext(OpenSearchIndexScan.java:90) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.opensearch.executor.protector.ResourceMonitorPlan.hasNext(ResourceMonitorPlan.java:74) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.planner.physical.ProjectOperator.hasNext(ProjectOperator.java:51) ~[core-2.4.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.opensearch.executor.OpenSearchExecutionEngine.lambda$execute$0(OpenSearchExecutionEngine.java:39) [opensearch-2.4.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.opensearch.client.OpenSearchNodeClient.schedule(OpenSearchNodeClient.java:157) [opensearch-2.4.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.opensearch.executor.OpenSearchExecutionEngine.execute(OpenSearchExecutionEngine.java:33) [opensearch-2.4.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.sql.SQLService.execute(SQLService.java:66) [sql-2.4.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.legacy.plugin.RestSQLQueryAction.lambda$prepareRequest$1(RestSQLQueryAction.java:123) [legacy-2.4.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.legacy.plugin.RestSqlAction.lambda$prepareRequest$1(RestSqlAction.java:162) [legacy-2.4.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.opensearch.executor.Scheduler.lambda$withCurrentContext$0(Scheduler.java:30) [opensearch-2.4.0.0-SNAPSHOT.jar:?]
        at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:747) [opensearch-2.4.0-SNAPSHOT.jar:2.4.0-SNAPSHOT]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: java.time.format.DateTimeParseException: Text '08:00:00Z' could not be parsed, unparsed text found at index 8
        at java.time.format.DateTimeFormatter.parseResolved0(DateTimeFormatter.java:2049) ~[?:?]
        at java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1874) ~[?:?]
        at org.opensearch.sql.opensearch.data.value.OpenSearchExprValueFactory.constructTimestamp(OpenSearchExprValueFactory.java:182) ~[opensearch-2.4.0.0-SNAPSHOT.jar:?]
        ... 29 more

@MaxKsyunz
Copy link
Collaborator

MaxKsyunz commented Oct 18, 2022

Related issue -- #924. See this comment for root cause.

@Yury-Fridlyand
Copy link
Collaborator

@dai-chen
Copy link
Collaborator Author

dai-chen commented Nov 17, 2022

@Yury-Fridlyand @MaxKsyunz Here are some examples with which users had problem previously:

Index mapping:

PUT /_ingest/pipeline/test-pipeline
{
    "description" : "test-pipeline",
    "processors" : [
      {
        "date" : {
          "output_format" : "strict_date_optional_time",
          "ignore_failure" : false,
          "field" : "transactionDateTime",
          "target_field" : "transactionDateTime",
          "formats" : [
            "yyyy-MM-dd HH:mm:ss.SSS Z"
          ]
        }
      }
    ]
  }

PUT /_index_template/test
{
  "index_patterns" : [
    "test-*"
  ],
  "template": {
      "aliases": {
        "test":{}
      },
      "mappings": {
          "numeric_detection": true,
          "dynamic_date_formats": [
              "strict_date_optional_time",
              "yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z||yyyy-MM-dd HH:mm:ss.SSS Z||strict_date_optional_time"
          ],
          "properties": {
              "transactionDateTime" : {
              "type" : "date"
            }
          }
      }
  }
}

Data:

POST /test-123/_doc
{
    "transactionDateTime": "2022-11-01 03:00:52.000 +0000"
}

Query:

SELECT * FROM test* where transactionDateTime = '2022-06-21 03:00:52.000'

@dai-chen
Copy link
Collaborator Author

@Yury-Fridlyand Are we planning to support this in 2.6.0 release once the PoC # 180 done?

@Yury-Fridlyand
Copy link
Collaborator

@dai-chen, yes

@andremacola
Copy link

I am having the same issue as the author after updating to 2.9.
Can't do any query with dates

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working legacy Issues related to legacy query engine to be deprecated Priority-High SQL v2.9.0 v2.9.0
Projects
None yet
5 participants