Swiddis
Collaborator

@Swiddis Swiddis commented May 8, 2025

Description

Another Legacy-V2 mismatch issue: the Legacy engine doesn't apply the type mappings that V2 applies, which makes its output inconsistent with the mapping described at https://github.com/opensearch-project/sql/blob/main/docs/user/general/datatypes.rst#data-types-mapping. This PR applies that mapping at the serialization stage so that the two engines return consistent data types.
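
As a rough illustration of what applying a type mapping at serialization time could look like, here is a minimal sketch. It is not the code from this PR: the class name `LegacyTypeMapping`, the method `toSqlType`, and the pass-through behaviour for unmapped types are assumptions made for the example. The only mapping entry shown, `date` to `timestamp`, is inferred from the timestamp assertions in this PR's tests; the full table lives in the linked datatypes document.

```java
import java.util.Map;

// Hypothetical sketch: translate an index field type into the SQL type name
// reported in the response schema. Not the actual plugin implementation.
class LegacyTypeMapping {
    // Assumption: only date -> timestamp is shown here; the real mapping
    // table in the datatypes documentation has more entries.
    private static final Map<String, String> INDEX_TO_SQL_TYPE = Map.of("date", "timestamp");

    // Returns the mapped SQL type, or the input unchanged if no mapping is listed.
    static String toSqlType(String indexType) {
        return INDEX_TO_SQL_TYPE.getOrDefault(indexType, indexType);
    }

    public static void main(String[] args) {
        System.out.println(toSqlType("date"));    // timestamp
        System.out.println(toSqlType("keyword")); // keyword (not mapped in this sketch)
    }
}
```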

Related Issues

Resolves #1545
Resolves #3159

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Swiddis added 5 commits May 6, 2025 20:27
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
  {"login_time":"2015-01-01T12:10:30Z"}
  {"index":{"_id":"3"}}
- {"login_time":"1585882955"}
+ {"login_time":"1585882955000"}
Collaborator Author

While writing the reproducers, I found that this value was being parsed as milliseconds rather than seconds, despite its magnitude. That resolves to January 19, 1970 instead of the (probably intended) April 3, 2020, so I updated the row.
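
The arithmetic behind that observation can be checked with a small standalone sketch. The class and method names below are hypothetical and exist only for this example; it uses the standard `java.time` API rather than the plugin's own date parsing, and it assumes UTC.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical demo: the same digits read as epoch milliseconds vs. epoch seconds.
class EpochUnitDemo {
    // Interprets the raw value as milliseconds since the epoch (UTC date).
    static LocalDate asMillis(long raw) {
        return Instant.ofEpochMilli(raw).atZone(ZoneOffset.UTC).toLocalDate();
    }

    // Interprets the raw value as seconds since the epoch (UTC date).
    static LocalDate asSeconds(long raw) {
        return Instant.ofEpochSecond(raw).atZone(ZoneOffset.UTC).toLocalDate();
    }

    public static void main(String[] args) {
        long original = 1585882955L;    // value before the row was updated
        long updated = 1585882955000L;  // value after the row was updated

        System.out.println(asMillis(original));  // 1970-01-19
        System.out.println(asSeconds(original)); // 2020-04-03
        System.out.println(asMillis(updated));   // 2020-04-03
    }
}
```

Appending three zeros makes the millisecond interpretation land on the date that the seconds interpretation would have produced.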

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Collaborator

@RyanL1997 RyanL1997 left a comment

Generally LGTM.

JSONObject result = executeQuery(query);
JSONArray schema = result.getJSONArray("schema");

Assert.assertFalse(schema.isEmpty());
Collaborator

nit - Consider parameterizing the schema type check logic to reduce duplication:

private void assertSchemaTypeIsTimestamp(JSONArray schema) {
    Assert.assertFalse(schema.isEmpty());
    for (int i = 0; i < schema.length(); i++) {
        Assert.assertEquals("timestamp", schema.getJSONObject(i).getString("type"));
    }
}

Collaborator Author

Probably not enough duplication here to warrant it (this is a bit of a special case for type bugs), but the robust solution would be a custom schema assertion: assertSchemaMatches(schema, ["timestamp", "timestamp"]). We would then put it behind a custom assertion error that reports the mismatching data types.
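
A minimal sketch of the custom assertion described above might look like the following. To keep it self-contained it is an assumption-heavy stand-in: the schema is modelled as a `List<Map<String, String>>` rather than the `JSONArray` used in the test, the class name `SchemaAssertions` and the column name `created_at` are hypothetical, and a plain `AssertionError` stands in for a dedicated error type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of the repository's test utilities.
class SchemaAssertions {
    // Compares each column's "type" with the expected type at the same position
    // and reports every mismatch in a single error message.
    static void assertSchemaMatches(List<Map<String, String>> schema, List<String> expectedTypes) {
        if (schema.size() != expectedTypes.size()) {
            throw new AssertionError(
                "Expected " + expectedTypes.size() + " columns but schema has " + schema.size());
        }
        List<String> mismatches = new ArrayList<>();
        for (int i = 0; i < schema.size(); i++) {
            String actual = schema.get(i).get("type");
            String expected = expectedTypes.get(i);
            if (!expected.equals(actual)) {
                mismatches.add("column " + i + " (" + schema.get(i).get("name") + "): expected "
                    + expected + " but was " + actual);
            }
        }
        if (!mismatches.isEmpty()) {
            throw new AssertionError("Schema type mismatch: " + String.join("; ", mismatches));
        }
    }

    public static void main(String[] args) {
        // Illustrative schema: the second column deliberately has the wrong type.
        List<Map<String, String>> schema = List.of(
            Map.of("name", "login_time", "type", "timestamp"),
            Map.of("name", "created_at", "type", "date"));
        try {
            assertSchemaMatches(schema, List.of("timestamp", "timestamp"));
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Collecting all mismatches before throwing means a single failing run shows every column with the wrong type, which is the main advantage over a per-column assertEquals.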

@opensearch-trigger-bot
Contributor

This PR is stalled because it has been open for 30 days with no activity.

penghuo previously approved these changes Aug 20, 2025
RyanL1997 previously approved these changes Aug 20, 2025
RyanL1997 previously approved these changes Sep 17, 2025
Signed-off-by: Simeon Widdis <sawiddis@gmail.com>
@Swiddis Swiddis merged commit b049ac1 into opensearch-project:main Sep 30, 2025
33 checks passed
asifabashar added a commit to asifabashar/sql that referenced this pull request Oct 10, 2025

Labels

  • bug: Something isn't working
  • legacy: Issues related to legacy query engine to be deprecated
  • SQL
  • v3.3.0

Development

Successfully merging this pull request may close these issues.

  • [BUG] timestamp data type handling is inconsistent
  • [BUG] Inconsistent behaviour for date field in nested collection queries

5 participants