Skip to content

[ES|QL] Date nanos implicit casting in union types option #2 #127797

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

fang-xing-esql
Copy link
Member

This is an alternative approach to resolve #110009.

#123678 Does the implicit casting for union typed fields with date and date_nanos by adding an Eval to convert them to a date_nanos attribute.

This PR does the implicit casting for union typed fields with date and date_nanos by converting them to MultiTypeEsField in place in the EsRelation

fang-xing-esql and others added 30 commits February 25, 2025 20:17
A date in the valid date-nanos range should be acceptable to parse. This issue was showing up in a difference in behaviour between Elasticsearch ingest and the CsvTests. This fix makes CsvTests behave the same as Elasticearch itself.
@fang-xing-esql
Copy link
Member Author

Thank you for reviewing @alex-spies @costin !

The changes to Analyzer has been simplified and refactored. This approach converts union-typed fields with date and date_nanos types to MultiTypeEsField in place within the EsRelation without changing the actual field names, the predicates and filters on these MultiTypeEsFields are eligible for pushdown. The changes related to the pushdown, specifically for the cases where the name of a MultiTypeEsField differs from its underlying field name, due to explicit casting, have been removed from this PR, these scenarios can be addressed and optimized separately in a future PR.

@fang-xing-esql fang-xing-esql marked this pull request as ready for review May 27, 2025 13:50
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@fang-xing-esql fang-xing-esql added auto-backport Automatically create backport pull requests when merged v8.19.0 labels Jun 2, 2025
Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @fang-xing-esql ! I have only a final set of remarks, but overall I think this is a very fine PR and let's get this in!

// Data type is different between implicit(date_nanos) and explicit casting, if the conversion is supported, create a
// new MultiTypeEsField with explicit casting type, and add it to unionFieldAttributes.
Set<DataType> supportedTypes = convert.supportedTypes();
if (supportedTypes.contains(fa.dataType())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, this doesn't look fully correct. fa.dataType() for a field attribute containing a MultiTypeEsField is the type we (implicitly) cast to - but the conversion function needs to be compatible with all of the original types.

I realize that "all of the original types" in this case is limited to date + nano in practice and we shouldn't have conversion functions that take nanos but not date.

However, I think it'd be a tad nicer to actually check the original types so we're not building leaky assumptions into this code, which otherwise is agnostic of this detail and would still work if we expanded implicit casting to other type combinations.

Below, we already retrieve the original field from the map returned by MultiTypeEsField#getIndexToConversionExpressions. The original field in each entry should have the original type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally, I included this extra layer of checking, verifying whether the original types could be converted to the explicitly cast types. I later decided to simplify the logic and removed that step.

The main reason was based on a logical assumption: if A can be converted to B, and B can be converted to C, then A should be convertible to C—which reflects the kind of type conversions we're dealing with here.

A second reason, as you’ve noted, is that we're currently only supporting millis and nanos, so the simplified check is sufficient for now.

That said, it's debatable whether our current conversion functions reliably guarantee this transitive property in all cases. So perhaps it's worth reconsidering and reinstating that extra check to make the logic more robust.

Copy link
Contributor

@alex-spies alex-spies Jun 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the transitivity assumption makes sense - I was thinking about cases where B just cannot be cast to C, though, but A may still be able to be cast to C.

Very theoretical as I don't have an example, and this doesn't apply to the current state.

As a nit: I'd slightly prefer to just fully stick with the union type logic, which is

  • "if all the original types can be cast to the explicitly required type, we'll treat this as a regular union type field and skip the implicit cast; otherwise, we cast implicitly during block loading and explicitly at runtime".

The current solution performs both checks, that is: "can both A be cast to C AND B be cast to C"? Which might have a veeeery small chance of doing an unintended double cast failing some otherwise valid queries some time in the future, or at least it's less obviously the same logic as for union types.

Absolutely not a blocker, of course.

@fang-xing-esql
Copy link
Member Author

Thanks a lot @fang-xing-esql ! I have only a final set of remarks, but overall I think this is a very fine PR and let's get this in!

Thanks for the thorough review @alex-spies ! The suggested tests are added and the validation of whether the original data types can be converted to the explicit types are added as well.

Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the iterations and dealing with the very subtle edge cases @fang-xing-esql ! This looks very good to me, let's get it in!

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@fang-xing-esql fang-xing-esql added the ES|QL-ui Impacts ES|QL UI label Jun 5, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/kibana-esql (ES|QL-ui)

@fang-xing-esql fang-xing-esql merged commit 79e600a into elastic:main Jun 5, 2025
18 checks passed
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.19 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 127797

@fang-xing-esql
Copy link
Member Author

💚 All backports created successfully

Status Branch Result
8.19

Questions ?

Please refer to the Backport tool documentation

elasticsearchmachine pushed a commit that referenced this pull request Jun 5, 2025
#129004)

* implicit casting for union typed fields mixed with datetime and date_nanos

(cherry picked from commit 79e600a)

# Conflicts:
#	x-pack/plugin/esql/qa/server/single-node/src/javaRestTest/java/org/elasticsearch/xpack/esql/qa/single_node/RestEsqlIT.java
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/AnalyzerTestUtils.java
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/AnalyzerTests.java
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LocalPhysicalPlanOptimizerTests.java
fang-xing-esql added a commit that referenced this pull request Jun 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged backport pending >enhancement ES|QL-ui Impacts ES|QL UI Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.19.0 v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ESQL] Support autocasting to resolve types between nanosecond dates and millisecond dates
6 participants