[SPARK-37438][SQL] ANSI mode: Use store assignment rules for resolving function invocation #34681
Kubernetes integration test starting
Kubernetes integration test status failure
sql/core/src/test/resources/sql-tests/results/ansi/string-functions.sql.out
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala
Test build #145508 has finished for PR 34681 at commit
@@ -348,6 +290,9 @@ object AnsiTypeCoercion extends TypeCoercionBase {
    // Skip nodes who's children have not been resolved yet.
    case e if !e.childrenResolved => e

    case d @ DateAdd(AnyTimestampType(), _) => d.copy(startDate = Cast(d.startDate, DateType))
We should refactor these functions to extend ImplicitCastInputTypes later.
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #145513 has finished for PR 34681 at commit
Signed-off-by: Karen Feng <karen.feng@databricks.com>
Re-generate golden files
Kubernetes integration test starting
Kubernetes integration test status failure
Merging to master
Test build #145528 has finished for PR 34681 at commit
What changes were proposed in this pull request?
Under ANSI mode (spark.sql.ansi.enabled=true), function invocation in Spark SQL follows the Store assignment rules: input values are stored as the declared parameter type of the SQL function.
Why are the changes needed?
Currently, the ANSI SQL mode resolves function invocation with Least Common Type Resolution, based on the Type precedence list. After a closer look at the ANSI SQL standard, the "store assignment" syntax rules should be used for resolving the type coercion between the inputs and the parameters of a SQL function, while the Type precedence list is used for "Subject routine determination" (SQL function overloads).
I have also analyzed real-world SQL queries. The following implicit function casts are not allowed under Least Common Type Resolution, but they are commonly seen:
CONCAT(DATE_ADD(%1, CAST(%2 AS INT)), SUBSTR(CAST(%1 AS TIMESTAMP), 11)) AS TIMESTAMP)
date_sub(now(), 7) < ...
from_unixtime(updated/1000), note that updated and 1000 will be converted to Double first.
The changes in this PR are ANSI compatible and are good for the adoption of ANSI SQL mode.
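To make the difference between the two resolution strategies concrete, here is a minimal, self-contained sketch in plain Scala. It is a toy model and not Spark's Catalyst code: the `CoercionSketch` object, its small `DataType` hierarchy, the precedence table, and both `allowedBy...` functions are hypothetical names and simplified rules assumed for illustration only.

```scala
// Toy model only: these types and rules are illustrative assumptions,
// not Spark's actual Catalyst classes or its complete coercion rules.
object CoercionSketch {
  sealed trait DataType
  case object IntType extends DataType
  case object DoubleType extends DataType
  case object StringType extends DataType
  case object DateType extends DataType
  case object TimestampType extends DataType

  private val numeric: Set[DataType] = Set(IntType, DoubleType)
  private val datetime: Set[DataType] = Set(DateType, TimestampType)

  // Precedence-list style: an argument may only be widened to a type that
  // appears in its own precedence list (hypothetical lists for this sketch).
  private val precedence: Map[DataType, Seq[DataType]] = Map(
    IntType -> Seq(IntType, DoubleType),
    DoubleType -> Seq(DoubleType),
    DateType -> Seq(DateType, TimestampType),
    TimestampType -> Seq(TimestampType),
    StringType -> Seq(StringType)
  )

  def allowedByPrecedence(from: DataType, to: DataType): Boolean =
    precedence(from).contains(to)

  // Store-assignment style (simplified): numeric-to-numeric,
  // datetime-to-datetime, and anything-to-string are accepted.
  def allowedByStoreAssignment(from: DataType, to: DataType): Boolean =
    from == to ||
      (numeric(from) && numeric(to)) ||
      (datetime(from) && datetime(to)) ||
      to == StringType

  def main(args: Array[String]): Unit = {
    // date_sub(now(), 7): a timestamp argument for a date parameter.
    println(allowedByPrecedence(TimestampType, DateType))
    println(allowedByStoreAssignment(TimestampType, DateType))
    // CONCAT(DATE_ADD(...), ...): a date argument for a string parameter.
    println(allowedByPrecedence(DateType, StringType))
    println(allowedByStoreAssignment(DateType, StringType))
  }
}
```

In this toy model the timestamp-to-date and date-to-string arguments are rejected by the precedence check and accepted by the store-assignment check, which mirrors the kind of queries listed above. A string argument for an integer parameter is rejected by both functions in the sketch.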
Does this PR introduce any user-facing change?
Yes. Store assignment rules are now used for resolving function invocation under ANSI mode.
How was this patch tested?
Unit tests