-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-33469][SQL] Add current_timezone function #30400
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Kubernetes integration test starting |
Test build #131229 has finished for PR 30400 at commit
|
Kubernetes integration test status success |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @ulysses-you . Is the function name current_timezone
popular in the other DBMSs?
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
Show resolved
Hide resolved
@dongjoon-hyun yes, presto has this function. |
Kubernetes integration test starting |
Kubernetes integration test status failure |
Test build #131247 has finished for PR 30400 at commit
|
Retest this please |
@@ -73,6 +73,21 @@ trait TimestampFormatterHelper extends TimeZoneAwareExpression { | |||
} | |||
} | |||
|
|||
@ExpressionDescription( | |||
usage = "_FUNC_() - Returns the current timezone.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to say that it's a session local timezone explicitly?
test("SPARK-33469: Add current_timezone function") { | ||
val df = Seq(1).toDF("c") | ||
val timezone = df.selectExpr("current_timezone()").collect().head.getString(0) | ||
assert(timezone == SQLConf.get.sessionLocalTimeZone) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this test, is it possible to set sessionLocalTimeZone
explicitly and check with it?
@@ -150,6 +150,7 @@ class ExpressionInfoSuite extends SparkFunSuite with SharedSparkSession { | |||
"org.apache.spark.sql.catalyst.expressions.CurrentDate", | |||
"org.apache.spark.sql.catalyst.expressions.CurrentTimestamp", | |||
"org.apache.spark.sql.catalyst.expressions.Now", | |||
"org.apache.spark.sql.catalyst.expressions.CurrentTimeZone", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we add this after CurrentTimestamp
like the following?
"org.apache.spark.sql.catalyst.expressions.CurrentTimestamp",
"org.apache.spark.sql.catalyst.expressions.CurrentTimeZone",
"org.apache.spark.sql.catalyst.expressions.Now",
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @cloud-fan and @MaxGekk
Kubernetes integration test starting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I worry about the motivations for adding the new function to public API.
Let user get current timezone easily.
At the moment, users can get the session time zone from the SQL config spark.sql.session.timeZone
. It doesn't seem to be so difficult, doesn't it?
Presto, SQL Server ...
As far as I know Spark doesn't declare to be compatible to those systems.
What is the real use case when the function is preferable over reading the SQL config?
Good point, @MaxGekk . Could you give us your use cases, @ulysses-you ? IMO, in the Spark SQL world (Spark Thrift Server or Spark SQL Shell), the following is the only way.
In other words, we cannot use it in Spark SQL DML statements. I guess you have some use cases for that. |
Kubernetes integration test status failure |
Test build #131435 has finished for PR 30400 at commit
|
@dongjoon-hyun @MaxGekk The difference between config and function, I think the later can calculate with other data. That says we can do some compare or save the data into storage engine. We introduce many date-related function in recent version, then the timezone is important.
ah, actually I have not seen some case with it and just consider we may need it sometime. We can also show the offset as @maropu said that make it different with |
Kubernetes integration test starting |
Kubernetes integration test status failure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please let me and @ulysses-you know your final opinion, @maropu and @MaxGekk .
Although this looks good enough for me, I'll leave this PR open for your opinions. Please feel free to comments any final opinions.
Also, cc @cloud-fan .
Test build #131456 has finished for PR 30400 at commit
|
Could you rebase this, @ulysses-you ? There was a bug fix for GitHub Action Hive test failure in |
Since this feature might be useful for SQL users, adding it looks fine to me. |
Kubernetes integration test starting |
Kubernetes integration test status success |
Test build #131496 has finished for PR 30400 at commit
|
Thank you, @ulysses-you and @maropu and @MaxGekk . |
thanks for merging ! |
What changes were proposed in this pull request?
Add a
CurrentTimeZone
function and replace the value atOptimizer
side.Why are the changes needed?
Let user get current timezone easily. Then user can call
Presto: https://prestodb.io/docs/current/functions/datetime.html
SQL Server: https://docs.microsoft.com/en-us/sql/t-sql/functions/current-timezone-transact-sql?view=sql-server-ver15
Does this PR introduce any user-facing change?
Yes, a new function.
How was this patch tested?
Add test.