[SPARK-16078] [SQL] from_utc_timestamp/to_utc_timestamp should not depends on local timezone #13784
Test build #60869 has finished for PR 13784 at commit
cc @hvanhovell
val tzClass = classOf[TimeZone].getName
ctx.addMutableState(tzClass, tzTerm, s"""$tzTerm = $tzClass.getTimeZone("$tz");""")
ctx.addMutableState(tzClass, utcTerm, s"""$utcTerm = $tzClass.getTimeZone("GMT");""")
UTC? Coordinated Universal Time and Greenwich Mean Time are in practice the same (GMT is a timezone, UTC is not), but let's use one for consistency.
Test build #60888 has finished for PR 13784 at commit
val tz = TimeZone.getTimeZone(timeZone)
val offset = tz.getOffset(time / 1000L)
time + offset * 1000L
convertTz(time, TimeZoneGMT, TimeZone.getTimeZone(timeZone))
For fromUTCTime, this would result in a little bit of overhead.
Let's try to make it correct first. More optimizations are always possible.
Test build #60890 has finished for PR 13784 at commit
LGTM - thanks! Merging to master/2.0
…ends on local timezone ## What changes were proposed in this pull request? Currently, we use the local timezone to parse or format a timestamp (TimestampType), then store it as a Long holding the microseconds since the UTC epoch. from_utc_timestamp() and to_utc_timestamp() did not take the local timezone into account, so they could return different results under different local timezones. This PR does the conversion based on human time (in the local timezone), so it should return the same result in any timezone. However, because the mapping from absolute timestamps to human time is not exactly one-to-one, it can still return a wrong result in some timezones (and also at the beginning or end of DST). This PR is a best-effort fix. In the long term, we should make TimestampType timezone-aware to fix this completely. ## How was this patch tested? Tested these functions in all timezones. Author: Davies Liu <davies@databricks.com> Closes #13784 from davies/convert_tz. (cherry picked from commit 20d411b) Signed-off-by: Herman van Hovell <hvanhovell@databricks.com>
…ld not depends on local timezone ## What changes were proposed in this pull request? Back-port of #13784 to `branch-1.6` ## How was this patch tested? Existing tests. Author: Davies Liu <davies@databricks.com> Closes #15554 from srowen/SPARK-16078.
…ld not depends on local timezone ## What changes were proposed in this pull request? Back-port of apache#13784 to `branch-1.6` ## How was this patch tested? Existing tests. Author: Davies Liu <davies@databricks.com> Closes apache#15554 from srowen/SPARK-16078. (cherry picked from commit 82e98f1)
What changes were proposed in this pull request?
Currently, we use the local timezone to parse or format a timestamp (TimestampType), then store it as a Long holding the microseconds since the UTC epoch.
from_utc_timestamp() and to_utc_timestamp() did not take the local timezone into account, so they could return different results under different local timezones.
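To make the dependence concrete, here is a minimal sketch (not Spark code) that applies the old offset arithmetic to the same human-readable input parsed under two different local timezones. `toMicros`, `toHuman`, and `show` are hypothetical helpers that model "parse and format in the local timezone" with `java.time`; only the body of `naiveFromUtc` mirrors the offset arithmetic that this PR replaces.

```scala
import java.time.{Instant, LocalDateTime, ZoneId}
import java.time.temporal.ChronoUnit
import java.util.TimeZone

object LocalZoneDependence {
  // Hypothetical: parse a wall-clock time in the local zone into microseconds since the UTC epoch.
  def toMicros(human: LocalDateTime, local: ZoneId): Long =
    ChronoUnit.MICROS.between(Instant.EPOCH, human.atZone(local).toInstant)

  // Hypothetical: format microseconds since the UTC epoch as wall-clock time in the local zone.
  def toHuman(micros: Long, local: ZoneId): LocalDateTime =
    LocalDateTime.ofInstant(Instant.EPOCH.plus(micros, ChronoUnit.MICROS), local)

  // Old approach: look up the target zone's offset at the stored instant and add it.
  def naiveFromUtc(time: Long, timeZone: String): Long = {
    val tz = TimeZone.getTimeZone(timeZone)
    val offset = tz.getOffset(time / 1000L) // milliseconds
    time + offset * 1000L
  }

  // What a session running in `local` would display for from_utc_timestamp(input, target).
  def show(input: LocalDateTime, target: String, local: ZoneId): LocalDateTime =
    toHuman(naiveFromUtc(toMicros(input, local), target), local)
}
```

With the input `2016-03-13 10:30` and the target `America/Los_Angeles`, a session in UTC displays `2016-03-13 03:30`, while a session in `Asia/Tokyo` displays `2016-03-13 02:30`: the stored instant differs by nine hours, so the offset lookup lands on opposite sides of the DST switch.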
This PR does the conversion based on human time (in the local timezone), so it should return the same result in any timezone. However, because the mapping from absolute timestamps to human time is not exactly one-to-one, it can still return a wrong result in some timezones (and also at the beginning or end of DST).
This PR is a best-effort fix. In the long term, we should make TimestampType timezone-aware to fix this completely.
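The idea of converting via human time can be sketched roughly as follows. This is an illustration built on `java.time`, not the code in this PR: `convertTzSketch`, `parseLocal`, and `formatLocal` are hypothetical names, and the local timezone is passed explicitly as an assumption so that the example does not depend on the JVM default.

```scala
import java.time.{Instant, LocalDateTime, ZoneId}
import java.time.temporal.ChronoUnit

object HumanTimeSketch {
  // Hypothetical: wall-clock time in the local zone -> microseconds since the UTC epoch.
  def parseLocal(human: LocalDateTime, local: ZoneId): Long =
    ChronoUnit.MICROS.between(Instant.EPOCH, human.atZone(local).toInstant)

  // Hypothetical: microseconds since the UTC epoch -> wall-clock time in the local zone.
  def formatLocal(micros: Long, local: ZoneId): LocalDateTime =
    LocalDateTime.ofInstant(Instant.EPOCH.plus(micros, ChronoUnit.MICROS), local)

  def convertTzSketch(micros: Long, from: ZoneId, to: ZoneId, local: ZoneId): Long = {
    // 1. Recover the human time that the local session displays.
    val human = formatLocal(micros, local)
    // 2. Reinterpret that human time as a wall-clock reading in `from`.
    val actual = human.atZone(from).toInstant
    // 3. Read the same instant on a wall clock in `to`.
    val target = LocalDateTime.ofInstant(actual, to)
    // 4. Re-encode so that the local session displays `target`.
    //    If `target` falls into a DST gap of `local`, atZone shifts it forward,
    //    which is one way the result can still be wrong.
    parseLocal(target, local)
  }
}
```

Under this sketch, `2016-06-20 12:00` converted from UTC to `Asia/Tokyo` displays as `2016-06-20 21:00` whether the local timezone is UTC, `America/Los_Angeles`, or `Europe/Berlin`. The remaining failure mode shows up with a local timezone of `America/Los_Angeles` and the input `2016-03-12 17:30`: the expected result `2016-03-13 02:30` does not exist on a Los Angeles wall clock, so it displays as `03:30`.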
How was this patch tested?
Tested these functions in all timezones.
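A check of this kind could look roughly like the sketch below, which is not the test added in this PR. It runs one conversion under every zone ID known to `java.time` and collects the IDs where the displayed result differs from the expected one. `fromUtcSketch` is a hypothetical stand-in for the function under test, and the local timezone is passed explicitly instead of being set as the JVM default.

```scala
import java.time.{Instant, LocalDateTime, ZoneId, ZoneOffset}
import java.time.temporal.ChronoUnit

object AllZonesCheck {
  def parseLocal(human: LocalDateTime, local: ZoneId): Long =
    ChronoUnit.MICROS.between(Instant.EPOCH, human.atZone(local).toInstant)

  def formatLocal(micros: Long, local: ZoneId): LocalDateTime =
    LocalDateTime.ofInstant(Instant.EPOCH.plus(micros, ChronoUnit.MICROS), local)

  // Hypothetical stand-in: treat the displayed time as UTC and show it in `to`.
  def fromUtcSketch(micros: Long, to: ZoneId, local: ZoneId): Long = {
    val asUtc = formatLocal(micros, local).atZone(ZoneOffset.UTC).toInstant
    parseLocal(LocalDateTime.ofInstant(asUtc, to), local)
  }

  // IDs of local zones under which the displayed result is not `expected`.
  def mismatches(input: LocalDateTime, to: ZoneId, expected: LocalDateTime): Seq[String] = {
    val bad = Seq.newBuilder[String]
    val ids = ZoneId.getAvailableZoneIds().iterator()
    while (ids.hasNext) {
      val id = ids.next()
      val local = ZoneId.of(id)
      val shown = formatLocal(fromUtcSketch(parseLocal(input, local), to, local), local)
      if (shown != expected) bad += id
    }
    bad.result()
  }
}
```

Returning the mismatching IDs rather than failing on the first one makes it easy to see which local timezones hit a DST boundary for a given input.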