-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-31874][SQL] Use FastDateFormat as the legacy fractional formatter
#28678
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-31874][SQL] Use FastDateFormat as the legacy fractional formatter
#28678
Conversation
|
Test build #123317 has finished for PR 28678 at commit
|
FastDateFormat as the legacy fractional formatterFastDateFormat as the legacy fractional formatter
|
@cloud-fan @HyukjinKwon @juliuszsompolski Please, review this PR. |
FastDateFormat as the legacy fractional formatterFastDateFormat as the legacy fractional formatter
FastDateFormat as the legacy fractional formatterFastDateFormat as the legacy fractional formatter
| TimestampFormatter.defaultPattern, | ||
| zoneId, | ||
| TimestampFormatter.defaultLocale, | ||
| LegacyDateFormats.FAST_DATE_FORMAT, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it the same with Spark 2.4?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spark 2.4 uses SimpleDateFormat but output of FastDateFormat and SimpleDateFormat for the concrete pattern yyyy-MM-dd HH:mm:ss is the same. At least, all our tests show no difference.
|
thanks, merging to master/3.0! |
…atter ### What changes were proposed in this pull request? 1. Replace `SimpleDateFormat` by `FastDateFormat` as the legacy formatter of `FractionTimestampFormatter`. 2. Optimise `LegacyFastTimestampFormatter` for `java.sql.Timestamp` w/o fractional part. ### Why are the changes needed? 1. By default `HiveResult`.`hiveResultString` retrieves timestamps values as instances of `java.sql.Timestamp`, and uses the legacy parser `SimpleDateFormat` to convert the timestamps to strings. After the fix #28024, the fractional formatter and its companion - legacy formatter `SimpleDateFormat` are created per every value. By switching from `LegacySimpleTimestampFormatter` to `LegacyFastTimestampFormatter`, we can utilize the internal cache of `FastDateFormat`, and avoid parsing the default pattern `yyyy-MM-dd HH:mm:ss`. 2. The second change in the method `def format(ts: Timestamp): String` of `LegacyFastTimestampFormatter` is needed to optimize the formatter for patterns without the fractional part and avoid conversions to microseconds. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By existing tests in `TimestampFormatter`. Closes #28678 from MaxGekk/fastdateformat-as-legacy-frac-formatter. Authored-by: Max Gekk <max.gekk@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit 47dc332) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
|
+1 |
1 similar comment
|
+1 |
What changes were proposed in this pull request?
SimpleDateFormatbyFastDateFormatas the legacy formatter ofFractionTimestampFormatter.LegacyFastTimestampFormatterforjava.sql.Timestampw/o fractional part.Why are the changes needed?
HiveResult.hiveResultStringretrieves timestamps values as instances ofjava.sql.Timestamp, and uses the legacy parserSimpleDateFormatto convert the timestamps to strings. After the fix [SPARK-31254][SQL] Use the current session time zone inHiveResult.toHiveString#28024, the fractional formatter and its companion - legacy formatterSimpleDateFormatare created per every value. By switching fromLegacySimpleTimestampFormattertoLegacyFastTimestampFormatter, we can utilize the internal cache ofFastDateFormat, and avoid parsing the default patternyyyy-MM-dd HH:mm:ss.def format(ts: Timestamp): StringofLegacyFastTimestampFormatteris needed to optimize the formatter for patterns without the fractional part and avoid conversions to microseconds.Does this PR introduce any user-facing change?
No
How was this patch tested?
By existing tests in
TimestampFormatter.