-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-26503][CORE] Get rid of spark.sql.legacy.timeParser.enabled #23495
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Test build #100949 has finished for PR 23495 at commit
|
retest this please |
Test build #100955 has finished for PR 23495 at commit
|
@@ -1452,105 +1452,103 @@ class JsonSuite extends QueryTest with SharedSQLContext with TestJsonData { | |||
} | |||
|
|||
test("backward compatibility") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I remember this test requires https://github.com/apache/spark/pull/23495/files#diff-7f589e01d3e5e5ea284c1622527d4984L85 . I don't think it is possible to pass the test on new parser.
Test build #100972 has finished for PR 23495 at commit
|
Merged to master |
…rser.enabled ## What changes were proposed in this pull request? The SQL config `spark.sql.legacy.timeParser.enabled` was removed by #23495. The PR cleans up the SQL migration guide and the comment for `UnixTimestamp`. Closes #23529 from MaxGekk/get-rid-off-legacy-parser-followup. Authored-by: Maxim Gekk <max.gekk@gmail.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
## What changes were proposed in this pull request? Per discussion in apache#23391 (comment) this proposes to just remove the old pre-Spark-3 time parsing behavior. This is a rebase of apache#23411 ## How was this patch tested? Existing tests. Closes apache#23495 from srowen/SPARK-26503.2. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>
…rser.enabled ## What changes were proposed in this pull request? The SQL config `spark.sql.legacy.timeParser.enabled` was removed by apache#23495. The PR cleans up the SQL migration guide and the comment for `UnixTimestamp`. Closes apache#23529 from MaxGekk/get-rid-off-legacy-parser-followup. Authored-by: Maxim Gekk <max.gekk@gmail.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
// in the JSON object. | ||
// - For Spark before 1.5.1, we do not generate UDTs. So, we manually added the UDT value to | ||
// JSON objects generated by those Spark versions (col17). | ||
// - If the type is NullType, we do not write data out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason we removed this test case? This test case sounds very critical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was added in the PR #8806
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have to capture all such changes before the final release of Spark 3.0. We can delay the release but we have to capture all such changes. Please let us know if you are aware of any similar change we made in the upcoming 3.0 release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to add more such test cases for ensuring we do not break the backward compatibility for all the built-in data sources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test is gone because the old behavior is gone; that's all that's going on here.
See the OP with a link to the actual change. The key discussions were:
#23391 (comment)
#23391 (comment)
I think the TL;DR is that the legacy behavior is error-prone and already susceptible to getting wrong answers for old dates. That seems worth 'fixing' despite the forced behavior change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the latest discussions in #27710 (comment), we can't silently return the wrong results. The backward compatibility is very critical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I defer to @MaxGekk and @cloud-fan on that thread. It's trading one set of problems for another but it could be the right thing. We will never get rid of the legacy behavior now, I'm pretty sure :)
What changes were proposed in this pull request?
Per discussion in #23391 (comment) this proposes to just remove the old pre-Spark-3 time parsing behavior.
This is a rebase of #23411
How was this patch tested?
Existing tests.