-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-29494][SQL] Fix for ArrayOutofBoundsException while converting string to timestamp #26143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cc @zsxwing |
@@ -456,6 +456,12 @@ class DateTimeUtilsSuite extends SparkFunSuite with Matchers { | |||
} | |||
} | |||
|
|||
test("trailing characters while converting string to timestamp") { | |||
val s = UTF8String.fromString("2019-10-31T10:59:23Z:::") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You described 2 cases in the PR discription ':'
and ' '
. This test checks the first one only. Could you add at least one more check for ' '
.
ok to test |
Actually, trailing spaces shouldn't cause any errors because spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala Line 221 in d11cbf2
|
} | ||
|
||
test("trailing space while converting string to timestamp") { | ||
val s = UTF8String.fromString("2019-10-31T10:59:23ZZ ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stringToTimestamp
returns None becase of invalid time zone ZZ
at the end. If you replace the string "2019-10-31T10:59:23ZZ "
by "2019-10-31T10:59:23Z "
, the functions shouldn't return None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right this works accurately for trailing spaces, I first enounctered this issue on an older release branch and it did not have the .trim
. But the fix should work for trailing spaces as well if backported.
Test build #112189 has finished for PR 26143 at commit
|
7f764d0
to
17edc79
Compare
Test build #112185 has finished for PR 26143 at commit
|
Test build #112186 has finished for PR 26143 at commit
|
Test build #112191 has finished for PR 26143 at commit
|
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
Outdated
Show resolved
Hide resolved
…/DateTimeUtils.scala Co-Authored-By: Maxim Gekk <max.gekk@gmail.com>
Test build #112291 has finished for PR 26143 at commit
|
… string to timestamp ### What changes were proposed in this pull request? * Adding an additional check in `stringToTimestamp` to handle cases where the input has trailing ':' * Added a test to make sure this works. ### Why are the changes needed? In a couple of scenarios while converting from String to Timestamp `DateTimeUtils.stringToTimestamp` throws an array out of bounds exception if there is trailing ':'. The behavior of this method requires it to return `None` in case the format of the string is incorrect. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Added a test in the `DateTimeTestUtils` suite to test if my fix works. Closes #26143 from rahulsmahadev/SPARK-29494. Lead-authored-by: Rahul Mahadev <rahul.mahadev@databricks.com> Co-authored-by: Rahul Shivu Mahadev <51690557+rahulsmahadev@users.noreply.github.com> Signed-off-by: Sean Owen <sean.owen@databricks.com> (cherry picked from commit 4cfce3e) Signed-off-by: Sean Owen <sean.owen@databricks.com>
Merged to master/2.4 |
@rahulsmahadev this broke 2.4. I just reverted it from branch-2.4. Could you submit a new PR against branch-2.4? Thanks! |
Shoot, thanks for catching that @zsxwing . Looks like some test utility code isn't present in 2.4:
|
What changes were proposed in this pull request?
stringToTimestamp
to handle cases where the input has trailing ':'Why are the changes needed?
In a couple of scenarios while converting from String to Timestamp
DateTimeUtils.stringToTimestamp
throws an array out of bounds exception if there is trailing ':'. The behavior of this method requires it to returnNone
in case the format of the string is incorrect.Does this PR introduce any user-facing change?
No
How was this patch tested?
Added a test in the
DateTimeTestUtils
suite to test if my fix works.