Skip to content

[SPARK-29494][SQL] Fix for ArrayOutofBoundsException while converting string to timestamp #26143

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

rahulsmahadev
Copy link
Contributor

@rahulsmahadev rahulsmahadev commented Oct 16, 2019

What changes were proposed in this pull request?

  • Adding an additional check in stringToTimestamp to handle cases where the input has trailing ':'
  • Added a test to make sure this works.

Why are the changes needed?

In a couple of scenarios while converting from String to Timestamp DateTimeUtils.stringToTimestamp throws an array out of bounds exception if there is trailing ':'. The behavior of this method requires it to return None in case the format of the string is incorrect.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added a test in the DateTimeTestUtils suite to test if my fix works.

@rahulsmahadev
Copy link
Contributor Author

cc @zsxwing

@@ -456,6 +456,12 @@ class DateTimeUtilsSuite extends SparkFunSuite with Matchers {
}
}

test("trailing characters while converting string to timestamp") {
val s = UTF8String.fromString("2019-10-31T10:59:23Z:::")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You described 2 cases in the PR discription ':' and ' '. This test checks the first one only. Could you add at least one more check for ' '.

@zsxwing
Copy link
Member

zsxwing commented Oct 16, 2019

ok to test

@MaxGekk
Copy link
Member

MaxGekk commented Oct 16, 2019

Actually, trailing spaces shouldn't cause any errors because stringToTimestamp trims its input

}

test("trailing space while converting string to timestamp") {
val s = UTF8String.fromString("2019-10-31T10:59:23ZZ ")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stringToTimestamp returns None becase of invalid time zone ZZ at the end. If you replace the string "2019-10-31T10:59:23ZZ " by "2019-10-31T10:59:23Z ", the functions shouldn't return None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right this works accurately for trailing spaces, I first enounctered this issue on an older release branch and it did not have the .trim . But the fix should work for trailing spaces as well if backported.

@SparkQA
Copy link

SparkQA commented Oct 16, 2019

Test build #112189 has finished for PR 26143 at commit 7f764d0.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Oct 16, 2019

Test build #112185 has finished for PR 26143 at commit 17edc79.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Oct 16, 2019

Test build #112186 has finished for PR 26143 at commit 421138f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Oct 17, 2019

Test build #112191 has finished for PR 26143 at commit 17edc79.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

…/DateTimeUtils.scala

Co-Authored-By: Maxim Gekk <max.gekk@gmail.com>
@SparkQA
Copy link

SparkQA commented Oct 18, 2019

Test build #112291 has finished for PR 26143 at commit 5ef53b7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen srowen closed this in 4cfce3e Oct 18, 2019
srowen pushed a commit that referenced this pull request Oct 18, 2019
… string to timestamp

### What changes were proposed in this pull request?
* Adding an additional check in `stringToTimestamp` to handle cases where the input has trailing ':'
* Added a test to make sure this works.

### Why are the changes needed?
In a couple of scenarios while converting from String to Timestamp `DateTimeUtils.stringToTimestamp` throws an array out of bounds exception if there is trailing  ':'. The behavior of this method requires it to return `None` in case the format of the string is incorrect.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Added a test in the `DateTimeTestUtils` suite to test if my fix works.

Closes #26143 from rahulsmahadev/SPARK-29494.

Lead-authored-by: Rahul Mahadev <rahul.mahadev@databricks.com>
Co-authored-by: Rahul Shivu Mahadev <51690557+rahulsmahadev@users.noreply.github.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
(cherry picked from commit 4cfce3e)
Signed-off-by: Sean Owen <sean.owen@databricks.com>
@srowen
Copy link
Member

srowen commented Oct 18, 2019

Merged to master/2.4

@zsxwing
Copy link
Member

zsxwing commented Oct 18, 2019

@rahulsmahadev this broke 2.4. I just reverted it from branch-2.4. Could you submit a new PR against branch-2.4? Thanks!

@srowen
Copy link
Member

srowen commented Oct 19, 2019

Shoot, thanks for catching that @zsxwing . Looks like some test utility code isn't present in 2.4:

[error] /home/jenkins/workspace/spark-branch-2.4-test-maven-hadoop-2.7/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala:586: not found: value defaultZoneId
[error]     val time = DateTimeUtils.stringToTimestamp(s, defaultZoneId)
[error]                                                   ^
[error] one error found
[error] Compile failed at Oct 18, 2019 2:55:40 PM [15.246s]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants