-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-33468][SQL] ParseUrl in ANSI mode should fail if input string is not a valid url #30399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
case e: URISyntaxException => null | ||
// We fail on error if in ansi mode. | ||
case e: URISyntaxException if SQLConf.get.ansiEnabled => | ||
throw new IllegalArgumentException(s"Find an invaild url string ${url.toString}", e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, the raw exception message is something like Illegal character in query at index 33: https://a.b.c/index.php?params1=a|b¶ms2=x
.
@@ -1404,7 +1404,10 @@ case class ParseUrl(children: Seq[Expression]) | |||
try { | |||
new URI(url.toString) | |||
} catch { | |||
case e: URISyntaxException => null | |||
// We fail on error if in ansi mode. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove this comment.
@@ -943,6 +943,21 @@ class StringExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper { | |||
GenerateUnsafeProjection.generate(ParseUrl(Seq(Literal("\"quote"), Literal("\"quote"))) :: Nil) | |||
} | |||
|
|||
test("SPARK-33468: ParseUrl should fail if input string is not a valid url") { | |||
// fail on error if in ansi mode. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove this.
@@ -943,6 +943,21 @@ class StringExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper { | |||
GenerateUnsafeProjection.generate(ParseUrl(Seq(Literal("\"quote"), Literal("\"quote"))) :: Nil) | |||
} | |||
|
|||
test("SPARK-33468: ParseUrl should fail if input string is not a valid url") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ParseUrl
-> ParseUrl in ANSI mode
?
Kubernetes integration test starting |
@dongjoon-hyun thanks for the review. Updated the comment. |
Kubernetes integration test status failure |
Kubernetes integration test starting |
Kubernetes integration test status success |
Test build #131226 has finished for PR 30399 at commit
|
cc @cloud-fan and @maropu |
Test build #131227 has finished for PR 30399 at commit
|
@@ -1404,7 +1404,9 @@ case class ParseUrl(children: Seq[Expression]) | |||
try { | |||
new URI(url.toString) | |||
} catch { | |||
case e: URISyntaxException => null | |||
case e: URISyntaxException if SQLConf.get.ansiEnabled => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the ANSI doc, too. https://github.com/apache/spark/blame/master/docs/sql-ref-ansi-compliance.md#L110
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed here, updated !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we follow other expressions and add a failOnError: Boolean
parameter? See #30297 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get it !
Looks fine otherwise. |
Kubernetes integration test starting |
Kubernetes integration test status success |
Test build #131245 has finished for PR 30399 at commit
|
docs/sql-ref-ansi-compliance.md
Outdated
@@ -114,6 +114,7 @@ The behavior of some SQL functions can be different under ANSI mode (`spark.sql. | |||
- `element_at`: This function throws `ArrayIndexOutOfBoundsException` if using invalid indices. | |||
- `element_at`: This function throws `NoSuchElementException` if key does not exist in map. | |||
- `elt`: This function throws `ArrayIndexOutOfBoundsException` if using invalid indices. | |||
- `parse_url`: This function throws `IllegalArgumentException` if input string is not a valid url. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: input string
-> an input string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry missed here. Updated !
Looks ok otherwise. |
Kubernetes integration test starting |
Kubernetes integration test status success |
Test build #131377 has finished for PR 30399 at commit
|
Kubernetes integration test starting |
Test build #131387 has finished for PR 30399 at commit
|
Kubernetes integration test status success |
Kubernetes integration test starting |
Kubernetes integration test status failure |
Test build #131391 has finished for PR 30399 at commit
|
thanks, merging to master! |
thanks for merging ! |
What changes were proposed in this pull request?
With
ParseUrl
, instead of return null we throw exception if input string is not a vaild url.Why are the changes needed?
For ANSI mode.
Does this PR introduce any user-facing change?
Yes, user will get exception if
set spark.sql.ansi.enabled=true
.How was this patch tested?
Add test.