[SPARK-48498][SQL][FOLLOWUP] do padding for char-char comparison #47412
cc @yaooqinn
sql(s"CREATE TABLE t2 (c1 CHAR(3), c2 CHAR(5)) USING $format LOCATION '$dir'")
// Comparing CHAR column with CHAR column compares the padded values.
checkAnswer(
  sql("SELECT c1 = c2 FROM t2"),
Can we test both c1 = c2 and c2 = c1?
  Seq(Row(true), Row(true), Row(true), Row(true))
)
checkAnswer(
  sql("SELECT c1 IN (c2) FROM t2"),
ditto
  (rawType, typeWithTargetCharLength) match {
-   case (CharType(len), CharType(target)) if target > len =>
+   case (CharType(len), CharType(target)) if alwaysPad || target > len =>
do we need padding if len = target?
We don't need to when the value is already padded; `target > len` is the existing logic. My change is to always pad, because the CHAR value may not be padded to its declared length.
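To make the reply above concrete, here is a minimal plain-Scala sketch of the guard in the diff. It does not use Spark: `rpad`, `pad`, and `charEquals` are hypothetical helpers, and taking the wider of the two declared lengths as the comparison target is an assumption for illustration, not something shown in this thread.

```scala
// Plain-Scala sketch; these helpers are hypothetical and are not Spark APIs.
object CharPadSketch {
  // Right-pad with spaces up to `len`; this helper never truncates.
  def rpad(s: String, len: Int): String = s.padTo(len, ' ')

  // Mirrors the guard in the diff: pad when forced, or when the target is wider.
  def pad(value: String, len: Int, target: Int, alwaysPad: Boolean): String =
    if (alwaysPad || target > len) rpad(value, target) else value

  // Compare a CHAR(len1) value with a CHAR(len2) value.
  // Assumption: the target length is the wider of the two declared lengths.
  def charEquals(v1: String, len1: Int, v2: String, len2: Int, alwaysPad: Boolean): Boolean = {
    val target = math.max(len1, len2)
    pad(v1, len1, target, alwaysPad) == pad(v2, len2, target, alwaysPad)
  }

  def main(args: Array[String]): Unit = {
    // "a" stored unpadded in both a CHAR(3) and a CHAR(5) column.
    println(charEquals("a", 3, "a", 5, alwaysPad = false)) // false
    println(charEquals("a", 3, "a", 5, alwaysPad = true))  // true
  }
}
```

With `alwaysPad = false`, only the CHAR(3) side is padded to five characters, so an unpadded CHAR(5) value fails to match. With `alwaysPad = true`, both sides are padded and the comparison succeeds.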
  (rawType, typeWithTargetCharLength) match {
-   case (CharType(len), CharType(target)) if target > len =>
+   case (CharType(len), CharType(target)) if alwaysPad || target > len =>
      Some(StringRPad(expr, Literal(target)))
Shouldn't this be the bigger of `len` and `target` here?
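The concern behind this question can be sketched in plain Scala. SQL `rpad` shortens its input when the input is longer than the requested length, so padding to a `target` smaller than the declared length would drop characters. The names `sqlRpad` and `padToWider` below are hypothetical, and whether `target` can actually be smaller than `len` at this call site is an assumption, not something the thread confirms.

```scala
// Plain-Scala sketch; these helpers are hypothetical and are not Spark APIs.
object RPadTargetSketch {
  // Models SQL rpad with a space pad: pads short input, shortens long input.
  def sqlRpad(s: String, len: Int): String =
    if (s.length >= len) s.take(len) else s.padTo(len, ' ')

  // One possible answer to the question: pad to the larger of the two lengths,
  // so a fully populated value is never shortened.
  def padToWider(value: String, declaredLen: Int, target: Int): String =
    sqlRpad(value, math.max(declaredLen, target))

  def main(args: Array[String]): Unit = {
    // A CHAR(5) value padded to a smaller target of 3 loses two characters.
    println("[" + sqlRpad("abcde", 3) + "]")       // [abc]
    // Using the larger length keeps the value intact.
    println("[" + padToWider("abcde", 5, 3) + "]") // [abcde]
  }
}
```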
thanks for review, merging to master/3.5!
### What changes were proposed in this pull request?
This is a followup of #46832 to handle a missing case: char-char comparison. We should pad both sides if `READ_SIDE_CHAR_PADDING` is not enabled.

### Why are the changes needed?
bug fix if people disable read side char padding

### Does this PR introduce _any_ user-facing change?
No because it's a followup and the original PR is not released yet

### How was this patch tested?
new tests

### Was this patch authored or co-authored using generative AI tooling?
no

Closes #47412 from cloud-fan/char.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
This is a followup of #46832 to handle a missing case: char-char comparison. We should pad both sides if `READ_SIDE_CHAR_PADDING` is not enabled.
Why are the changes needed?
bug fix if people disable read side char padding
Does this PR introduce any user-facing change?
No because it's a followup and the original PR is not released yet
How was this patch tested?
new tests
Was this patch authored or co-authored using generative AI tooling?
no