-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix datatype of case expression #5734
Conversation
@@ -68,7 +69,26 @@ impl ExprSchemable for Expr { | |||
Expr::OuterReferenceColumn(ty, _) => Ok(ty.clone()), | |||
Expr::ScalarVariable(ty, _) => Ok(ty.clone()), | |||
Expr::Literal(l) => Ok(l.get_datatype()), | |||
Expr::Case(case) => case.when_then_expr[0].1.get_type(schema), | |||
Expr::Case(case) => { | |||
// when #5681 will be fixed, this code can be reverted to: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about moving TypeCoersion to analyzer
to fix #5733 instead of making a temporary workaround?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that it's a duck-taping.
I'm afraid the mentioned issue won't be resolved quickly by cut'n'paste... (see #5681 (comment) )
So the question is - should we expose our users to the bug 🧐 until #5681 is fixed? (putting Datafusion's reputation at risk?)
What is more, this PR features sqlogictest ⭐️, so we won't have a regression caused by the fix #5681.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it would be best to have this code in Analyzer -- but if we can't get it in there quickly we can merge this PR as is.
Thank you for the test case @mslapek
Thank you for leaving the link to #5733 in the code
Do you know why the CREATE TABLE AS SELECT
does not have type coercion (as an optimizer pass) applied? It seems like that may be an oversight (and maybe the root cause of this problem)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type coercion is implemented as an optimizer - and it changes schemas of particular Expr
essions.
And it takes care of Expr::Case
.
But Expr::Projection
caches the schema in itself - so it is not updated! Therefore we get a mismatch between declared schema vs batch schema.
Essentially this PR makes Expr::Case
to not change a schema during the coercion optimizer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FROM aggregate_test_100_by_sql | ||
|
||
|
||
query III |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slightly concerned if this test is actually needed, the PR is to fix schema mismatch on case statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test involves CASE statement, and it was crashing before the fix (the CREATE TABLE one).
The file itself might be useful for other null-related bugfixes.
Nevertheless there could exist better ways to test this... Suggestions welcome!
Sorry for the delay -- what I have been (secretly) hoping is that we can move the type coercion into an analyzer pass and avoid the need for this workaround. |
Here is what I think we should do to move this PR forward: I will file a ticket to remove this code from here (and into the analysis phase), and then we'll merge it in and we can do the cleanup with the analysis phase as a follow on |
FYI I tried the test case with the fix from @Jefffrey in #5820 and sadly it still fails I am merging this PR even though the code is non ideal -- my rationale is that it fixes a real bug, and has test coverage (with postgres verification). I have filed #5821 to fix it for real and will add that as a comment as a follow on PR |
Which issue does this PR close?
Fixes #5733
Rationale for this change
Fixes the issue.
What changes are included in this PR?
Make
Expr::schema
return coercedDataType
forExpr::Case
.Are these changes tested?
Added
pg_compat_null.slt
reproducing the issue.Are there any user-facing changes?
No, except the bug fix.