fix: Enhance case expression type coercion#5820
Conversation
| // ELSE b3 | ||
| // END | ||
| // | ||
| // Then all aN (a1, a2, a3) must be converted to a common data type in the first example |
| // prepare types | ||
| let case_type = match &case.expr { | ||
| None => Ok(None), | ||
| Some(expr) => expr.get_type(&self.schema).map(Some), | ||
| }?; |
There was a problem hiding this comment.
You can write this more concisely / functionally using transpose like this if you want
| // prepare types | |
| let case_type = match &case.expr { | |
| None => Ok(None), | |
| Some(expr) => expr.get_type(&self.schema).map(Some), | |
| }?; | |
| // prepare types | |
| let case_type = case.expr.as_ref() | |
| .map(|expr| expr.get_type(&self.schema)) | |
| .transpose()?; |
The same basic transformation applies to the other match statements below -- I don't think it is a big deal I just figured I would point it out to you
There was a problem hiding this comment.
Thanks that does look cleaner, will apply it
| # select case when type coercion | ||
| query I | ||
| select CASE 10.5 WHEN 0 THEN 1 ELSE 10 END as col; | ||
| ---- | ||
| 10 |
There was a problem hiding this comment.
Could you add some additional tests?
- Alternate Syntax:
select CASE
WHEN 10 = 5 THEN 1
WHEN 'true' THEN 1
ELSE 10
END as col;
Also negative cases like
select CASE 10.5 WHEN 'not a number' THEN 1 ELSE 10 END as col;select CASE
WHEN 10 = 5 THEN 1
WHEN 'not boolean' THEN 1
ELSE 10
END as col;select CASE
WHEN 10 = 5 THEN 1
WHEN 5 THEN 'not a number'
ELSE 10
END as col;select CASE
WHEN 10 = 5 THEN 1
WHEN 5 THEN 0
ELSE 'goo'
END as col;There was a problem hiding this comment.
Is there a way in sqllogictests to change the config of DataFusion? Since for negative cases to produce error, would need to set datafusion.optimizer.skip_failed_rules to false, unless you mean for the negative case to fail as in the original issue?
Otherwise can add the negative case as a unit test in actual code
There was a problem hiding this comment.
Yes, you can use the SET command to change config values. For example https://github.com/apache/arrow-datafusion/blob/7b90474adf6c1afbb4a5988a5677834b07347321/datafusion/core/tests/sqllogictests/test_files/information_schema.slt#L30
There was a problem hiding this comment.
Testing with sqllogictests turned out to be kinda finicky, as this PR is meant to enhance the type coercion, but not actually testing the actual casting itself (since some of those negative cases would pass the type coercion but fail at execution time due to actual contents of the data). So I figured it'd be easier to add more tests as unit tests for proper testing.
|
I've found another bug related to case expression casting, will raise issue and also fix as part of this PR since will be affecting similar code Edit: the bug #5828 Edit2: fix added |
alamb
left a comment
There was a problem hiding this comment.
I just think we need a few more tests and this PR will be ready to go
| let when_value = as_boolean_array(&when_value) | ||
| .expect("WHEN expression did not return a BooleanArray"); | ||
| let when_value = as_boolean_array(&when_value).map_err(|e| { | ||
| DataFusionError::Context( |
There was a problem hiding this comment.
Yeah figured better to wrap in Context to hopefully provide more informative error message
| .map(|expr| expr.get_type(&schema)) | ||
| .transpose()?; | ||
|
|
||
| // find common coercible types |
There was a problem hiding this comment.
I found this code really easy to read. Thank you @Jefffrey
Which issue does this PR close?
Closes #5538
Closes #5828
Rationale for this change
Given following:
Should attempt to coerce
a,bandcto common type.And for:
Should coerce
aandbto boolean type.What changes are included in this PR?
Ensure
caseandwhenexpressions are coerced to common type, similar to how currentlythenandelseexpressions are coerced to common type.Also ensure
whenexpressions for whencaseexpression isn't provided to coerce to boolean type.Are these changes tested?
Added to sqllogictest
Are there any user-facing changes?