feat: enable decimal to decimal cast of different precision and scale #1086
base: main
Conversation
@@ -895,6 +895,18 @@ class CometCastSuite extends CometTestBase with AdaptiveSparkPlanHelper {
    }
  }

  test("cast between decimals with different precision and scale") {
It's great to see that some basic tests now pass. I assume there must have been improvements in DataFusion since we started this project.
I'd like to see the tests cover more scenarios (a rough sketch follows the list), such as:
- Casting from a smaller scale to a larger scale, e.g. (10, 1) to (10, 4)
- Edge cases such as negative scale and zero scale
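For illustration, a minimal sketch of what such cases might look like in CometCastSuite, assuming the suite's existing withNulls and castTest helpers; the test names and input values are placeholders, not code from this PR:

```scala
// Hypothetical additions to CometCastSuite; withNulls and castTest are the
// suite's existing helpers, and the chosen values are illustrative only.
test("cast decimal to decimal with larger scale") {
  // (10, 1) -> (10, 4): the scale grows, so fewer integer digits fit
  val values = Seq(BigDecimal("12345.6"), BigDecimal("0.1"))
  castTest(withNulls(values).toDF("a"), DataTypes.createDecimalType(10, 4))
}

test("cast decimal to decimal with zero scale") {
  // rounding away the entire fractional part
  val values = Seq(BigDecimal("123.456"), BigDecimal("-0.5"))
  castTest(withNulls(values).toDF("a"), DataTypes.createDecimalType(10, 0))
}
```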
+1 for more scenarios
- edge case inputs like null
- negative scale may not be allowed in Spark IIRC
I'm adding more cases; negative scale throws an exception in Spark as well.
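For reference, a hypothetical illustration of that behavior (the exact message varies by Spark version, and the legacy flag spark.sql.legacy.allowNegativeScaleOfDecimal can relax it):

```scala
// Illustration only: Spark rejects a negative decimal scale at analysis
// time with an AnalysisException along the lines of
// "Negative scale is not allowed: -4", unless
// spark.sql.legacy.allowNegativeScaleOfDecimal is set to true.
spark.sql("select cast(1 as decimal(10,-4))").collect()
```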
// cast between default Decimal(38, 18) to Decimal(9,1)
val values = Seq(BigDecimal("12345.6789"), BigDecimal("9876.5432"), BigDecimal("123.4567"))
val df = withNulls(values).toDF("a")
castTest(df, DataTypes.createDecimalType(7, 2))
I think the original test case in the issue was casting to Decimal(6, 2) instead of Decimal(7, 2)?
This is the result of the test with (6, 2). On the right (the Spark answer), the converted (precision, scale) is (7, 2) for the last row, and the left side matches the value from the issue description. Should the left side produce 12345.68 instead of null?
== Results ==
!== Correct Answer - 4 == == Spark Answer - 4 ==
struct<a:decimal(38,18),a:decimal(6,2)> struct<a:decimal(38,18),a:decimal(6,2)>
[null,null] [null,null]
[123.456700000000000000,123.46] [123.456700000000000000,123.46]
[9876.543200000000000000,9876.54] [9876.543200000000000000,9876.54]
![12345.678900000000000000,null] [12345.678900000000000000,12345.68]
The output is confusing here, but the left side is Spark and the right side is Comet. Spark is producing null but Comet is producing 12345.68. We need to have Comet use the same logic as Spark here.
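For context, a small sketch of the Spark-side semantics being referred to: in non-ANSI mode, Spark rounds to the target scale and then checks whether the result still fits the target precision, producing null when it does not. The snippet below uses Spark's public Decimal API to show why 12345.6789 cannot fit Decimal(6, 2):

```scala
import org.apache.spark.sql.types.Decimal

// Decimal(6, 2) holds at most 9999.99. Rounding 12345.6789 to scale 2
// yields 12345.68, which needs precision 7, so the value overflows and
// Spark (in non-ANSI mode) produces null rather than the rounded value.
val d = Decimal(BigDecimal("12345.6789"))
val fits = d.changePrecision(6, 2) // returns false on overflow
assert(!fits)
```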
case (_: DecimalType, _: DecimalType) =>
  // https://github.com/apache/datafusion-comet/issues/375
  Incompatible()
case (from: DecimalType, to: DecimalType) =>
Higher-to-lower precision conversion in DataFusion changes the integer part, hence I marked it as Incompatible().
I will change the PR description to "part of #375" instead of closing it.
...
@@ -231,11 +231,9 @@ abstract class CometTestBase
      df: => DataFrame): (Option[Throwable], Option[Throwable]) = {
    var expected: Option[Throwable] = None
    withSQLConf(CometConf.COMET_ENABLED.key -> "false") {
      val dfSpark = Dataset.ofRows(spark, df.logicalPlan)
      expected = Try(dfSpark.collect()).failed.toOption
      expected = Try(Dataset.ofRows(spark, df.logicalPlan).collect()).failed.toOption
If the plan parsing encounters a problem like negative precision, this will catch that.
So the change here is just formatting: changing from 2 lines to 1?
Earlier only the df.collect() part was inside the Try; I also included df.logicalPlan, which captures any exception thrown during parsing. In this example, it was throwing a parsing error for:
select a, cast(a as DECIMAL(10,-4)) from t order by a
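A minimal sketch of the difference, assuming the by-name df parameter from the surrounding method and the same package-private access to Dataset.ofRows that CometTestBase already has:

```scala
import scala.util.Try
import org.apache.spark.sql.{DataFrame, Dataset}

// Before: only collect() was guarded, so a failure while evaluating the
// by-name `df` (i.e. while parsing/analyzing the query) escaped the Try.
// After: plan construction is inside the Try as well, so a query like
//   select a, cast(a as DECIMAL(10,-4)) from t order by a
// that fails during parsing is also captured as the expected failure.
def expectedFailure(df: => DataFrame): Option[Throwable] =
  Try(Dataset.ofRows(spark, df.logicalPlan).collect()).failed.toOption
```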
Which issue does this PR close?
Part of #375
Rationale for this change
What changes are included in this PR?
How are these changes tested?