
[SPARK-41207][SQL] Fix BinaryArithmetic with negative scale #38739


Closed
ulysses-you wants to merge 1 commit

Conversation

ulysses-you
Contributor

What changes were proposed in this pull request?

Make BinaryArithmetic result decimal type compatible with negative scale.

Why are the changes needed?

The BinaryArithmetic precision/scale adjustment does not respect negative scale. Some operators may return a decimal type whose precision is less than its scale.

This PR fixes three things (a sketch of the failing type computations follows the list):

  1. Make Divide work with negative scale; this is a long-standing bug. The error message before this fix:
-- 3.3
Caused by: java.lang.AssertionError: assertion failed
  at scala.Predef$.assert(Predef.scala:208)
  at org.apache.spark.sql.types.DecimalType$.adjustPrecisionScale(DecimalType.scala:183)
  at org.apache.spark.sql.catalyst.analysis.DecimalPrecision$$anonfun$decimalAndDecimal$1.applyOrElse(DecimalPrecision.scala:145)
  at org.apache.spark.sql.catalyst.analysis.DecimalPrecision$$anonfun$decimalAndDecimal$1.applyOrElse(DecimalPrecision.scala:94)
  at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:175)
  at org.apache.spark.sql.catalyst.analysis.TypeCoercionBase$CombinedTypeCoercionRule.$anonfun$transform$3(TypeCoercion.scala:178)
  at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
  at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
  at scala.collection.immutable.List.foldLeft(List.scala:91)
  at org.apache.spark.sql.catalyst.analysis.TypeCoercionBase$CombinedTypeCoercionRule.$anonfun$transform$2(TypeCoercion.scala:177)

-- 3.4
Caused by: java.lang.AssertionError: assertion failed
  at scala.Predef$.assert(Predef.scala:208)
  at org.apache.spark.sql.types.DecimalType$.adjustPrecisionScale(DecimalType.scala:184)
  at org.apache.spark.sql.catalyst.expressions.Divide.resultDecimalType(arithmetic.scala:802)
  at org.apache.spark.sql.catalyst.expressions.BinaryArithmetic.dataType(arithmetic.scala:238)
  at org.apache.spark.sql.catalyst.analysis.TypeCoercionBase$Division$$anonfun$3.applyOrElse(TypeCoercion.scala:515)
  at org.apache.spark.sql.catalyst.analysis.TypeCoercionBase$Division$$anonfun$3.applyOrElse(TypeCoercion.scala:509)
  at org.apache.spark.sql.catalyst.analysis.TypeCoercionBase$CombinedTypeCoercionRule.$anonfun$transform$3(TypeCoercion.scala:190)
  at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
  at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
  at scala.collection.immutable.List.foldLeft(List.scala:91)
  at org.apache.spark.sql.catalyst.analysis.TypeCoercionBase$CombinedTypeCoercionRule.$anonfun$transform$2(TypeCoercion.scala:189)
  2. Fix IntegralDivide, which no longer works in 3.4:
org.apache.spark.sql.AnalysisException: Decimal scale (0) cannot be greater than precision (-4).
  at org.apache.spark.sql.errors.QueryCompilationErrors$.decimalCannotGreaterThanPrecisionError(QueryCompilationErrors.scala:2237)
  at org.apache.spark.sql.types.DecimalType.<init>(DecimalType.scala:49)
  at org.apache.spark.sql.types.DecimalType$.bounded(DecimalType.scala:164)
  at org.apache.spark.sql.catalyst.expressions.IntegralDivide.resultDecimalType(arithmetic.scala:868)
  3. Correct the result for some operators that do not fail outright, e.g. Remainder when the right-side precision is bigger than 38.
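
To make this concrete, here is a small self-contained sketch (not taken from the PR) of the result-type formulas the stack traces above point at. The operand types DecimalType(2, -2) and DecimalType(2, -9) are made up for illustration; the Divide formulas are the documented division rules (precision = p1 - s1 + s2 + max(6, s1 + p2 + 1), scale = max(6, s1 + p2 + 1)), and the IntegralDivide line uses the same intDig = p1 - s1 + s2 computation quoted from DecimalPrecision later in this thread. With negative scales, the computed precision can fall below the scale (the kind of input the assertion in adjustPrecisionScale rejects) or even below zero (the DecimalType constructor check that IntegralDivide hits):

object NegativeScaleSketch extends App {
  // Hypothetical operand types: (precision, scale) of the left and right side.
  val (p1, s1) = (2, -2)
  val (p2, s2) = (2, -9)

  // Divide with precision loss allowed: scale = max(6, s1 + p2 + 1),
  // precision = p1 - s1 + s2 + scale.
  val divScale = math.max(6, s1 + p2 + 1)
  val divPrecision = p1 - s1 + s2 + divScale
  println(s"Divide -> DecimalType($divPrecision, $divScale)")      // precision 1 < scale 6

  // IntegralDivide in 3.4: DecimalType.bounded(p1 - s1 + s2, 0).
  val intDivPrecision = p1 - s1 + s2
  println(s"IntegralDivide -> DecimalType($intDivPrecision, 0)")   // negative precision (-5)
}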

Does this PR introduce any user-facing change?

Yes, this is a bug fix when spark.sql.legacy.allowNegativeScaleOfDecimal is enabled.

How was this patch tested?

Added tests.

@github-actions github-actions bot added the SQL label Nov 21, 2022
case (DecimalType.Fixed(p1, s1), DecimalType.Fixed(p2, s2)) =>
case (DecimalType.Fixed(_p1, _s1), DecimalType.Fixed(_p2, _s2)) =>
// compatible with negative scale
val (p1, s1, p2, s2) = if (_s1 < 0 && _s2 < 0) {
Contributor

nit: are there any other methods that use _xxx for local var names?

Contributor Author

Good question, changed the name.

@ulysses-you
Contributor Author

@@ -3532,6 +3532,49 @@ class DataFrameSuite extends QueryTest
}.isEmpty)
}
}

test("SPARK-41207: Fix BinaryArithmetic with negative scale") {
withSQLConf(SQLConf.LEGACY_ALLOW_NEGATIVE_SCALE_OF_DECIMAL_ENABLED.key -> "true") {
Contributor

I don't think we need to add new features for a legacy config, unless this is a regression. Can you clarify which commit caused the regression?

Contributor

There are external reasons not to support negative scale decimal values. ANSI SQL disallows it, which is why SPARK-30252 turned it off by default. Parquet does not support it either; incidentally, there appears to be a bug in ParquetTable where it says that negative scale is supported. I'll file an issue for it.

To me there are two choices: we either need to support it fully and start to work through all of the issues and corner cases to make it work all the time, or we need to deprecate it and remove it. It has been three years; I think we can move from legacy to deprecated. Having a "solution" with land mines hidden in it and only a config to protect you is not a good solution at all.

Contributor Author

@cloud-fan, so far it seems the only regression is IntegralDivide, which is affected by #36698. The other issues have existed for a long time. I think the root cause of these issues is the same, and it is fixed by this PR. I'm not sure this is a kind of feature; it is rather a correction of legacy issues.

Contributor

I'm not sure this fixes all issues. Officially supporting negative scale is really a hard problem. Do you really turn on this config in production?

Contributor

To be clear, we don't have it on in production. We just noticed it when trying to match the functionality changes after #36698 in the RAPIDS Accelerator for Apache Spark and filed SPARK-41207. Most of the changes were small and logical, like when the output precision would be larger than 38: Spark will now round after doing an add instead of adjusting the scale and rounding the LHS and RHS before doing the add. This is one place that is arguably a regression, and we wanted to be sure it was documented.

Putting on my Spark contributor hat now: negative scale support is off by default as a legacy feature, but not deprecated. I am of the opinion that we should not support something halfway. If add only worked for positive numbers, we would fix it right away. But from what I have seen, negative scale decimal is not widely used, and "Officially supporting negative scale is really a hard problem". As such, I would like to see us deprecate support for it and not fix bugs that come up.

If, on the other hand, there are enough contributors who want or need support for negative scale decimal, then they should come up with a plan to get it to a good place.

checkType(Multiply(a, b), DecimalType(5, -11))
checkType(Multiply(a, c), DecimalType(38, -9))
checkType(Multiply(b, c), DecimalType(37, 0))
checkType(Multiply(a, b), DecimalType(16, 0))
Contributor

If the only regression is for IntegralDivide, then why are there changes to the output type of Multiply? Even more so when this test was not modified as a part of #36698. It feels like you are making much wider changes than just fixing IntegralDivide for negative scale decimal, and I really would like to understand why that is.

Contributor Author

In this PR, I changed all BinaryArithmetic operators with negative scale. The test changes you see are long-standing issues that have existed since we started supporting negative scale; they are not related to #36698. I really cannot understand why decimal(38, -9) can appear in Spark; it means overflow.
The regression of IntegralDivide is caused by #36698. We can move the code into IntegralDivide if we want to narrow the change.

Contributor

Sorry about the long explanation. I am not sure how to make it any shorter.

By definition a decimal multiply adds the scales (see https://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html); there is no special case for negative scale vs. positive scale. It adds them.

scala> val lhs = BigDecimal("123E10")
lhs: scala.math.BigDecimal = 1.23E+12

scala> lhs.scale
res0: Int = -10

scala> val rhs = BigDecimal("9E1")
rhs: scala.math.BigDecimal = 9E+1

scala> rhs.scale
res1: Int = -1

scala> val ret = rhs * lhs
ret: scala.math.BigDecimal = 1.107E+14

scala> ret.scale
res2: Int = -11

By definition the resulting precision of a multiply can be at most lhs.precision + rhs.precision + 1. Again, there is no call-out for negative scale or positive scale. This is in the SQL standard and is also how decimal math works. The problem is that you are resetting the values to have a scale of 0, and effectively changing the precision of the result. This does not follow the standard rules for decimal operations. You are effectively increasing the precision of the value. You get an equivalent answer, but the cost of storing the result is now much greater, so much so that many results will not fit and will overflow when they could have fit if the result had a negative scale.
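
To put numbers on that: assuming the a and b in the test diff above are DecimalType(2, -9) and DecimalType(2, -2) (their declarations are not shown in this excerpt, so the types are inferred from the expected results), here is a quick sketch of the standard multiply rule versus what the new (16, 0) expectation implies (the PR's exact adjustment is not shown here, so the reset rule below is inferred):

// Hypothetical operand types inferred from the Multiply(a, b) expectations above.
val (p1, s1) = (2, -9)
val (p2, s2) = (2, -2)

// Standard decimal multiply rule: precision = p1 + p2 + 1, scale = s1 + s2.
println(s"standard rule:    DecimalType(${p1 + p2 + 1}, ${s1 + s2})")    // DecimalType(5, -11)

// Resetting each negative scale to 0 keeps only the integral digits (p - s) per side,
// so representing the same values exactly needs many more digits of precision.
println(s"scale reset to 0: DecimalType(${p1 - s1 + p2 - s2 + 1}, 0)")   // DecimalType(16, 0)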

I am -1 on this change as is. It is wrong to modify multiply in this way.

Divide is technically wrong in a number of ways with negative scale decimal values, even before 3.4.0. The scale of a divide is LHS.scale - RHS.scale, and this includes negative scale values. There is technically a great deal of scale loss when doing a decimal divide. Spark follows Hive and most other SQL implementations by having at least a scale of 6 in the output of the divide, plus some complicated math to calculate the output precision that goes with it. But that does not deal with negative values cleanly. It would be really nice to know what Hive does in these cases, or what MS SQL does, or really anything that supports negative scale. Do they all return errors? What type do they return if it is not an error? It appears that the SQL spec itself has the bug in it, and I am not inclined to "fix" the spec without some understanding of what others are doing too.
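
For completeness, here is the same kind of BigDecimal check as the multiply example above, but for divide (a sketch, not from the thread; the values are chosen so the quotient is exact). Per the java.math.BigDecimal javadoc, the preferred scale of a quotient is this.scale() - divisor.scale(), again with no special case for negative scales:

val lhs = BigDecimal("1.2E+12")   // unscaled value 12, scale -11
val rhs = BigDecimal("3E+1")      // unscaled value 3,  scale -1
val quot = lhs / rhs              // exact quotient, 4E+10
println(quot.scale)               // expected: -10, i.e. lhs.scale - rhs.scale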

The reason IntegralDivide is failing now, where it didn't before, is that the output result was never checked for overflow before being converted into a Long (IntegralDivide returns a Long). The overflow check in 3.4.0 that happens after the divide is completely skipped in previous versions because the output is a Long. Look at the rule in DecimalPrecision:

if (expr.dataType.isInstanceOf[DecimalType]) {
  // This follows division rule
  val intDig = p1 - s1 + s2
  // No precision loss can happen as the result scale is 0.
  // Overflow can happen only in the promote precision of the operands, but if none of them
  // overflows in that phase, no overflow can happen, but CheckOverflow is needed in order
  // to return a decimal with the proper scale and precision
  CheckOverflow(promotedExpr, DecimalType.bounded(intDig, 0), nullOnOverflow)
} else {
  promotedExpr
}

If the output type is not a decimal, do nothing. If you want IntegralDivide to act the same as before, you need to just remove the overflow check from it. I am not sure whether that is a good thing or not.

Contributor Author (ulysses-you) Nov 24, 2022

Not sure what you found. I think the reason why Divide and IntegralDivide fail is simple: the SQL standard does not allow negative scale, but we use its formula to calculate the result precision and scale, so the result precision can end up negative, which is unexpected. So I think the other binary arithmetic operators also should not follow the formula if the scale is negative.

Contributor

What you are saying is that you want to remove negative scale decimal values by turning them into 0-scale decimal values, because the SQL standard does not allow negative scale decimals, but only for binary math expressions. Why not everywhere? Why should abs still return a negative scale decimal? Why should cast allow us to cast to a negative scale decimal? How is that better than removing support for negative scale decimal?

Contributor Author

I think this behavior follows what spark.sql.legacy.allowNegativeScaleOfDecimal says:

val LEGACY_ALLOW_NEGATIVE_SCALE_OF_DECIMAL_ENABLED =
  buildConf("spark.sql.legacy.allowNegativeScaleOfDecimal")
    .internal()
    .doc("When set to true, negative scale of Decimal type is allowed. For example, " +
      "the type of number 1E10BD under legacy mode is DecimalType(2, -9), but is " +
      "Decimal(11, 0) in non legacy mode.")

As I mentioned, I'm OK with only fixing the regression in IntegralDivide by changing negative scale to 0, and leaving the others. The cost of changing the behavior of every expression that supports decimal types is huge.
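
For reference, the doc text can be checked from a spark-shell session (a sketch; it assumes a running spark session, and the expected types are simply the ones the config doc above states):

// Toggle the legacy conf at runtime and inspect the literal's type.
spark.conf.set("spark.sql.legacy.allowNegativeScaleOfDecimal", "true")
spark.sql("SELECT 1E10BD AS d").schema("d").dataType   // DecimalType(2,-9) per the doc above

spark.conf.set("spark.sql.legacy.allowNegativeScaleOfDecimal", "false")
spark.sql("SELECT 1E10BD AS d").schema("d").dataType   // DecimalType(11,0) per the doc above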

Contributor

I am fine with this going in just for IntegralDivide, and assuming that others are okay with it for regular Divide too. I think the others should be left as is, at least without more discussion from others.

Contributor Author

@cloud-fan what do you think? Only fix the regression and leave the others as is?

@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Mar 10, 2023
@github-actions github-actions bot closed this Mar 11, 2023