Commit d0990e3

cloud-fan authored and gatorsmile committed
[SPARK-25454][SQL] add a new config for picking minimum precision for integral literals
## What changes were proposed in this pull request?

#20023 proposed to allow precision loss during decimal operations, to reduce the possibility of overflow. That is a behavior change and is protected by the DECIMAL_OPERATIONS_ALLOW_PREC_LOSS config. However, that PR introduced another behavior change which is not protected by a config: picking a minimum precision for integral literals. This PR adds a new config for it: `spark.sql.legacy.literal.pickMinimumPrecision`. It allows users to work around the issue in SPARK-25454, which is caused by a long-standing bug in the handling of negative scale.

## How was this patch tested?

A new test.

Closes #22494 from cloud-fan/decimal.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
1 parent 51540c2 commit d0990e3
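
As a hedged illustration of the workaround described above (not part of the commit), disabling the flag in a session restores the pre-#20023 handling of integral literals. A minimal sketch, assuming a local Spark build that contains this commit:

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: assumes a Spark version that includes this commit's flag.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("SPARK-25454-workaround")
  .getOrCreate()

// With the flag off, integral literals keep their full default precision,
// avoiding the negative-scale path that misbehaves in SPARK-25454.
spark.conf.set("spark.sql.legacy.literal.pickMinimumPrecision", "false")
spark.sql("select 26393499451 / (1e6 * 1000)").show()
// Expected (per the new test below): 26.3934994510000
```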

3 files changed: +24 −4 lines changed

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala

Lines changed: 6 additions & 4 deletions
@@ -290,11 +290,13 @@ object DecimalPrecision extends TypeCoercionRule {
     // potentially losing 11 digits of the fractional part. Using only the precision needed
     // by the Literal, instead, the result would be DECIMAL(38 + 1 + 1, 18), which would
     // become DECIMAL(38, 16), safely having a much lower precision loss.
-    case (l: Literal, r) if r.dataType.isInstanceOf[DecimalType]
-      && l.dataType.isInstanceOf[IntegralType] =>
+    case (l: Literal, r) if r.dataType.isInstanceOf[DecimalType] &&
+        l.dataType.isInstanceOf[IntegralType] &&
+        SQLConf.get.literalPickMinimumPrecision =>
       b.makeCopy(Array(Cast(l, DecimalType.fromLiteral(l)), r))
-    case (l, r: Literal) if l.dataType.isInstanceOf[DecimalType]
-      && r.dataType.isInstanceOf[IntegralType] =>
+    case (l, r: Literal) if l.dataType.isInstanceOf[DecimalType] &&
+        r.dataType.isInstanceOf[IntegralType] &&
+        SQLConf.get.literalPickMinimumPrecision =>
       b.makeCopy(Array(l, Cast(r, DecimalType.fromLiteral(r))))
     // Promote integers inside a binary expression with fixed-precision decimals to decimals,
     // and fixed-precision decimals in an expression with floats / doubles to doubles
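
The rule above narrows an integral literal to the smallest decimal type that can hold it, via `DecimalType.fromLiteral`. A minimal stand-alone sketch of that idea (my own helper for illustration, not Spark's implementation):

```scala
import org.apache.spark.sql.types.DecimalType

// Hypothetical sketch of the "minimum precision" idea behind
// DecimalType.fromLiteral: the narrowest DECIMAL(p, 0) that can
// represent the literal exactly.
def minimalDecimalForLong(v: Long): DecimalType = {
  // precision = number of decimal digits in the unscaled value
  val digits = BigDecimal(v).precision
  DecimalType(digits, 0)
}

// 26393499451 has 11 digits, so it fits in DECIMAL(11, 0) instead of
// LongType's default DECIMAL(20, 0), leaving more headroom in the
// precision arithmetic of the division result.
assert(minimalDecimalForLong(26393499451L) == DecimalType(11, 0))
```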

sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

Lines changed: 11 additions & 0 deletions
@@ -1331,6 +1331,15 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)

+  val LITERAL_PICK_MINIMUM_PRECISION =
+    buildConf("spark.sql.legacy.literal.pickMinimumPrecision")
+      .internal()
+      .doc("When an integral literal is used in decimal operations, pick a minimum precision " +
+        "required by the literal if this config is true, to make the resulting precision " +
+        "and/or scale smaller. This can reduce the possibility of precision loss and/or " +
+        "overflow.")
+      .booleanConf
+      .createWithDefault(true)
+
   val SQL_OPTIONS_REDACTION_PATTERN =
     buildConf("spark.sql.redaction.options.regex")
       .doc("Regex to decide which keys in a Spark SQL command's options map contain sensitive " +

@@ -1925,6 +1934,8 @@ class SQLConf extends Serializable with Logging {

   def decimalOperationsAllowPrecisionLoss: Boolean = getConf(DECIMAL_OPERATIONS_ALLOW_PREC_LOSS)

+  def literalPickMinimumPrecision: Boolean = getConf(LITERAL_PICK_MINIMUM_PRECISION)
+
   def continuousStreamingExecutorQueueSize: Int = getConf(CONTINUOUS_STREAMING_EXECUTOR_QUEUE_SIZE)

   def continuousStreamingExecutorPollIntervalMs: Long =
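
For context (my wording, not the commit's): entries registered with `buildConf` are read through the active session's `SQLConf`, so the analyzer rule can consult the flag without it being threaded through explicitly. A sketch of both sides, assuming a session `spark` and code compiled under `org.apache.spark.sql` (as the test suite is), since `SQLConf` lives in an internal package:

```scala
import org.apache.spark.sql.internal.SQLConf

// Setting the internal flag for the current session.
spark.conf.set(SQLConf.LITERAL_PICK_MINIMUM_PRECISION.key, "false")

// What DecimalPrecision does at analysis time: read the SQLConf of
// the session running the query via the thread-local SQLConf.get.
val pick: Boolean = SQLConf.get.literalPickMinimumPrecision
```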

sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala

Lines changed: 7 additions & 0 deletions
@@ -2849,6 +2849,13 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
     val result = ds.flatMap(_.bar).distinct
     result.rdd.isEmpty
   }
+
+  test("SPARK-25454: decimal division with negative scale") {
+    // TODO: completely fix this issue even when LITERAL_PICK_MINIMUM_PRECISION is true.
+    withSQLConf(SQLConf.LITERAL_PICK_MINIMUM_PRECISION.key -> "false") {
+      checkAnswer(sql("select 26393499451 / (1e6 * 1000)"), Row(BigDecimal("26.3934994510000")))
+    }
+  }
 }

 case class Foo(bar: Option[String])
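
The `withSQLConf` helper used by the test scopes the overridden keys to the body and restores the previous values afterwards. A self-contained sketch of that pattern (assumed shape; Spark's real helper lives in its test utilities):

```scala
import org.apache.spark.sql.SparkSession

// Sketch of the withSQLConf pattern: remember the old values, apply the
// overrides, run the body, then restore — even if the body throws.
def withSQLConf[T](pairs: (String, String)*)(body: => T)(implicit spark: SparkSession): T = {
  val conf = spark.conf
  val saved = pairs.map { case (k, _) => k -> conf.getOption(k) }
  pairs.foreach { case (k, v) => conf.set(k, v) }
  try body finally saved.foreach {
    case (k, Some(old)) => conf.set(k, old)
    case (k, None)      => conf.unset(k)
  }
}
```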
