Implement Spark-compatible CAST from String to Decimal #325

@andygrove

What is the problem the feature request solves?

We currently delegate to DataFusion when casting from string to decimal, and its behavior differs from Spark in the following cases:

  • An input of 4e7 produces 40000000.00 in Spark, and null in DataFusion
  • Inputs of ".", "-", "+" and the empty string produce null in Spark, and 0.0 in DataFusion
  • An input of 0 produces 0 in Spark, and null in DataFusion
  • Arrow-rs does not support negative scale (it fails with "Cannot cast string to decimal with negative scale"). We could choose to fall back to Spark for this use case (or if SQLConf.LEGACY_ALLOW_NEGATIVE_SCALE_OF_DECIMAL_ENABLED is enabled)
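
To make the target behavior in the list above concrete, here is a minimal sketch of the Spark-side results using plain java.math.BigDecimal. It is not Comet's or Spark's actual code: the object name SparkLikeDecimalCast and the function castStringToDecimal are hypothetical, and the trimming, HALF_UP rounding, and null-on-overflow rules are my assumptions about Spark's non-ANSI cast.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

// Hypothetical helper (not part of Comet or Spark) that approximates the
// non-ANSI results described in this issue.
object SparkLikeDecimalCast {

  // Returns None where the cast is expected to produce null.
  def castStringToDecimal(input: String, precision: Int, scale: Int): Option[JBigDecimal] = {
    if (input == null) return None
    try {
      // Assumption: surrounding whitespace is ignored before parsing.
      // BigDecimal accepts exponent notation such as "4e7" and rejects
      // ".", "-", "+" and "" with NumberFormatException.
      val parsed = new JBigDecimal(input.trim)
      // Assumption: values are rounded HALF_UP to the target scale.
      val rounded = parsed.setScale(scale, RoundingMode.HALF_UP)
      // Assumption: values that do not fit the target precision become null.
      if (rounded.precision > precision) None else Some(rounded)
    } catch {
      case _: NumberFormatException => None
    }
  }
}
```

With this sketch, castStringToDecimal("4e7", 10, 2) yields 40000000.00, while ".", "-", "+" and the empty string all yield None. It does not guard against inputs with extremely large exponents, which a real implementation would need to handle.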

Describe the potential solution

No response

Additional context

I used the following test in CometCastSuite to explore this.

  test("cast string to decimal") {
    val values = generateStrings(numericPattern, 5).toDF("a")
    castTest(values, DataTypes.createDecimalType(10, 2))
    castTest(values, DataTypes.createDecimalType(10, 0))
    withSQLConf((SQLConf.LEGACY_ALLOW_NEGATIVE_SCALE_OF_DECIMAL_ENABLED.key, "true")) {
      castTest(values, DataTypes.createDecimalType(10, -2))
    }
  }
