[SPARK-39270][SQL] JDBC dialect supports registering dialect specific functions #36649


Closed
beliefer wants to merge 8 commits

Conversation

beliefer
Contributor

What changes were proposed in this pull request?

The built-in functions in Spark are not the same as those in a JDBC database.
This PR gives users the chance to register dialect-specific functions.
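
A hedged sketch of the dialect-side API (the `functions`/`registerFunction` names appear in the code discussed below; the exact signatures are assumptions):

```scala
import scala.collection.mutable
import org.apache.spark.sql.connector.catalog.functions.UnboundFunction

// In JdbcDialect: each dialect reports the functions it supports as (name, function) pairs.
def functions: Seq[(String, UnboundFunction)] = Nil

// In H2Dialect: a small mutable registry backing that API.
private val registered = mutable.Map.empty[String, UnboundFunction]

def registerFunction(name: String, fn: UnboundFunction): Unit =
  registered.put(name, fn)

override def functions: Seq[(String, UnboundFunction)] = registered.toSeq
```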

Why are the changes needed?

To let JDBC dialects register dialect-specific functions.

Does this PR introduce any user-facing change?

'No'.
New feature.

How was this patch tested?

New tests.

@github-actions github-actions bot added the SQL label May 24, 2022
@beliefer
Contributor Author

ping @huaxingao cc @cloud-fan

@@ -816,6 +818,15 @@ class JDBCSuite extends QueryTest
}
}

test("register dialect specific functions") {
Contributor

This test does not reflect the expectation. The user story should be: if an end-user registers a JDBC catalog with a certain dialect, they can directly call functions like SELECT myCatalog.database.funcName(...) as long as the function is registered by the dialect.

The workflow should be:

  1. the JDBC dialect reports the functions it supports
  2. Spark (JDBCTableCatalog) registers these functions
  3. end-users call these functions in their queries
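
For example (a hedged sketch; the catalog name, JDBC URL, and table are illustrative):

```scala
import org.apache.spark.sql.execution.datasources.v2.jdbc.JDBCTableCatalog

// 1. The dialect reports the functions it supports.
H2Dialect.registerFunction("my_avg", IntegralAverage)

// 2. Spark registers a JDBC catalog backed by that dialect
//    (the dialect is picked based on the JDBC URL).
spark.conf.set("spark.sql.catalog.h2", classOf[JDBCTableCatalog].getName)
spark.conf.set("spark.sql.catalog.h2.url", "jdbc:h2:mem:testdb0")
spark.conf.set("spark.sql.catalog.h2.driver", "org.h2.Driver")

// 3. End-users call the function in their queries.
spark.sql("SELECT h2.my_avg(ID) FROM h2.test.people").show()
```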

}

override def loadFunction(ident: Identifier): UnboundFunction = {
dialect.functions.toMap.get(ident) match {
Contributor
we should create the map only once in JDBCTableCatalog, instead of every time we look up a function

Contributor

let's also consider case sensitivity.
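
Putting both comments together, a minimal sketch (assuming the catalog can read `spark.sql.caseSensitive` through `SQLConf.get`; whether that is the right place to consult the conf is an open question):

```scala
import java.util.Locale
import org.apache.spark.sql.internal.SQLConf

// Build the lookup map once per JDBCTableCatalog instance, folding case
// up front when the session is case-insensitive.
private lazy val functionMap: Map[String, UnboundFunction] = {
  val caseSensitive = SQLConf.get.caseSensitiveAnalysis
  dialect.functions.map { case (name, fn) =>
    (if (caseSensitive) name else name.toLowerCase(Locale.ROOT)) -> fn
  }.toMap
}
```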

@@ -1412,4 +1413,15 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
}
}
}

test("register dialect specific functions") {
H2Dialect.registerFunction("my_avg", IntegralAverage)
Contributor

let's add a try-catch-finally to clear the registered functions
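
i.e., something like (a sketch; `clearFunctions` is the test-only helper discussed further down):

```scala
test("register dialect specific functions") {
  H2Dialect.registerFunction("my_avg", IntegralAverage)
  try {
    // ... exercise the function, e.g. sql("SELECT h2.my_avg(ID) FROM h2.test.people") ...
  } finally {
    H2Dialect.clearFunctions() // leave H2Dialect pristine for other suites
  }
}
```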

Contributor Author

OK

Contributor

It seems hard to do `clearFunctions` for one test case only; how about we do the following:

  1. in beforeAll, we register functions with H2Dialect
  2. in afterAll, we clear functions in H2Dialect

The result is that the entire test suite will test against a JDBC catalog with a UDF. Other test suites will use a fresh SparkSession, instantiate a new JDBCTableCatalog instance, and won't be affected.
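
A sketch of that setup:

```scala
override def beforeAll(): Unit = {
  super.beforeAll()
  // Every test in this suite sees a JDBC catalog whose dialect has a UDF.
  H2Dialect.registerFunction("my_avg", IntegralAverage)
}

override def afterAll(): Unit = {
  try {
    H2Dialect.clearFunctions()
  } finally {
    super.afterAll()
  }
}
```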

Contributor Author

OK

if (namespace.isEmpty) {
functions.keys.map(Identifier.of(namespace, _)).toArray
} else {
throw QueryCompilationErrors.noSuchNamespaceError(namespace)
Contributor

We can return an empty array here.
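
Applied to the quoted code, the suggestion would read:

```scala
override def listFunctions(namespace: Array[String]): Array[Identifier] = {
  if (namespace.isEmpty) {
    functions.keys.map(Identifier.of(namespace, _)).toArray
  } else {
    // An unknown namespace simply contains no functions; no error needed.
    Array.empty[Identifier]
  }
}
```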


override def loadFunction(ident: Identifier): UnboundFunction = {
if (ident.namespace().nonEmpty) {
throw QueryCompilationErrors.namespaceInJdbcUDFUnsupportedError(ident)
Contributor

We can throw NoSuchFunctionException here

}
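
i.e. (a sketch; this assumes `NoSuchFunctionException` offers an `Identifier`-based constructor):

```scala
override def loadFunction(ident: Identifier): UnboundFunction = {
  if (ident.namespace().nonEmpty) {
    // Dialect functions live in the catalog's root namespace.
    throw new NoSuchFunctionException(ident)
  }
  functions.getOrElse(ident.name(), throw new NoSuchFunctionException(ident))
}
```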

// test only
def clearFunctions(): Unit = {
Contributor

it's weird that we put registerFunction in H2Dialect but put clearFunctions here.

@cloud-fan
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 2a7a1b6 May 27, 2022
@beliefer
Contributor Author

@cloud-fan Thank you for reviewing this PR.

chenzhx pushed a commit to chenzhx/spark that referenced this pull request Jun 13, 2022
… functions

chenzhx pushed a commit to chenzhx/spark that referenced this pull request Jun 15, 2022
… functions

chenzhx added a commit to Kyligence/spark that referenced this pull request Jun 15, 2022
…mal binary arithmetic (#481)

* [SPARK-39270][SQL] JDBC dialect supports registering dialect specific functions

### What changes were proposed in this pull request?
The built-in functions in Spark are not the same as those in a JDBC database.
This PR gives users the chance to register dialect-specific functions.

### Why are the changes needed?
To let JDBC dialects register dialect-specific functions.

### Does this PR introduce _any_ user-facing change?
'No'.
New feature.

### How was this patch tested?
New tests.

Closes apache#36649 from beliefer/SPARK-39270.

Authored-by: Jiaan Geng <beliefer@163.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

* [SPARK-39413][SQL] Capitalize sql keywords in JDBCV2Suite

### What changes were proposed in this pull request?
`JDBCV2Suite` contains some test cases whose SQL keywords are not capitalized.
This PR capitalizes the SQL keywords in `JDBCV2Suite`.

### Why are the changes needed?
Capitalize SQL keywords in `JDBCV2Suite`.

### Does this PR introduce _any_ user-facing change?
'No'.
Just update test cases.

### How was this patch tested?
N/A.

Closes apache#36805 from beliefer/SPARK-39413.

Authored-by: Jiaan Geng <beliefer@163.com>
Signed-off-by: huaxingao <huaxin_gao@apple.com>

* [SPARK-38997][SPARK-39037][SQL][FOLLOWUP] `PushableColumnWithoutNestedColumn` needs to be translated to a predicate too

### What changes were proposed in this pull request?
apache#35768 assumed that the expressions in `And`, `Or` and `Not` must be predicates.
apache#36370 and apache#36325 supported pushing down expressions in `GROUP BY` and `ORDER BY`. But the children of `And`, `Or` and `Not` can be `FieldReference.column(name)`.
`FieldReference.column(name)` is not a predicate, so the assertion may fail.

### Why are the changes needed?
This PR fixes the bug in `PushableColumnWithoutNestedColumn`.

### Does this PR introduce _any_ user-facing change?
'Yes'.
It makes the push-down framework work more correctly.

### How was this patch tested?
New tests

Closes apache#36776 from beliefer/SPARK-38997_SPARK-39037_followup.

Authored-by: Jiaan Geng <beliefer@163.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

* [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

### What changes were proposed in this pull request?

The main change:
- Add a new method `resultDecimalType` in `BinaryArithmetic`
- Add a new expression `DecimalAddNoOverflowCheck` for the internal decimal add, e.g. `Sum`/`Average`; the differences from `Add` are:
  - `DecimalAddNoOverflowCheck` does not check overflow
  - `DecimalAddNoOverflowCheck` takes `dataType` as an input parameter
- Merge the decimal precision code of `DecimalPrecision` into each arithmetic data type, so every arithmetic expression reports the accurate decimal type, and we can remove the unused expression `PromotePrecision` and related code
- Merge `CheckOverflow` into the arithmetic eval and codegen code paths, so every arithmetic expression can handle the overflow case at runtime

Merge `PromotePrecision` into `dataType`, for example, `Add`:
```scala
override def resultDecimalType(p1: Int, s1: Int, p2: Int, s2: Int): DecimalType = {
  val resultScale = max(s1, s2)
  if (allowPrecisionLoss) {
    DecimalType.adjustPrecisionScale(max(p1 - s1, p2 - s2) + resultScale + 1,
      resultScale)
  } else {
    DecimalType.bounded(max(p1 - s1, p2 - s2) + resultScale + 1, resultScale)
  }
}
```

Merge `CheckOverflow`, for example, `Add` eval:
```scala
dataType match {
  case decimalType: DecimalType =>
    val value = numeric.plus(input1, input2)
    checkOverflow(value.asInstanceOf[Decimal], decimalType)
  ...
}
```

Note that `CheckOverflow` is still useful after this PR, e.g. in `RowEncoder`; we can do further cleanup in a separate PR.

### Why are the changes needed?

Fix the bug of `TypeCoercion`, for example:
```sql
SELECT CAST(1 AS DECIMAL(28, 2))
UNION ALL
SELECT CAST(1 AS DECIMAL(18, 2)) / CAST(1 AS DECIMAL(18, 2));
```

Relaxes the decimal precision at runtime, so we do not need a redundant `Cast`.

### Does this PR introduce _any_ user-facing change?

yes, bug fix

### How was this patch tested?

Pass existing tests and add some bug-fix tests in `decimalArithmeticOperations.sql`.

Closes apache#36698 from ulysses-you/decimal.

Lead-authored-by: ulysses-you <ulyssesyou18@gmail.com>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Co-authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

* fix ut

Co-authored-by: Jiaan Geng <beliefer@163.com>
Co-authored-by: ulysses-you <ulyssesyou18@gmail.com>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Co-authored-by: Wenchen Fan <wenchen@databricks.com>
yabola pushed a commit to Kyligence/spark that referenced this pull request Jun 21, 2022
…mal binary arithmetic (#481)

leejaywei pushed a commit to Kyligence/spark that referenced this pull request Jul 14, 2022
…mal binary arithmetic (#481)

zheniantoushipashi pushed a commit to Kyligence/spark that referenced this pull request Aug 8, 2022
…mal binary arithmetic (#481)

RolatZhang pushed a commit to Kyligence/spark that referenced this pull request Aug 29, 2023
…mal binary arithmetic (#481)
