
@gengliangwang (Member) commented Sep 5, 2019

What changes were proposed in this pull request?

Currently, there are three separate configurations for compatibility with ANSI SQL:

  • spark.sql.parser.ansi.enabled
  • spark.sql.decimalOperations.nullOnOverflow
  • spark.sql.failOnIntegralTypeOverflow

This PR adds a new configuration, spark.sql.ansi.enabled, and removes the three options above. When the configuration is true, Spark tries to conform to the ANSI SQL specification. It is disabled by default. A brief usage sketch follows.
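As a minimal sketch (not part of the PR itself) of how the single flag toggles one ANSI behavior, integral overflow checking; the SparkSession setup is illustrative, and exact messages depend on the Spark version:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("AnsiFlagSketch")
      .master("local[*]")
      .getOrCreate()

    // ANSI mode off (the default): integer overflow silently wraps around.
    spark.conf.set("spark.sql.ansi.enabled", "false")
    spark.sql("SELECT 2147483647 + 1").show()   // -2147483648

    // ANSI mode on: the same query throws an ArithmeticException at runtime.
    spark.conf.set("spark.sql.ansi.enabled", "true")
    // spark.sql("SELECT 2147483647 + 1").show()   // would fail with an overflow error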

Why are the changes needed?

It makes the ANSI-compatibility configuration simple and straightforward: one switch instead of three separate flags.

Does this PR introduce any user-facing change?

The ANSI-compatibility features will be set via a single configuration, spark.sql.ansi.enabled.

How was this patch tested?

Existing unit tests.

@SparkQA commented Sep 5, 2019

Test build #110171 has finished for PR 25693 at commit 196d03a.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Sep 5, 2019

Test build #110173 has finished for PR 25693 at commit e563977.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang (Member, Author)

cc @mgaido91 @maropu @cloud-fan

@mgaido91 (Contributor) commented Sep 5, 2019

I am not sure about removing them... Shall we just use this config to set them, but leave the other configs for finer tuning?
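One way that suggestion could look with Spark's internal config builder (a hypothetical sketch, not what this PR implements; it assumes the ANSI_ENABLED entry defined later in the diff):

    // Hypothetical sketch: keep the fine-grained conf, but let it default to
    // the umbrella ANSI flag via fallbackConf, so users can still override it.
    val FAIL_ON_INTEGRAL_TYPE_OVERFLOW =
      buildConf("spark.sql.failOnIntegralTypeOverflow")
        .doc("If unset, falls back to the value of spark.sql.ansi.enabled.")
        .fallbackConf(ANSI_ENABLED)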

@SparkQA commented Sep 5, 2019

Test build #110174 has finished for PR 25693 at commit 58cc6db.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Sep 5, 2019

Test build #110175 has finished for PR 25693 at commit b89d918.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor)

@mgaido91 in practice it's a bad idea to have tons of flags that can change query results. When debugging a Spark query that returns an unexpected result, it's very annoying if you need to check a lot of flags.

The legacy configs are OK, as we clearly show the preference and those configs will be removed eventually, so not many people would set them.

@mgaido91 (Contributor) left a comment:

shall we also add a note in the migration guide?

    .createOptional

    val ANSI_ENABLED = buildConf("spark.sql.ansi.enabled")
      .doc("When true, tries to conform to the ANSI SQL specification. For example, Spark will " +
A contributor commented on the snippet above:

Instead of reporting them as examples, shall we list all the changes here explicitly?
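For context, a hypothetical completion of the builder chain excerpted above (the excerpt cuts off mid doc string, and the real text may differ); it only illustrates the shape of Spark's internal SQLConf builder API:

    // Hypothetical sketch of the full builder chain; the real .doc(...) string
    // is truncated in the excerpt above.
    val ANSI_ENABLED = buildConf("spark.sql.ansi.enabled")
      .doc("When true, tries to conform to the ANSI SQL specification.")
      .booleanConf
      .createWithDefault(false)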

@SparkQA commented Sep 5, 2019

Test build #110176 has finished for PR 25693 at commit 13c36a5.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu (Member) left a comment:

This change looks reasonable to me.

@dongjoon-hyun (Member)
Hi, @gatorsmile, @gengliangwang, @cloud-fan.
Thank you for this effort to simplify them in Spark 3.0.0.
What will the relationship between spark.sql.dialect and spark.sql.ansi.enabled be?
And could you describe why we chose spark.sql.ansi.enabled rather than spark.sql.dialect=ansi?

@gengliangwang (Member, Author)

@dongjoon-hyun There are features specified in ANSI SQL, such as throwing an error on numeric value overflow. We can switch these features with the option spark.sql.ansi.enabled.

ANSI SQL only specifies high-level syntax and rules. There are also "implementation-defined" behaviors, such as the result type of dividing two exact numeric values. We can't switch such detailed behaviors via the option spark.sql.dialect.
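A minimal illustration of that point (assumes an existing SparkSession named spark; exact result types depend on the Spark version):

    // The ANSI standard leaves the result type of dividing two exact numerics
    // implementation-defined. Spark returns a DOUBLE for integer division,
    // which a coarse dialect switch alone could not pin down.
    spark.sql("SELECT 3 / 2").printSchema()   // (3 / 2): double
    spark.sql("SELECT 3 / 2").show()          // 1.5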

@gengliangwang (Member, Author)

I will do some more investigation and see if I can find a better solution for the configuration. Marking this as WIP for now.

@gengliangwang gengliangwang changed the title [SPARK-28989][SQL] Add spark.sql.ansi.enabled [WIP][SPARK-28989][SQL] Add spark.sql.ansi.enabled Sep 6, 2019
@gengliangwang (Member, Author)

I will continue this one after the bug fix in #25804 is merged.

@gengliangwang gengliangwang changed the title [WIP][SPARK-28989][SQL] Add spark.sql.ansi.enabled [SPARK-28989][SQL] Add spark.sql.ansi.enabled Sep 18, 2019
@SparkQA commented Sep 18, 2019

Test build #110904 has finished for PR 25693 at commit 7c32fee.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Sep 18, 2019

Test build #110906 has finished for PR 25693 at commit 342ce1c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang (Member, Author)

This is ready for review.
@cloud-fan @maropu @wangyum @mgaido91

@dongjoon-hyun (Member) left a comment:

LGTM.
cc @rxin, @gatorsmile

@maropu (Member) commented Sep 19, 2019

Looks ok to me, too.

@gatorsmile gatorsmile changed the title [SPARK-28989][SQL] Add spark.sql.ansi.enabled [SPARK-28989][SQL] Add a SQLConf spark.sql.ansi.enabled Sep 19, 2019
@gatorsmile (Member)

LGTM

Thanks! Merged to master.

@gengliangwang gengliangwang changed the title [SPARK-28989][SQL] Add a SQLConf spark.sql.ansi.enabled [SPARK-28989][SQL] Introduce ANSI SQL Dialect Aug 9, 2024