
@gengliangwang (Member) commented Sep 5, 2019

What changes were proposed in this pull request?

Currently, there are three separate configurations for compatibility with ANSI SQL:

  • spark.sql.parser.ansi.enabled
  • spark.sql.decimalOperations.nullOnOverflow
  • spark.sql.failOnIntegralTypeOverflow

This PR adds a new configuration, spark.sql.ansi.enabled, and removes the three options above. When the configuration is true, Spark tries to conform to the ANSI SQL specification. It is disabled by default. A brief usage sketch follows.
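As a minimal sketch (not part of the PR itself) of how the single flag toggles one ANSI behavior, integral overflow checking; the SparkSession setup is illustrative, and exact messages depend on the Spark version:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("AnsiFlagSketch")
      .master("local[*]")
      .getOrCreate()

    // ANSI mode off (the default): integer overflow silently wraps around.
    spark.conf.set("spark.sql.ansi.enabled", "false")
    spark.sql("SELECT 2147483647 + 1").show()   // -2147483648

    // ANSI mode on: the same query throws an ArithmeticException at runtime.
    spark.conf.set("spark.sql.ansi.enabled", "true")
    // spark.sql("SELECT 2147483647 + 1").show()   // would fail with an overflow error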

Why are the changes needed?

It makes the ANSI-compatibility configuration simple and straightforward: one switch instead of three separate flags.

Does this PR introduce any user-facing change?

The ANSI-compatibility features will be set via a single configuration, spark.sql.ansi.enabled.

How was this patch tested?

Existing unit tests.

@SparkQA commented Sep 5, 2019

Test build #110171 has finished for PR 25693 at commit 196d03a.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Sep 5, 2019

Test build #110173 has finished for PR 25693 at commit e563977.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang (Member, Author)

cc @mgaido91 @maropu @cloud-fan

@mgaido91 (Contributor) commented Sep 5, 2019

I am not sure about removing them... Shall we just use this config to set them, but leave the other configs for finer tuning?
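One way that suggestion could look with Spark's internal config builder (a hypothetical sketch, not what this PR implements; it assumes the ANSI_ENABLED entry defined later in the diff):

    // Hypothetical sketch: keep the fine-grained conf, but let it default to
    // the umbrella ANSI flag via fallbackConf, so users can still override it.
    val FAIL_ON_INTEGRAL_TYPE_OVERFLOW =
      buildConf("spark.sql.failOnIntegralTypeOverflow")
        .doc("If unset, falls back to the value of spark.sql.ansi.enabled.")
        .fallbackConf(ANSI_ENABLED)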

@SparkQA commented Sep 5, 2019

Test build #110174 has finished for PR 25693 at commit 58cc6db.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Sep 5, 2019

Test build #110175 has finished for PR 25693 at commit b89d918.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor)

@mgaido91 in practice it's a bad idea to have tons of flags that can change query results. When debugging a Spark query that returns an unexpected result, it's very annoying if you need to check a lot of flags.

The legacy configs are OK, as we clearly show the preference and those configs will be removed eventually, so not many people would set them.

@mgaido91 (Contributor) left a comment:

shall we also add a note in the migration guide?

    .createOptional

    val ANSI_ENABLED = buildConf("spark.sql.ansi.enabled")
      .doc("When true, tries to conform to the ANSI SQL specification. For example, Spark will " +
A contributor commented on the snippet above:

Instead of reporting them as examples, shall we list all the changes here explicitly?
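For context, a hypothetical completion of the builder chain excerpted above (the excerpt cuts off mid doc string, and the real text may differ); it only illustrates the shape of Spark's internal SQLConf builder API:

    // Hypothetical sketch of the full builder chain; the real .doc(...) string
    // is truncated in the excerpt above.
    val ANSI_ENABLED = buildConf("spark.sql.ansi.enabled")
      .doc("When true, tries to conform to the ANSI SQL specification.")
      .booleanConf
      .createWithDefault(false)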

@SparkQA commented Sep 5, 2019

Test build #110176 has finished for PR 25693 at commit 13c36a5.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu (Member) left a comment:

This change looks reasonable to me.

@dongjoon-hyun (Member)
Hi, @gatorsmile, @gengliangwang, @cloud-fan.
Thank you for this effort to simplify them in Spark 3.0.0.
What will the relationship between spark.sql.dialect and spark.sql.ansi.enabled be?
And could you describe why we chose spark.sql.ansi.enabled rather than spark.sql.dialect=ansi?

@gengliangwang (Member, Author)

@dongjoon-hyun There are features specified in ANSI SQL, such as throwing an error on numeric value overflow. We can switch these features with the option spark.sql.ansi.enabled.

ANSI SQL only specifies high-level syntax and rules. There are also "implementation-defined" behaviors, such as the result type of dividing two exact numeric values. We can't switch such detailed behaviors via the option spark.sql.dialect.
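A minimal illustration of that point (assumes an existing SparkSession named spark; exact result types depend on the Spark version):

    // The ANSI standard leaves the result type of dividing two exact numerics
    // implementation-defined. Spark returns a DOUBLE for integer division,
    // which a coarse dialect switch alone could not pin down.
    spark.sql("SELECT 3 / 2").printSchema()   // (3 / 2): double
    spark.sql("SELECT 3 / 2").show()          // 1.5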

@gengliangwang (Member, Author)

I will do some more investigation and see if I can find a better solution for the configuration. Marking this as WIP for now.

@gengliangwang gengliangwang changed the title [SPARK-28989][SQL] Add spark.sql.ansi.enabled [WIP][SPARK-28989][SQL] Add spark.sql.ansi.enabled Sep 6, 2019
@gengliangwang (Member, Author)

I will continue this one after the bug fix in #25804 is merged.

@gengliangwang gengliangwang changed the title [WIP][SPARK-28989][SQL] Add spark.sql.ansi.enabled [SPARK-28989][SQL] Add spark.sql.ansi.enabled Sep 18, 2019
@SparkQA commented Sep 18, 2019

Test build #110904 has finished for PR 25693 at commit 7c32fee.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Sep 18, 2019

Test build #110906 has finished for PR 25693 at commit 342ce1c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang (Member, Author)

This is ready for review.
@cloud-fan @maropu @wangyum @mgaido91

@dongjoon-hyun (Member) left a comment:

LGTM.
cc @rxin, @gatorsmile

@maropu (Member) commented Sep 19, 2019

Looks ok to me, too.

@gatorsmile gatorsmile changed the title [SPARK-28989][SQL] Add spark.sql.ansi.enabled [SPARK-28989][SQL] Add a SQLConf spark.sql.ansi.enabled Sep 19, 2019
@gatorsmile (Member)

LGTM

Thanks! Merged to master.

@gengliangwang gengliangwang changed the title [SPARK-28989][SQL] Add a SQLConf spark.sql.ansi.enabled [SPARK-28989][SQL] Introduce ANSI SQL Dialect Aug 9, 2024