Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-32844][SQL] Make DataFrameReader.table take the specified options for datasource v1 #29712

Closed
wants to merge 5 commits into from

Conversation

xuanyuanking
Copy link
Member

@xuanyuanking xuanyuanking commented Sep 10, 2020

What changes were proposed in this pull request?

Make DataFrameReader.table take the specified options for datasource v1.

Why are the changes needed?

Keep the same behavior of v1/v2 datasource, the v2 fix has been done in SPARK-32592.

Does this PR introduce any user-facing change?

Yes. The DataFrameReader.table will take the specified options. Also, if there are the same key and value exists in specified options and table properties, an exception will be thrown.

How was this patch tested?

New UT added.

@xuanyuanking xuanyuanking changed the title [SPARK-32844]Make DataFrameReader.table take the specified options for datasource v1 [SPARK-32844][SQL] Make DataFrameReader.table take the specified options for datasource v1 Sep 10, 2020
@xuanyuanking
Copy link
Member Author

cc @cloud-fan

@@ -251,24 +255,25 @@ class FindDataSourceTable(sparkSession: SparkSession) extends Rule[LogicalPlan]
partitionColumns = table.partitionColumnNames,
bucketSpec = table.bucketSpec,
className = table.provider.get,
options = table.storage.properties ++ pathOption,
options = extraOptions.asCaseSensitiveMap.asScala.toMap
++ table.storage.properties ++ pathOption,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: let's say table properties have key=1, and extraOptions have KEY=1. The check can pass, but the final options have both key and KEY. We should avoid it and keeping key from table properties.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done in cc250b9.

@SparkQA
Copy link

SparkQA commented Sep 10, 2020

Test build #128511 has finished for PR 29712 at commit 33b8fca.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class UnresolvedCatalogRelation(

@SparkQA
Copy link

SparkQA commented Sep 10, 2020

Test build #128518 has finished for PR 29712 at commit b4aa95e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -190,4 +192,15 @@ object DataSourceUtils {
case LegacyBehaviorPolicy.LEGACY => RebaseDateTime.rebaseGregorianToJulianMicros
case LegacyBehaviorPolicy.CORRECTED => identity[Long]
}

private[sql] def checkDuplicateOptions(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove private[sql] since execution package is already private (SPARK-16964)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, thanks for reminding, done in cc250b9.

@SparkQA
Copy link

SparkQA commented Sep 11, 2020

Test build #128564 has finished for PR 29712 at commit cc250b9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 11, 2020

Test build #128565 has finished for PR 29712 at commit 05aa245.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 12, 2020

Test build #128579 has finished for PR 29712 at commit f3f8c8d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 5e82548 Sep 14, 2020
@xuanyuanking xuanyuanking deleted the SPARK-32844 branch September 14, 2020 09:31
@xuanyuanking
Copy link
Member Author

Thanks for the review.

huaxingao pushed a commit that referenced this pull request Sep 23, 2021
…ation

### What changes were proposed in this pull request?

This PR fixes the 'options' description on `UnresolvedRelation`. This comment was added in #29535 but not valid anymore because V1 also uses this `options` (and merge the options with the table properties) per #29712.

This PR can go through from `master` to `branch-3.1`.

### Why are the changes needed?

To make `UnresolvedRelation.options`'s description clearer.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Scala linter by `dev/linter-scala`.

Closes #34075 from HyukjinKwon/minor-comment-unresolved-releation.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Huaxin Gao <huaxin_gao@apple.com>
huaxingao pushed a commit that referenced this pull request Sep 23, 2021
…ation

### What changes were proposed in this pull request?

This PR fixes the 'options' description on `UnresolvedRelation`. This comment was added in #29535 but not valid anymore because V1 also uses this `options` (and merge the options with the table properties) per #29712.

This PR can go through from `master` to `branch-3.1`.

### Why are the changes needed?

To make `UnresolvedRelation.options`'s description clearer.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Scala linter by `dev/linter-scala`.

Closes #34075 from HyukjinKwon/minor-comment-unresolved-releation.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Huaxin Gao <huaxin_gao@apple.com>
(cherry picked from commit 0076eba)
Signed-off-by: Huaxin Gao <huaxin_gao@apple.com>
huaxingao pushed a commit that referenced this pull request Sep 23, 2021
…ation

### What changes were proposed in this pull request?

This PR fixes the 'options' description on `UnresolvedRelation`. This comment was added in #29535 but not valid anymore because V1 also uses this `options` (and merge the options with the table properties) per #29712.

This PR can go through from `master` to `branch-3.1`.

### Why are the changes needed?

To make `UnresolvedRelation.options`'s description clearer.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Scala linter by `dev/linter-scala`.

Closes #34075 from HyukjinKwon/minor-comment-unresolved-releation.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Huaxin Gao <huaxin_gao@apple.com>
(cherry picked from commit 0076eba)
Signed-off-by: Huaxin Gao <huaxin_gao@apple.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants