Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-33787][SQL] Allow partition purge for v2 tables #30886

Closed
wants to merge 8 commits into from

Conversation

MaxGekk
Copy link
Member

@MaxGekk MaxGekk commented Dec 22, 2020

What changes were proposed in this pull request?

  1. Add new methods purgePartition()/purgePartitions() to the interfaces SupportsPartitionManagement/SupportsAtomicPartitionManagement.
  2. Default implementation of new methods throw the exception UnsupportedOperationException.
  3. Add tests for new methods to SupportsPartitionManagementSuite/SupportsAtomicPartitionManagementSuite.
  4. Add ALTER TABLE .. DROP PARTITION tests for DS v1 and v2.

Closes #30776
Closes #30821

Why are the changes needed?

Currently, the PURGE option that user can set in ALTER TABLE .. DROP PARTITION is completely ignored. We should pass this flag to the catalog implementation, so, the catalog should decide how to handle the flag.

Does this PR introduce any user-facing change?

The changes can impact on behavior of ALTER TABLE .. DROP PARTITION for v2 tables.

How was this patch tested?

By running the affected test suites, for instance:

$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *AlterTableDropPartitionSuite"

@MaxGekk
Copy link
Member Author

MaxGekk commented Dec 22, 2020

@cloud-fan @HyukjinKwon Could you, please, review this PR.

@github-actions github-actions bot added the SQL label Dec 22, 2020
@SparkQA
Copy link

SparkQA commented Dec 22, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37803/

@SparkQA
Copy link

SparkQA commented Dec 22, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37804/

@SparkQA
Copy link

SparkQA commented Dec 22, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37803/

@SparkQA
Copy link

SparkQA commented Dec 22, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37804/

@SparkQA
Copy link

SparkQA commented Dec 22, 2020

Test build #133206 has finished for PR 30886 at commit 43a7761.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@MaxGekk
Copy link
Member Author

MaxGekk commented Dec 22, 2020

The failure #30886 (comment) is not related to PR's changes, I hope:

Caused by: org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device

@MaxGekk
Copy link
Member Author

MaxGekk commented Dec 22, 2020

jenkins, retest this, please

* @throws UnsupportedOperationException If partition purging is not supported
*/
boolean purgePartitions(InternalRow[] idents)
throws NoSuchPartitionException, UnsupportedOperationException;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for backward compatibility, let's give a default implementation that fails.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For backward compatibility with what? The interface is not released as far as I know or I am wrong?

@SparkQA
Copy link

SparkQA commented Dec 22, 2020

Test build #133204 has finished for PR 30886 at commit b7f2b15.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 22, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37821/

@SparkQA
Copy link

SparkQA commented Dec 22, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37821/

@SparkQA
Copy link

SparkQA commented Dec 22, 2020

Test build #133212 has finished for PR 30886 at commit 43a7761.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 22, 2020

Test build #133224 has finished for PR 30886 at commit 13bfe3e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

HyukjinKwon pushed a commit that referenced this pull request Dec 23, 2020
### What changes were proposed in this pull request?

This is a followup of #30267

Inspired by #30886, it's better to have 2 methods `def dropTable` and `def purgeTable`, than `def dropTable(ident)` and `def dropTable(ident, purge)`.

### Why are the changes needed?

1. make the APIs orthogonal. Previously, `def dropTable(ident, purge)` calls `def dropTable(ident)` and is a superset.
2. simplifies the catalog implementation a little bit. Now the `if (purge) ... else ...` check is done at the Spark side.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

existing tests

Closes #30890 from cloud-fan/purgeTable.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
HyukjinKwon pushed a commit that referenced this pull request Dec 23, 2020
This is a followup of #30267

Inspired by #30886, it's better to have 2 methods `def dropTable` and `def purgeTable`, than `def dropTable(ident)` and `def dropTable(ident, purge)`.

1. make the APIs orthogonal. Previously, `def dropTable(ident, purge)` calls `def dropTable(ident)` and is a superset.
2. simplifies the catalog implementation a little bit. Now the `if (purge) ... else ...` check is done at the Spark side.

No.

existing tests

Closes #30890 from cloud-fan/purgeTable.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit ec1560a)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 34bfb3a Dec 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants