Skip to content

[SPARK-29728][SQL] Datasource V2: Support ALTER TABLE RENAME TO #26539

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

imback82
Copy link
Contributor

What changes were proposed in this pull request?

This PR adds ALTER TABLE a.b.c RENAME TO x.y.x support for V2 catalogs.

Why are the changes needed?

The current implementation doesn't support this command V2 catalogs.

Does this PR introduce any user-facing change?

Yes, now the renaming table works for v2 catalogs:

scala> spark.sql("SHOW TABLES IN testcat.ns1.ns2").show
+---------+---------+
|namespace|tableName|
+---------+---------+
|  ns1.ns2|      old|
+---------+---------+

scala> spark.sql("ALTER TABLE testcat.ns1.ns2.old RENAME TO testcat.ns1.ns2.new").show

scala> spark.sql("SHOW TABLES IN testcat.ns1.ns2").show
+---------+---------+
|namespace|tableName|
+---------+---------+
|  ns1.ns2|      new|
+---------+---------+

How was this patch tested?

Added unit tests.

@imback82
Copy link
Contributor Author

cc: @cloud-fan @rdblue @viirya

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-29728][SQL] Datasource V2: Support ALTER TABLE (RENAME TABLE) [SPARK-29728][SQL] Datasource V2: Support ALTER TABLE RENAME TO Nov 15, 2019
@@ -93,6 +93,22 @@ class ResolveCatalogs(val catalogManager: CatalogManager)
s"Can not specify catalog `${catalog.name}` for view ${tableName.quoted} " +
s"because view support in catalog has not been implemented yet")

case RenameTableStatement(NonSessionCatalog(catalog, oldName), newNameParts, isView) =>
if (isView) {
throw new AnalysisException("Renaming view is not supported in v2 catalogs.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a test coverage for this too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this. I will add one.

throw new AnalysisException("Renaming view is not supported in v2 catalogs.")
}
newNameParts match {
case NonSessionCatalog(newCatalog, newName) =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is a name conflict, this forces users to always specify the catalog name.

I think it's easier to define the newNameParts as the identifier of the target catalog, so that we can just return
RenameTable(atalog.asTableCatalog, oldName.asIdentifier, newNameParts.asIdentifier)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion, but wanted to make sure if I understood correctly.

Are you suggesting newNameParts will contain only the identifier info, not the catalog name?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, since we won't support cross catalog renaming (it's not renaming anymore), I don't think it's necessary to contain catalog name in the newNameParts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that should simplify the logic. But, I have one concern. For the following:

ALTER TABLE testcat.ns1.ns2.old RENAME TO testcat.ns1.ns2.new

, ns1.ns2.old will be renamed to testcat.ns1.ns2.new. Is this the expected behavior or should we handle it gracefully?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is the expected behavior.

@@ -158,6 +158,14 @@ class ResolveSessionCatalog(
case AlterViewUnsetPropertiesStatement(SessionCatalog(catalog, tableName), keys, ifExists) =>
AlterTableUnsetPropertiesCommand(tableName.asTableIdentifier, keys, ifExists, isView = true)

case RenameTableStatement(SessionCatalog(_, oldName), newNameParts, isView) =>
newNameParts match {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@SparkQA
Copy link

SparkQA commented Nov 15, 2019

Test build #113844 has finished for PR 26539 at commit bd3b844.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 18, 2019

Test build #114033 has finished for PR 26539 at commit b6be4a9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan cloud-fan closed this in 3d45779 Nov 19, 2019
@cloud-fan
Copy link
Contributor

thanks, merging to master!

/**
* Physical plan node for renaming a table.
*/
case class RenameTableExec(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any specific reasons for naming it as RenameTableExec instead of AlterTableRenameExec similarly to other alter table exec nodes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there was a specific reason. Most likely a byproduct of following TableCatalog.renameTable and visitRenameTable naming.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a big deal, and I prefer RenameTableExec as it's shorter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DropPartitionExec is shorter than AlterTableDropPartitionExec too but we have the AlterTable prefix in all ALTER TABLE exec nodes.

Looks slightly strange that all alter table commands begin from AlterTable except only one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many AlterTable* v2 commands out there? If it's a lot then let's update this one as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a a lot but we have for now:

  • AlterTableAddPartitionExec
  • AlterTableRenamePartitionExec
  • AlterTableDropPartitionExec
  • AlterTableExec

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then it's probably better to change them to AddPartitionExec, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the PR #31584

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants