Skip to content

Conversation

@imback82
Copy link
Contributor

What changes were proposed in this pull request?

Add RefreshTableStatement and make REFRESH TABLE go through the same catalog/table resolution framework of v2 commands.

Why are the changes needed?

It's important to make all the commands have the same table resolution behavior, to avoid confusing end-users. e.g.

USE my_catalog
DESC t // success and describe the table t from my_catalog
REFRESH TABLE t // report table not found as there is no table t in the session catalog

Does this PR introduce any user-facing change?

yes. When running REFRESH TABLE, Spark fails the command if the current catalog is set to a v2 catalog, or the table name specified a v2 catalog.

How was this patch tested?

New unit tests

@imback82
Copy link
Contributor Author

cc: @cloud-fan @rdblue @viirya

@imback82 imback82 changed the title [SPARK-29512][SQL] REFRESH TABLE should look up catalog/table like v2 commands [SPARK-29513][SQL] REFRESH TABLE should look up catalog/table like v2 commands Oct 20, 2019
@SparkQA
Copy link

SparkQA commented Oct 20, 2019

Test build #112343 has finished for PR 26183 at commit dc52212.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class RefreshTableStatement(tableName: Seq[String]) extends ParsedStatement

"MSCK REPAIR TABLE")

case RefreshTableStatement(tableName) =>
val v1TableName = parseV1Table(tableName, "REFRESH TABLE")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have TableCatalog.invalidateTable to implement a v2 REFRESH TABLE command.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Followed by TableCatalog.loadTable? OrTableCatalog.loadTable is called lazily (you don't call it inside RefreshTableExec - in this case, same behavior as UNCACHE TABLE)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not followed by loadTable. It should invalidate and load the table lazily.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I adde v2 command in this PR.

@rdblue
Copy link
Contributor

rdblue commented Oct 21, 2019

Looks fine to me. We can add the implementation here or in a separate PR.

val t = "testcat.ns1.ns2.tbl"
withTable(t) {
sql(s"CREATE TABLE $t (id bigint, data string) USING foo")
sql(s"REFRESH TABLE $t")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any suggestion for a reasonable test here?

ALTER TABLE doesn't require REFRESH TABLE in the following existing test, correct?

      sql(s"CREATE TABLE $t (id int, data string) USING $v2Format")
      sql(s"ALTER TABLE $t DROP COLUMN data")
      val table = getTableMetadata(t) // <- this calls TableCatalog.loadTable

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a reasonable test is to create a custom v2 implementation which implements invalidateTable.

Copy link
Contributor

@cloud-fan cloud-fan Oct 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can implement InMemoryTable.invalidateTable which simply set a flag on a global object to indicate it has been called.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion! Exactly what I was planning to do. :)

@SparkQA
Copy link

SparkQA commented Oct 22, 2019

Test build #112439 has finished for PR 26183 at commit fcc31f2.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class RefreshTable(
  • case class RefreshTableExec(

@imback82
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Oct 22, 2019

Test build #112462 has finished for PR 26183 at commit 7b884e8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • skeleton_class = type_constructor(name, bases, type_kwargs)
  • enum_class = metacls.__new__(metacls, name, bases, classdict)
  • case class BitAndAgg(child: Expression) extends DeclarativeAggregate with ExpectsInputTypes
  • case class BitOrAgg(child: Expression) extends DeclarativeAggregate with ExpectsInputTypes
  • abstract class AbstractSqlParser(conf: SQLConf) extends ParserInterface with Logging
  • class CatalystSqlParser(conf: SQLConf) extends AbstractSqlParser(conf)
  • case class TruncateTableStatement(
  • class SparkSqlParser(conf: SQLConf) extends AbstractSqlParser(conf)

@SparkQA
Copy link

SparkQA commented Oct 22, 2019

Test build #112466 has finished for PR 26183 at commit 7b884e8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • skeleton_class = type_constructor(name, bases, type_kwargs)
  • enum_class = metacls.__new__(metacls, name, bases, classdict)
  • case class BitAndAgg(child: Expression) extends DeclarativeAggregate with ExpectsInputTypes
  • case class BitOrAgg(child: Expression) extends DeclarativeAggregate with ExpectsInputTypes
  • abstract class AbstractSqlParser(conf: SQLConf) extends ParserInterface with Logging
  • class CatalystSqlParser(conf: SQLConf) extends AbstractSqlParser(conf)
  • case class TruncateTableStatement(
  • class SparkSqlParser(conf: SQLConf) extends AbstractSqlParser(conf)

@SparkQA
Copy link

SparkQA commented Oct 22, 2019

Test build #112477 has finished for PR 26183 at commit fe577f5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@viirya viirya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good.

@SparkQA
Copy link

SparkQA commented Oct 23, 2019

Test build #112492 has finished for PR 26183 at commit 3a1da85.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class ShowPartitionsStatement(

@cloud-fan
Copy link
Contributor

LGTM, but we need to resolve conflicts again...

@imback82
Copy link
Contributor Author

Just resolved conflicts. Thanks!

@SparkQA
Copy link

SparkQA commented Oct 23, 2019

Test build #112507 has finished for PR 26183 at commit 31bfc6f.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class CreateNamespaceStatement(
  • case class CreateNamespace(
  • case class CreateNamespaceExec(

@cloud-fan
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Oct 23, 2019

Test build #112519 has finished for PR 26183 at commit 31bfc6f.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class CreateNamespaceStatement(
  • case class CreateNamespace(
  • case class CreateNamespaceExec(

@cloud-fan
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Oct 23, 2019

Test build #112522 has finished for PR 26183 at commit 31bfc6f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class CreateNamespaceStatement(
  • case class CreateNamespace(
  • case class CreateNamespaceExec(

@SparkQA
Copy link

SparkQA commented Oct 23, 2019

Test build #112535 has finished for PR 26183 at commit 31bfc6f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class CreateNamespaceStatement(
  • case class CreateNamespace(
  • case class CreateNamespaceExec(

@viirya
Copy link
Member

viirya commented Oct 23, 2019

Thanks! Merging to master.

@viirya viirya closed this in 53a5f17 Oct 23, 2019
@imback82 imback82 deleted the refresh_table branch October 23, 2019 15:45
@imback82
Copy link
Contributor Author

Thanks all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants