-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-29513][SQL] REFRESH TABLE should look up catalog/table like v2 commands #26183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
cc: @cloud-fan @rdblue @viirya |
|
Test build #112343 has finished for PR 26183 at commit
|
| "MSCK REPAIR TABLE") | ||
|
|
||
| case RefreshTableStatement(tableName) => | ||
| val v1TableName = parseV1Table(tableName, "REFRESH TABLE") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have TableCatalog.invalidateTable to implement a v2 REFRESH TABLE command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Followed by TableCatalog.loadTable? OrTableCatalog.loadTable is called lazily (you don't call it inside RefreshTableExec - in this case, same behavior as UNCACHE TABLE)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not followed by loadTable. It should invalidate and load the table lazily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I adde v2 command in this PR.
|
Looks fine to me. We can add the implementation here or in a separate PR. |
| val t = "testcat.ns1.ns2.tbl" | ||
| withTable(t) { | ||
| sql(s"CREATE TABLE $t (id bigint, data string) USING foo") | ||
| sql(s"REFRESH TABLE $t") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggestion for a reasonable test here?
ALTER TABLE doesn't require REFRESH TABLE in the following existing test, correct?
sql(s"CREATE TABLE $t (id int, data string) USING $v2Format")
sql(s"ALTER TABLE $t DROP COLUMN data")
val table = getTableMetadata(t) // <- this calls TableCatalog.loadTableThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a reasonable test is to create a custom v2 implementation which implements invalidateTable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can implement InMemoryTable.invalidateTable which simply set a flag on a global object to indicate it has been called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion! Exactly what I was planning to do. :)
|
Test build #112439 has finished for PR 26183 at commit
|
|
retest this please |
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/RefreshTableExec.scala
Show resolved
Hide resolved
|
Test build #112462 has finished for PR 26183 at commit
|
|
Test build #112466 has finished for PR 26183 at commit
|
|
Test build #112477 has finished for PR 26183 at commit
|
viirya
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good.
|
Test build #112492 has finished for PR 26183 at commit
|
|
LGTM, but we need to resolve conflicts again... |
|
Just resolved conflicts. Thanks! |
|
Test build #112507 has finished for PR 26183 at commit
|
|
retest this please |
|
Test build #112519 has finished for PR 26183 at commit
|
|
retest this please |
|
Test build #112522 has finished for PR 26183 at commit
|
|
Test build #112535 has finished for PR 26183 at commit
|
|
Thanks! Merging to master. |
|
Thanks all! |
What changes were proposed in this pull request?
Add RefreshTableStatement and make REFRESH TABLE go through the same catalog/table resolution framework of v2 commands.
Why are the changes needed?
It's important to make all the commands have the same table resolution behavior, to avoid confusing end-users. e.g.
Does this PR introduce any user-facing change?
yes. When running REFRESH TABLE, Spark fails the command if the current catalog is set to a v2 catalog, or the table name specified a v2 catalog.
How was this patch tested?
New unit tests