-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-30282][SQL] Migrate SHOW TBLPROPERTIES to new framework #26921
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Test build #115436 has finished for PR 26921 at commit
|
Test build #115437 has finished for PR 26921 at commit
|
Test build #115447 has finished for PR 26921 at commit
|
Test build #115445 has finished for PR 26921 at commit
|
Test build #115470 has finished for PR 26921 at commit
|
cc: @cloud-fan |
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
Outdated
Show resolved
Hide resolved
I have some ideas about how to move this forward, @imback82 can you take a look at #26847 (comment) and share your thoughts? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cloud-fan I integrated two commands that were using UnresolvedV2Relation
(ShowTableProperties
and DescribeTable
). Please let me know if the approach looks fine, then I will finish up AlterTable
(and move onto other commands). Thanks!
sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
Outdated
Show resolved
Hide resolved
sql/core/src/test/scala/org/apache/spark/sql/connector/TableResolutionSuite.scala
Outdated
Show resolved
Hide resolved
sql/core/src/test/scala/org/apache/spark/sql/execution/command/PlanResolutionSuite.scala
Outdated
Show resolved
Hide resolved
sql/core/src/test/scala/org/apache/spark/sql/execution/command/PlanResolutionSuite.scala
Outdated
Show resolved
Hide resolved
Test build #116343 has finished for PR 26921 at commit
|
Test build #116341 has finished for PR 26921 at commit
|
} else { | ||
if (partitionSpec.nonEmpty) { | ||
throw new AnalysisException("DESCRIBE TABLE does not support partition for v2 tables.") | ||
case d @ DescribeTable(SessionCatalogAndResolvedTable(resolved), partitionSpec, isExtended) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems simpler to write
d @ DescribeTable(r: ResolvedTable, partitionSpec, isExtended) if isSessionCatalog(r.catalog) =>
DescribeTableCommand(getTableIdentifier(resolved), partitionSpec, isExtended) | ||
case _ => | ||
// The v1 `DescribeTableCommand` can describe view as well. | ||
if (isView(resolved.identifier.asMultipartIdentifier)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
before we reach here, we may already fail with table/view mismatch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I checked with hive: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DescribeTable/View/MaterializedView/Column
The DESCRIBE command works for both table and view, in Spark we have a weird parser rule DESC TABLE?
which makes people think it's for table only.
We can keep the parser rule, but we should rename DescribeTable
to DescribeRelation
, and use UnresolvedRelation
to allow both table and view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same for presto: https://prestodb.io/docs/current/sql/describe.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've open #27187 to refine it.
This is a feature we missed when designing the new framework, so I opened a separated PR to update the framework to support accepting both table and view like DESCRIBE command.
Based on the discussion from #26847 (comment), should I make this PR WIP and try to integrate all commands first to verify the new framework is solid? Afterwards, we can split this PR into multiple ones. |
SGTM |
### What changes were proposed in this pull request? Use the new framework to resolve the DESCRIBE TABLE command. The v1 DESCRIBE TABLE command supports both table and view. Checked with Hive and Presto, they don't have DESCRIBE TABLE syntax but only DESCRIBE, which supports both table and view: 1. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DescribeTable/View/MaterializedView/Column 2. https://prestodb.io/docs/current/sql/describe.html We should make it clear that DESCRIBE support both table and view, by renaming the command to `DescribeRelation`. This PR also tunes the framework a little bit to support the case that a command accepts both table and view. ### Why are the changes needed? This is a part of effort to make the relation lookup behavior consistent: SPARK-29900. Note that I make a separate PR here instead of #26921, as I need to update the framework to support a new use case: accept both table and view. ### Does this PR introduce any user-facing change? no ### How was this patch tested? existing tests Closes #27187 from cloud-fan/describe. Authored-by: Wenchen Fan <wenchen@databricks.com> Signed-off-by: Xiao Li <gatorsmile@gmail.com>
@cloud-fan, actually, it was a big change to migrate all commands into one WIP PR (also prob. hard to review anyway). I just changed this PR to only migrate |
Test build #116877 has finished for PR 26921 at commit
|
Test build #116876 has finished for PR 26921 at commit
|
val message = intercept[AnalysisException] { | ||
sql("SHOW TBLPROPERTIES parquet_temp") | ||
}.getMessage | ||
assert(message.contains("parquet_temp is a temp view not table")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change LGTM, but can we add a note in the migration guide?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I will add it. Thanks!
LGTM if tests pass |
Test build #116889 has finished for PR 26921 at commit
|
Test build #116890 has finished for PR 26921 at commit
|
Test build #116905 has finished for PR 26921 at commit
|
the last commit is just adding migration guide, and the previous commit passes all tests. I'm merging it to master, thanks! |
@@ -130,10 +130,10 @@ class HiveCommandSuite extends QueryTest with SQLTestUtils with TestHiveSingleto | |||
} | |||
|
|||
test("show tblproperties for datasource table - errors") { | |||
val message1 = intercept[NoSuchTableException] { | |||
val message = intercept[AnalysisException] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ur, this sounds like another behavior change from Apache Spark 2.4.4.
cc @gatorsmile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching it! I will update the migration guide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created a follow-up PR: #27276
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CREATE VIEW gatorsmileView1
TBLPROPERTIES(prop1="3")
AS SELECT 1 AS c1, 2 AS c2
SHOW TBLPROPERTIES gatorsmileView1
This does not work after the migration. @imback82 @cloud-fan We need to double check whether the other migration causes the same regression.
val catalog = sparkSession.sessionState.catalog | ||
|
||
if (catalog.isTemporaryTable(table)) { | ||
Seq.empty[Row] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We allow users to specify the table properties when creating temporary views.
CREATE TEMPORARY VIEW v1
TBLPROPERTIES(prop1 ="3")
AS SELECT 1 AS c1, 2 AS c2
Instead of disallowing it, I think we should support it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I will create a PR for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upon looking at this further, properties specified in CREATE TEMP VIEW
is not used in CreateViewCommand
, so we can revert the behavior back to the original one where we showed empty properties. What do you think?
I will still create a PR to support a permanent view, which is a regression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good to me. please remove the item in the migration guide too.
Thanks!
Let me take a look. |
Since |
It's unfortunate that many TABLE commands in Spark support view as well. We need to be careful when we migrate commands to v2 later. @imback82 can you send a PR to make SHOW TBLPROPERTIES support views? |
I created a PR: #28375 |
### What changes were proposed in this pull request? This PR addresses two things: - `SHOW TBLPROPERTIES` should supports view (a regression introduced by #26921) - `SHOW TBLPROPERTIES` on a temporary view should return empty result (2.4 behavior instead of throwing `AnalysisException`. ### Why are the changes needed? It's a bug. ### Does this PR introduce any user-facing change? Yes, now `SHOW TBLPROPERTIES` works on views: ``` scala> sql("CREATE VIEW view TBLPROPERTIES('p1'='v1', 'p2'='v2') AS SELECT 1 AS c1") scala> sql("SHOW TBLPROPERTIES view").show(truncate=false) +---------------------------------+-------------+ |key |value | +---------------------------------+-------------+ |view.catalogAndNamespace.numParts|2 | |view.query.out.col.0 |c1 | |view.query.out.numCols |1 | |p2 |v2 | |view.catalogAndNamespace.part.0 |spark_catalog| |p1 |v1 | |view.catalogAndNamespace.part.1 |default | +---------------------------------+-------------+ ``` And for a temporary view: ``` scala> sql("CREATE TEMPORARY VIEW tview TBLPROPERTIES('p1'='v1', 'p2'='v2') AS SELECT 1 AS c1") scala> sql("SHOW TBLPROPERTIES tview").show(truncate=false) +---+-----+ |key|value| +---+-----+ +---+-----+ ``` ### How was this patch tested? Added tests. Closes #28375 from imback82/show_tblproperties_followup. Authored-by: Terry Kim <yuminkim@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request? This PR addresses two things: - `SHOW TBLPROPERTIES` should supports view (a regression introduced by #26921) - `SHOW TBLPROPERTIES` on a temporary view should return empty result (2.4 behavior instead of throwing `AnalysisException`. ### Why are the changes needed? It's a bug. ### Does this PR introduce any user-facing change? Yes, now `SHOW TBLPROPERTIES` works on views: ``` scala> sql("CREATE VIEW view TBLPROPERTIES('p1'='v1', 'p2'='v2') AS SELECT 1 AS c1") scala> sql("SHOW TBLPROPERTIES view").show(truncate=false) +---------------------------------+-------------+ |key |value | +---------------------------------+-------------+ |view.catalogAndNamespace.numParts|2 | |view.query.out.col.0 |c1 | |view.query.out.numCols |1 | |p2 |v2 | |view.catalogAndNamespace.part.0 |spark_catalog| |p1 |v1 | |view.catalogAndNamespace.part.1 |default | +---------------------------------+-------------+ ``` And for a temporary view: ``` scala> sql("CREATE TEMPORARY VIEW tview TBLPROPERTIES('p1'='v1', 'p2'='v2') AS SELECT 1 AS c1") scala> sql("SHOW TBLPROPERTIES tview").show(truncate=false) +---+-----+ |key|value| +---+-----+ +---+-----+ ``` ### How was this patch tested? Added tests. Closes #28375 from imback82/show_tblproperties_followup. Authored-by: Terry Kim <yuminkim@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit 3680303) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
Use the new framework to resolve the SHOW TBLPROPERTIES command. This PR along with #27243 should update all the existing V2 commands with
UnresolvedV2Relation
.Why are the changes needed?
This is a part of effort to make the relation lookup behavior consistent: SPARK-2990.
Does this PR introduce any user-facing change?
Yes
SHOW TBLPROPERTIES temp_view
now fails withAnalysisException
will be thrown with a messagetemp_view is a temp view not table
. Previously, it was returning empty row.How was this patch tested?
Existing tests