-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-30497][SQL] migrate DESCRIBE TABLE to the new framework #27187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* Holds the name of a namespace that has yet to be looked up in a catalog. It will be resolved to | ||
* [[ResolvedNamespace]] during analysis. | ||
*/ | ||
case class UnresolvedNamespace(multipartIdentifier: Seq[String]) extends LeafNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move these new unresolve plans to one file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: v2Unresolved
seems not to fit all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about v2ResolutionPlans
?
relation: LogicalPlan, | ||
partitionSpec: TablePartitionSpec, | ||
isExtended: Boolean) extends Command { | ||
override def children: Seq[LogicalPlan] = Seq(relation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this def
is needed for all commands resolved by the new framework, we may create a trait to reduced redundancy
Test build #116611 has finished for PR 27187 at commit
|
Test build #116609 has finished for PR 27187 at commit
|
retest this please |
Test build #116614 has finished for PR 27187 at commit
|
Test build #116627 has finished for PR 27187 at commit
|
Test build #116643 has finished for PR 27187 at commit
|
CatalogV2Util.loadTable(catalog, ident) match { | ||
// We don't support view in v2 catalog yet. Here we treat v1 view as table and use | ||
// `ResolvedTable` to carry it. | ||
case Some(table) => ResolvedTable(catalog.asTableCatalog, ident, table) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean to add a case?
CatalogV2Util.loadTable(catalog, ident) match {
case Some(v1Table: V1Table) if v1Table.v1Table.tableType == CatalogTableType.VIEW =>
...
}
Otherwise I'm not sure why case Some(table) =>
is for v1 view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point. Adding a case is clearer.
case u @ UnresolvedTableOrView(ident) => | ||
lookupTempView(ident).getOrElse(u) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like it is very similar to UnresolvedRelation
?
UnresolvedRelation
can also be resolve to view by ResolveTempViews
, can be resolved to v2 relation by ResolveTables
, can be resolved to v1 relation by ResolveRelations
.
Seems we have a few similar ways to do resolving?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some commands don't need to scan the table/view (like DESCRIBE). If we use UnresolvedRelation
, then we will end up with DataSourceV2ScanRelation
, which creates Scan
from Table
and is not necessary.
Another problem is view. UnresolvedRelation
can be resolved to View
, and then get removed at the beginning of optimizer.
I'll think of how to overcome these issues and reuse UnresolvedRelation
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did more investigation, and think we can't reuse UnresolvedRelation
.
UnresolvedRelation
means something we want to read/write, so it needs to resolve the data source, and fail if something goes wrong, e.g. the data source class can't be loaded.
For DESCRIBE commands, it should work as long as we can get the table. For example, we may want to use DESCRIBE to debug a table which can't load data source class.
*/ | ||
// TODO: create a generic representation for temp view, v1 view and v2 view, after we add view | ||
// support to v2 catalog. For now we only need the identifier to fallback to v1 command. | ||
case class ResolvedView(identifier: Identifier) extends LeafNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add UnresolvedView
when we migrate ALTER VIEW command. @imback82
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. Thanks!
Test build #116707 has finished for PR 27187 at commit
|
Test build #116708 has started for PR 27187 at commit |
Test build #116706 has finished for PR 27187 at commit
|
Test build #116715 has finished for PR 27187 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. (This is a good enhancement to the resolution framework!)
*/ | ||
// TODO: create a generic representation for temp view, v1 view and v2 view, after we add view | ||
// support to v2 catalog. For now we only need the identifier to fallback to v1 command. | ||
case class ResolvedView(identifier: Identifier) extends LeafNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. Thanks!
case None => u | ||
} | ||
case _ => u | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: This is almost same as UnresolvedTable
case? Maybe we can do the following, or may not be worthwhile:
case u @ UnresolvedTable(identifier) =>
lookupTableOrView(u, identifier) {
u.failAnalysis(s"$ident is a view not table.")
}
case u @ UnresolvedTableOrView(identifier) =>
lookupTableOrView(u, identifier) {
ResolvedView(identifier)
}
}
private def lookupTableOrView(
unresolved: LogicalPlan, identifier: Seq[String])(viewHandler: => LogicalPlan): LogicalPlan = {
expandRelationName(identifier) match {
case SessionCatalogAndIdentifier(catalog, ident) =>
CatalogV2Util.loadTable(catalog, ident) match {
case Some(v1Table: V1Table) if v1Table.v1Table.tableType == CatalogTableType.VIEW =>
viewHandler
case Some(table) => ResolvedTable(catalog.asTableCatalog, ident, table)
case None => unresolved
}
case _ => unresolved
}
case DescribeRelation(ResolvedTable(_, ident, _: V1Table), partitionSpec, isExtended) => | ||
DescribeTableCommand(ident.asTableIdentifier, partitionSpec, isExtended) | ||
|
||
// Use v1 command to describe temp view, as v2 catalog doesn't support view yet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
describe (temp) view
|
||
// Use v1 command to describe temp view, as v2 catalog doesn't support view yet. | ||
case DescribeRelation(ResolvedView(ident), partitionSpec, isExtended) => | ||
DescribeTableCommand(ident.asTableIdentifier, partitionSpec, isExtended) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks clean 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Test build #116747 has finished for PR 27187 at commit
|
retest this please |
retest this please |
Test build #116763 has finished for PR 27187 at commit
|
Test build #116784 has finished for PR 27187 at commit
|
Thanks! Merged to master |
cc @imback82 |
Thanks. I will start migrating more commands to the new framework this week. |
What changes were proposed in this pull request?
Use the new framework to resolve the DESCRIBE TABLE command.
The v1 DESCRIBE TABLE command supports both table and view. Checked with Hive and Presto, they don't have DESCRIBE TABLE syntax but only DESCRIBE, which supports both table and view:
We should make it clear that DESCRIBE support both table and view, by renaming the command to
DescribeRelation
.This PR also tunes the framework a little bit to support the case that a command accepts both table and view.
Why are the changes needed?
This is a part of effort to make the relation lookup behavior consistent: SPARK-29900.
Note that I make a separate PR here instead of #26921, as I need to update the framework to support a new use case: accept both table and view.
Does this PR introduce any user-facing change?
no
How was this patch tested?
existing tests