[SPARK-33393][SQL] Support SHOW TABLE EXTENDED in v2 #37588
Conversation
cc @MaxGekk
Can one of the admins verify this patch?
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolvePartitionSpec.scala
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolvePartitionSpec.scala
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/v2ResolutionPlans.scala
@panbingkun @MaxGekk Any progress on this? Thanks!
+1
+1
@panbingkun Could you resolve conflicts when you have time, please?
Done
@MaxGekk are you able to take another look at this PR? Would love to see it go in; this is an annoying feature gap that we currently need to work around for DSv2 sources :(
No, I haven't looked at this yet because of my wedding. :-)
@xkrogen If you really need this feature, please review this PR. I will join you a bit later.
```scala
// before
pattern: Option[String]) extends V2CommandExec with LeafExecNode {
// after
pattern: Option[String],
isExtended: Boolean = false,
partitionSpec: Option[TablePartitionSpec] = None) extends V2CommandExec with LeafExecNode {
```
`TablePartitionSpec` is the legacy one; can't you use `ResolvedPartitionSpec`? For example, see spark/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ShowPartitionsExec.scala, line 36 (at 0494dc9):

```scala
partitionSpec: Option[ResolvedPartitionSpec]) extends V2CommandExec with LeafExecNode {
```
Done
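For context on the distinction the review raises: `TablePartitionSpec` is just a raw `Map[String, String]` of unvalidated column names to string values, while `ResolvedPartitionSpec` carries names already normalized against the partition schema and values already cast to the column types. A minimal runnable sketch of that difference (the case class and `resolve` helper below are simplified stand-ins, not Spark's actual API):

```scala
// Legacy v1-style spec: raw column-name -> string-value map, unvalidated.
type TablePartitionSpec = Map[String, String]

// v2-style resolved spec: names normalized against the partition schema,
// values already converted to the column types. (Illustrative shape only;
// Spark's real ResolvedPartitionSpec holds an InternalRow of values.)
case class ResolvedPartitionSpec(names: Seq[String], values: Seq[Any])

// Resolve a legacy spec against a partition schema given as (name, type) pairs.
def resolve(spec: TablePartitionSpec,
            schema: Seq[(String, String)]): ResolvedPartitionSpec = {
  // Keep schema order, drop columns absent from the spec.
  val names = schema.map(_._1).filter(spec.contains)
  // Cast each string value according to the declared column type.
  val values = names.map { n =>
    schema.find(_._1 == n).get._2 match {
      case "int" => spec(n).toInt
      case _     => spec(n)
    }
  }
  ResolvedPartitionSpec(names, values)
}

val resolved = resolve(Map("year" -> "2023", "month" -> "5"),
                       Seq("year" -> "int", "month" -> "int"))
```

In Spark itself this resolution work is what `ResolvePartitionSpec` performs during analysis, which is why later review comments push the exec node to accept the already-resolved form.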
```scala
// "Partition Values"
val partitionSchema = partitionTable.partitionSchema()
val normalizedSpec = normalizePartitionSpec(
```
This one is not needed. The job should be done by `ResolvePartitionSpec.resolvePartitionSpec`, or is there some reason to bypass it?
Yes, the latest version has eliminated the above logic.
```scala
requireExactMatchedPartitionSpec(identifier.toString,
  normalizedSpec, partitionSchema.fieldNames)
```

```scala
val partitionNames = normalizedSpec.keySet
```
Where is it used?
Done
```scala
  convertToPartIdent(normalizedSpec, partitionSchema))
val partitionIdentifiers = partitionTable.listPartitionIdentifiers(names.toArray, ident)
partitionIdentifiers.length match {
  case 0 =>
```
The function `extendedPartition()` is invoked only for a non-empty partition spec, as far as I can see, or not? Is there any test for this case?
```scala
case 0 =>
  throw QueryExecutionErrors.notExistPartitionError(
    identifier.toString, ident, partitionSchema)
case len if (len > 1) =>
```
nit:

```scala
// before
case len if (len > 1) =>
// after
case len if len > 1 =>
```
```scala
var i = 0
while (i < len) {
```
This loop:

```scala
var i = 0
while (i < len) {
  i += 1
}
```

can be simplified to:

```scala
for (i <- 0 until len) {
}
```
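The two loop forms are behaviorally equivalent; a small runnable check, with a stand-in body that records the visited indices (the function names are illustrative only):

```scala
import scala.collection.mutable.ArrayBuffer

// Counter-based while loop, as in the original code.
def viaWhile(len: Int): Seq[Int] = {
  val visited = ArrayBuffer.empty[Int]
  var i = 0
  while (i < len) {
    visited += i
    i += 1
  }
  visited.toSeq
}

// The idiomatic Scala spelling suggested in the review.
def viaFor(len: Int): Seq[Int] = {
  val visited = ArrayBuffer.empty[Int]
  for (i <- 0 until len) {
    visited += i
  }
  visited.toSeq
}
```

Both visit exactly the indices `0` to `len - 1`, so the rewrite is a pure readability change.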
sql/core/src/test/scala/org/apache/spark/sql/execution/command/ShowTablesSuiteBase.scala
```scala
    catalog: String,
    namespace: String,
    table: String): (String, Map[String, String]) = {
  ("_LEGACY_ERROR_TEMP_1251",
```
How hard is it to unify this error between v1 and v2 tables?
- `_LEGACY_ERROR_TEMP_1231` (`QueryCompilationErrors#invalidPartitionColumnKeyInTableError`): there are approximately 13 code points that call it. See spark/common/utils/src/main/resources/error/error-classes.json, lines 4815 to 4819 (at 1359c13):

  ```json
  "_LEGACY_ERROR_TEMP_1231" : {
    "message" : [
      "<key> is not a valid partition column in table <tblName>."
    ]
  },
  ```

- `_LEGACY_ERROR_TEMP_1251` (`QueryCompilationErrors#actionNotAllowedOnTableSincePartitionMetadataNotStoredError`): there are approximately 8 code points that call it. See the same file, lines 4860 to 4864 (at 1359c13):

  ```json
  "_LEGACY_ERROR_TEMP_1251" : {
    "message" : [
      "<action> is not allowed on <tableName> since its partition metadata is not stored in the Hive metastore. To import this information into the metastore, run `msck repair table <tableName>`."
    ]
  },
  ```

Because many scopes are involved, I suggest doing the merging or unification in a separate follow-up PR, to reduce the interference with the logic of this PR. Do you think this is appropriate?
Makes sense, let's do it in a follow-up.
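For readers unfamiliar with the error-classes mechanism discussed above: each entry in error-classes.json is a message template with `<param>` placeholders that get filled in at throw time. A minimal sketch of how such a template could be rendered (Spark's real implementation lives elsewhere; this toy `renderTemplate` is only an illustration, not Spark's API):

```scala
// Substitute every <key> placeholder in the template with its value.
def renderTemplate(template: String, params: Map[String, String]): String =
  params.foldLeft(template) { case (msg, (k, v)) =>
    msg.replace(s"<$k>", v)
  }

// Rendering the _LEGACY_ERROR_TEMP_1251 message quoted above
// (parameter values here are illustrative).
val msg = renderTemplate(
  "<action> is not allowed on <tableName> since its partition metadata " +
    "is not stored in the Hive metastore.",
  Map("action" -> "SHOW TABLE EXTENDED", "tableName" -> "ns.tbl"))
```

Unifying v1 and v2 would mean both code paths throw through the same error class, so tests like the one being reviewed would no longer need per-version expected error names.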
I think it's pretty close now, thanks for your patience!
I will update it today.
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
```scala
val globalTempViews = if (dbName == globalTempViewManager.database) {
  globalTempViewManager.listViewNames(pattern).map { viewName =>
    globalTempViewManager.get(viewName).map(_.tableMeta).getOrElse(
      throw new NoSuchTableException(globalTempViewManager.database, viewName))
```
It's weird to throw this error during listing views. Shall we use `flatMap` and just skip the temp views that were deleted immediately after `globalTempViewManager.listViewNames`?
Okay
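The suggested `flatMap` approach in a self-contained sketch: listing names and then resolving each one can race with concurrent drops, so entries that vanish in between are silently skipped instead of throwing `NoSuchTableException`. A plain concurrent map stands in for the view manager here, and all names are illustrative:

```scala
import scala.collection.concurrent.TrieMap

// Minimal stand-in for a view's catalog metadata.
case class ViewMeta(name: String)

// Stand-in for globalTempViewManager's backing store.
val views = TrieMap("v1" -> ViewMeta("v1"), "v2" -> ViewMeta("v2"))

// flatMap over Option: views deleted between the name listing and the
// lookup simply produce None and drop out, with no exception raised.
def listViewMetadata(names: Seq[String]): Seq[ViewMeta] =
  names.flatMap(n => views.get(n))

// "v3" was in the listed names but dropped before the lookup.
val metas = listViewMetadata(Seq("v1", "v3", "v2"))
```

With `map` + `getOrElse(throw ...)` the same race would abort the whole listing, which is exactly the behavior the review objects to.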
```scala
val localTempViews = listLocalTempViews(pattern).map { viewIndent =>
  tempViews.get(viewIndent.table).map(_.tableMeta).getOrElse(
    throw new NoSuchTableException(viewIndent.database.getOrElse(""), viewIndent.table))
```
ditto
```scala
 */
override def visitShowTableExtended(
    ctx: ShowTableExtendedContext): LogicalPlan = withOrigin(ctx) {
  val partitionKeys = Option(ctx.partitionSpec).map { specCtx =>
    UnresolvedPartitionSpec(visitNonOptionalPartitionSpec(specCtx), None)
```

```scala
@inline def createUnresolvedTable(
```
Not this kind of inline... we can remove this function and put the code where we call it.
Okay, I misunderstood the meaning, haha
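For the record, Scala's `@inline` annotation only hints the optimizer to inline the call at the bytecode level; it does not remove the source-level indirection, which is what the review asks for. A sketch of the manual inlining being suggested, using a hypothetical helper (`createLabel` stands in for the PR's `createUnresolvedTable` and is not its real signature):

```scala
val db = "ns"
val tbl = "people"

// A one-off helper used at a single call site ...
def createLabel(): String = s"$db.$tbl"
val viaHelper = createLabel()

// ... versus simply writing the expression at the call site and
// deleting the helper: the "inline" the review actually means.
val inlinedDirectly = s"$db.$tbl"
```

Both produce the same value; the second form just drops a layer of indirection the reader would otherwise have to chase.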
```scala
val dbName = format(db)
val globalTempViews = if (dbName == globalTempViewManager.database) {
```
```scala
// before
val dbName = format(db)
val globalTempViews = if (dbName == globalTempViewManager.database) {
// after
val globalTempViews = if (format(db) == globalTempViewManager.database) {
```
Done
```scala
if (table.supportsPartitions && table.asPartitionable.partitionSchema().nonEmpty) {
  partitionColumns = table.asPartitionable.partitionSchema()
  results.put("Partition Provider", "Catalog")
  results.put("Partition Columns", table.asPartitionable.partitionSchema().map(
```
```scala
// before
results.put("Partition Columns", table.asPartitionable.partitionSchema().map(
// after
results.put("Partition Columns", partitionColumns.map(
```
Done
unfortunately this has conflicts now...
Done, I have resolved these conflicts.
@cloud-fan If you have time, could you please take a look at this PR? Thank you very much!
thanks, merging to master!
Thanks for all the reviewing and great help again @cloud-fan @MaxGekk @beliefer @LuciferYang ❤️❤️❤️
What changes were proposed in this pull request?
This PR aims to implement v2 SHOW TABLE EXTENDED as `ShowTableExec`.
Why are the changes needed?
To have feature parity with datasource v1.
Does this PR introduce any user-facing change?
Yes, SHOW TABLE EXTENDED is now supported in v2.
How was this patch tested?
Added new unit tests.
Also verified by running the unified tests for the v2 implementation.
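For illustration, the kind of statement this PR enables against a v2 catalog (the catalog, namespace, table names, and partition column below are assumptions for the example, not taken from the PR):

```sql
-- List matching tables with extended info from a DSv2 catalog
SHOW TABLE EXTENDED IN testcat.ns LIKE 'tbl*';

-- Extended info for a single partition of a partitioned v2 table
SHOW TABLE EXTENDED IN testcat.ns LIKE 'tbl' PARTITION (id = 1);
```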