Skip to content

[SPARK-33393][SQL] Support SHOW TABLE EXTENDED in v2 #37588

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 44 commits into from

Conversation

panbingkun
Copy link
Contributor

@panbingkun panbingkun commented Aug 20, 2022

What changes were proposed in this pull request?

The pr aim to implement v2 SHOW TABLE EXTENDED as ShowTableExec

Why are the changes needed?

To have feature parity with the datasource V1.

Does this PR introduce any user-facing change?

Yes, Support SHOW TABLE EXTENDED in v2.

How was this patch tested?

Add new UT.
By running the unified tests for v2 implementation:

$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *ShowTablesSuite"
$ build/sbt "test:testOnly *ShowTablesSuite"

@github-actions github-actions bot added the SQL label Aug 20, 2022
@panbingkun panbingkun changed the title [SPARK-33393][SQL] Support SHOW TABLE EXTENDED in v2 [Don't Review][TEST][SPARK-33393][SQL] Support SHOW TABLE EXTENDED in v2 Aug 20, 2022
@panbingkun panbingkun changed the title [Don't Review][TEST][SPARK-33393][SQL] Support SHOW TABLE EXTENDED in v2 [SPARK-33393][SQL] Support SHOW TABLE EXTENDED in v2 Aug 20, 2022
@panbingkun
Copy link
Contributor Author

cc @MaxGekk

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@panbingkun panbingkun requested a review from MaxGekk August 28, 2022 10:57
@Fokko
Copy link
Contributor

Fokko commented Nov 28, 2022

@panbingkun @MaxGekk Any progress on this? Thanks!

@ja-michel
Copy link

+1

1 similar comment
@rshanmugam1
Copy link

+1

@MaxGekk
Copy link
Member

MaxGekk commented Feb 1, 2023

@panbingkun Could you resolve conflicts when you have time, please.

@panbingkun
Copy link
Contributor Author

@panbingkun Could you resolve conflicts when you have time, please.

Done

@xkrogen
Copy link
Contributor

xkrogen commented Feb 13, 2023

@MaxGekk are you able to take another look at this PR? Would love to see it go in, this is an annoying feature gap that we need to work around currently for DSv2 sources :(

@MaxGekk
Copy link
Member

MaxGekk commented Feb 13, 2023

are you able to take another look at this PR?

No, I haven't looked at this yet because of my wedding. :-)

this is an annoying feature gap that we need to work around currently for DSv2 sources :(

@xkrogen If you really need this feature, please, review this PR. I will join to you slightly later.

pattern: Option[String]) extends V2CommandExec with LeafExecNode {
pattern: Option[String],
isExtended: Boolean = false,
partitionSpec: Option[TablePartitionSpec] = None) extends V2CommandExec with LeafExecNode {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TablePartitionSpec is legacy one, can't you use ResolvedPartitionSpec? For example, see

partitionSpec: Option[ResolvedPartitionSpec]) extends V2CommandExec with LeafExecNode {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


// "Partition Values"
val partitionSchema = partitionTable.partitionSchema()
val normalizedSpec = normalizePartitionSpec(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is not needed. The job should be done by ResolvePartitionSpec.resolvePartitionSpec, or there is some reason to bypass it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the latest version has eliminated the above logic.

requireExactMatchedPartitionSpec(identifier.toString,
normalizedSpec, partitionSchema.fieldNames)

val partitionNames = normalizedSpec.keySet
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is it used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

convertToPartIdent(normalizedSpec, partitionSchema))
val partitionIdentifiers = partitionTable.listPartitionIdentifiers(names.toArray, ident)
partitionIdentifiers.length match {
case 0 =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The functions extendedPartition() is invoked only for non-empty partition spec as I can see, or not? Is there any test for this case?

case 0 =>
throw QueryExecutionErrors.notExistPartitionError(
identifier.toString, ident, partitionSchema)
case len if (len > 1) =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
case len if (len > 1) =>
case len if len > 1 =>

Comment on lines 171 to 172
var i = 0
while (i < len) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop:

    var i = 0
    while (i < len) {

      i += 1
    }

can be simplified by:

    for (i <- 0 until len) {

    }

@panbingkun panbingkun requested a review from MaxGekk February 20, 2023 11:45
catalog: String,
namespace: String,
table: String): (String, Map[String, String]) = {
("_LEGACY_ERROR_TEMP_1251",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how hard it is to unify this error between v1 and v2 tables?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • _LEGACY_ERROR_TEMP_1231(QueryCompilationErrors#invalidPartitionColumnKeyInTableError), there are approximately 13 code point to call it:
    image

    "_LEGACY_ERROR_TEMP_1231" : {
    "message" : [
    "<key> is not a valid partition column in table <tblName>."
    ]
    },

  • _LEGACY_ERROR_TEMP_1251(QueryCompilationErrors#actionNotAllowedOnTableSincePartitionMetadataNotStoredError), there are approximately 8 code point to call it:
    image

    "_LEGACY_ERROR_TEMP_1251" : {
    "message" : [
    "<action> is not allowed on <tableName> since its partition metadata is not stored in the Hive metastore. To import this information into the metastore, run `msck repair table <tableName>`."
    ]
    },

Because there are many scopes involved, in order to reduce the interference of logic on this PR, I suggest doing the merging or unification in a new separate PR.
Do you think this is appropriate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make senses, let's do it in followup

Copy link
Contributor

@cloud-fan cloud-fan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's pretty close now, thanks for your patience!

@panbingkun
Copy link
Contributor Author

panbingkun commented Nov 2, 2023

I think it's pretty close now, thanks for your patience!

I will update it today.
Thank you very much for your patience and carefulness, which has given me great help! ❤️

@panbingkun panbingkun requested a review from cloud-fan November 3, 2023 13:12
val globalTempViews = if (dbName == globalTempViewManager.database) {
globalTempViewManager.listViewNames(pattern).map { viewName =>
globalTempViewManager.get(viewName).map(_.tableMeta).getOrElse(
throw new NoSuchTableException(globalTempViewManager.database, viewName))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's weird to throw this error during listing views. Shall we use flatMap and just skip the temp views that were deleted immediately after globalTempViewManager.listViewNames?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay


val localTempViews = listLocalTempViews(pattern).map { viewIndent =>
tempViews.get(viewIndent.table).map(_.tableMeta).getOrElse(
throw new NoSuchTableException(viewIndent.database.getOrElse(""), viewIndent.table))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

*/
override def visitShowTableExtended(
ctx: ShowTableExtendedContext): LogicalPlan = withOrigin(ctx) {
val partitionKeys = Option(ctx.partitionSpec).map { specCtx =>
UnresolvedPartitionSpec(visitNonOptionalPartitionSpec(specCtx), None)
@inline def createUnresolvedTable(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not this kind of inline... we can remove this function and put the code in where we call the function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay,I misunderstood the meaning, haha

@github-actions github-actions bot removed the BUILD label Nov 6, 2023
@panbingkun panbingkun requested a review from cloud-fan November 6, 2023 12:02
Comment on lines 1097 to 1098
val dbName = format(db)
val globalTempViews = if (dbName == globalTempViewManager.database) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
val dbName = format(db)
val globalTempViews = if (dbName == globalTempViewManager.database) {
val globalTempViews = if (format(db) == globalTempViewManager.database) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if (table.supportsPartitions && table.asPartitionable.partitionSchema().nonEmpty) {
partitionColumns = table.asPartitionable.partitionSchema()
results.put("Partition Provider", "Catalog")
results.put("Partition Columns", table.asPartitionable.partitionSchema().map(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
results.put("Partition Columns", table.asPartitionable.partitionSchema().map(
results.put("Partition Columns", partitionColumns.map(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@cloud-fan
Copy link
Contributor

unfortunately this has conflicts now...

@panbingkun
Copy link
Contributor Author

unfortunately this has conflicts now...

Done, I have resolved these conflicts.

@panbingkun
Copy link
Contributor Author

panbingkun commented Nov 16, 2023

@cloud-fan If you have time, could you please take a look at this PR? Thank you very much!

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in c91dbde Nov 16, 2023
@panbingkun
Copy link
Contributor Author

panbingkun commented Nov 16, 2023

thanks, merging to master!

Thank for all reviewing and great help again @cloud-fan @MaxGekk @beliefer @LuciferYang ❤️❤️❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants