Skip to content

[SPARK-27747][SQL] add a logical plan link in the physical plan #24626

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

It's pretty useful if we can convert a physical plan back to a logical plan, e.g., in #24389

This PR introduces a new feature to TreeNode, which allows TreeNode to carry some extra information via a mutable map, and keep the information when it's copied.

The planner leverages this feature to put the logical plan into the physical plan.

How was this patch tested?

a test suite that runs all TPCDS queries and checks that some common physical plans contain the corresponding logical plans.

@cloud-fan
Copy link
Contributor Author

cc @pengbo this is a more general approach than #24389

also cc @maryannxue @bogdanrdc @gatorsmile

@@ -620,4 +620,8 @@ class TreeNodeSuite extends SparkFunSuite with SQLHelper {
assert(planString.startsWith("Truncated plan of"))
}
}

test("tags will be carried over after copy") {

This comment was marked as resolved.

This comment was marked as resolved.

@SparkQA
Copy link

SparkQA commented May 16, 2019

Test build #105456 has finished for PR 24626 at commit d57ecc1.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class TreeNodeTagName(name: String)


// The exchange related nodes are created after the planning, they don't have corresponding
// logical plan.
case _: ShuffleExchangeExec | _: BroadcastExchangeExec | _: ReusedExchangeExec =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add ReusedSubqueryExec?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added below

@@ -280,9 +297,15 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
rule.applyOrElse(this, identity[BaseType])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here after applying the rule, we need to carry over the tags, if needed, too.

Since carrying the tags in both if and else branches, so maybe:

val newNode = if (this fastEquals afterRuleOnChildren) {
  CurrentOrigin.withOrigin(origin) {
    rule.applyOrElse(this, identity[BaseType])
  }
} else {
  CurrentOrigin.withOrigin(origin) {
    rule.applyOrElse(afterRuleOnChildren, identity[BaseType])
  }
}

// carrying over the tags to newNode...

/**
* A mutable map for holding auxiliary information of this tree node. It will be carried over
* when this node is copied via `makeCopy`. If a user copies the tree node via other ways like the
* `copy` method, it's his responsibility to carry over the tags.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add something like The tags will be kept after transforming, if transformed to the same type, otherwise, will be dropped.

@@ -271,6 +271,8 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]] extends TreeNode[PlanT
}

object QueryPlan extends PredicateHelper {
val LOGICAL_PLAN_TAG_NAME = TreeNodeTagName("logical_plan")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add some comment? For example, After query plans go through SQL planner, the planner will attach the optimized logical plan to the generated physical plan under this tag name.

case ReturnAnswer(rootPlan) => rootPlan
case _ => plan
}
p.tags += QueryPlan.LOGICAL_PLAN_TAG_NAME -> logicalPlan
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the most top physical plan has the link to logical plan, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern is called stackable traits in Scala. When super.plan is called, its recursive plan call points to this overridden version.

You can also see from the tests that, not only the root node has the logical plan link.

@SparkQA
Copy link

SparkQA commented May 17, 2019

Test build #105482 has finished for PR 24626 at commit 9f377d7.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member

viirya commented May 17, 2019

retest this please.

@SparkQA
Copy link

SparkQA commented May 17, 2019

Test build #105485 has finished for PR 24626 at commit 9f377d7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 17, 2019

Test build #105497 has finished for PR 24626 at commit dafad5c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@pengbo
Copy link
Contributor

pengbo commented May 18, 2019

After changing to always carry over the tags in transform, it seems that the unit test has to be adjusted as well. I have created a PR cloud-fan#13 if that helps.
Can you please inform the motivation of that change?
Thanks

@maryannxue
Copy link
Contributor

LGTM, pending Jenkins.

@SparkQA
Copy link

SparkQA commented May 20, 2019

Test build #105548 has finished for PR 24626 at commit b033f55.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 20, 2019

Test build #105549 has finished for PR 24626 at commit b380f1d.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@pengbo
Copy link
Contributor

pengbo commented May 20, 2019

retest this please.

@cloud-fan
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented May 20, 2019

Test build #105572 has finished for PR 24626 at commit b380f1d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

LGTM

Thanks! Merged to master.

j-baker pushed a commit to palantir/spark that referenced this pull request Jan 25, 2020
It's pretty useful if we can convert a physical plan back to a logical plan, e.g., in apache#24389

This PR introduces a new feature to `TreeNode`, which allows `TreeNode` to carry some extra information via a mutable map, and keep the information when it's copied.

The planner leverages this feature to put the logical plan into the physical plan.

a test suite that runs all TPCDS queries and checks that some common physical plans contain the corresponding logical plans.

Closes apache#24626 from cloud-fan/link.

Lead-authored-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Peng Bo <bo.peng1019@gmail.com>
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
cloud-fan added a commit to cloud-fan/spark that referenced this pull request Jan 31, 2020
It's pretty useful if we can convert a physical plan back to a logical plan, e.g., in apache#24389

This PR introduces a new feature to `TreeNode`, which allows `TreeNode` to carry some extra information via a mutable map, and keep the information when it's copied.

The planner leverages this feature to put the logical plan into the physical plan.

a test suite that runs all TPCDS queries and checks that some common physical plans contain the corresponding logical plans.

Closes apache#24626 from cloud-fan/link.

Lead-authored-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Peng Bo <bo.peng1019@gmail.com>
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants