-
Notifications
You must be signed in to change notification settings - Fork 28.5k
[SPARK-7232] [SQL] Add a Substitution batch for spark sql analyzer #5776
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Merged build triggered. |
Merged build started. |
Test build #31275 has started for PR 5776 at commit |
Test build #31275 timed out for PR 5776 at commit |
Merged build finished. Test FAILed. |
Test FAILed. |
Retest this please |
Merged build triggered. |
Merged build started. |
Test build #31286 has started for PR 5776 at commit |
Test build #31286 has finished for PR 5776 at commit
|
Merged build finished. Test PASSed. |
Test PASSed. |
/cc @marmbrus |
*/ | ||
object CTESubstitution extends Rule[LogicalPlan] { | ||
def apply(plan: LogicalPlan): LogicalPlan = { | ||
val (realPlan, cteRelations) = plan match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need to first get realPlan
and creRelations
? Can we just use the following?
def apply(plan: LogicalPlan): LogicalPlan = plan match {
case With(child, relations) => substituteCTE(child, relations)
case other => other
}
Thanks for working on it. I have left a few minor comments. |
@yhuai ,thanks, updated! |
Merged build triggered. |
Merged build started. |
Test build #31379 has started for PR 5776 at commit |
withAlias.getOrElse(relation) | ||
} | ||
substituted.getOrElse(u) | ||
i.copy(table = relation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i.copy(table = substituted.getOrElse(u))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, my bad, thanks
Test build #31379 has finished for PR 5776 at commit
|
Test build #774 has started for PR 5776 at commit |
Test build #774 has finished for PR 5776 at commit
|
Retest this please |
Merged build triggered. |
Merged build started. |
Test build #32101 has started for PR 5776 at commit |
Test build #32101 has finished for PR 5776 at commit
|
Merged build finished. Test PASSed. |
Test PASSed. |
|
||
def substituteCTE(plan: LogicalPlan, cteRelations: Map[String, LogicalPlan]): LogicalPlan = { | ||
plan transform { | ||
case i @ InsertIntoTable(u: UnresolvedRelation, _, _, _, _) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel this rule does nothing.... We are inserting to a table and this table is actually a CTE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel we should get rid of this rule. I do not think this one actually works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, i will delete this case
@scwf Can you also move the the rule for handling window clause (https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L668-680) to this batch? |
Merged build triggered. |
Merged build started. |
Test build #32183 has started for PR 5776 at commit |
Test build #32183 has finished for PR 5776 at commit
|
Merged build finished. Test PASSed. |
Test PASSed. |
Updated done /cc @yhuai |
Thanks! Merging to master and branch 1.4. |
Added a new batch named `Substitution` before `Resolution` batch. The motivation for this is there are kind of cases we want to do some substitution on the parsed logical plan before resolve it. Consider this two cases: 1 CTE, for cte we first build a row logical plan ``` 'With Map(q1 -> 'Subquery q1 'Project ['key] 'UnresolvedRelation [src], None) 'Project [*] 'Filter ('key = 5) 'UnresolvedRelation [q1], None ``` In `With` logicalplan here is a map stored the (`q1-> subquery`), we want first take off the with command and substitute the `q1` of `UnresolvedRelation` by the `subquery` 2 Another example is Window function, in window function user may define some windows, we also need substitute the window name of child by the concrete window. this should also done in the Substitution batch. Author: wangfei <wangfei1@huawei.com> Closes #5776 from scwf/addbatch and squashes the following commits: d4b962f [wangfei] added WindowsSubstitution 70f6932 [wangfei] Merge branch 'master' of https://github.com/apache/spark into addbatch ecaeafb [wangfei] address yhuai's comments 553005a [wangfei] fix test case 0c54798 [wangfei] address comments 29aaaaf [wangfei] fix compile 1c9a092 [wangfei] added Substitution bastch (cherry picked from commit f496bf3) Signed-off-by: Yin Huai <yhuai@databricks.com>
Added a new batch named `Substitution` before `Resolution` batch. The motivation for this is there are kind of cases we want to do some substitution on the parsed logical plan before resolve it. Consider this two cases: 1 CTE, for cte we first build a row logical plan ``` 'With Map(q1 -> 'Subquery q1 'Project ['key] 'UnresolvedRelation [src], None) 'Project [*] 'Filter ('key = 5) 'UnresolvedRelation [q1], None ``` In `With` logicalplan here is a map stored the (`q1-> subquery`), we want first take off the with command and substitute the `q1` of `UnresolvedRelation` by the `subquery` 2 Another example is Window function, in window function user may define some windows, we also need substitute the window name of child by the concrete window. this should also done in the Substitution batch. Author: wangfei <wangfei1@huawei.com> Closes apache#5776 from scwf/addbatch and squashes the following commits: d4b962f [wangfei] added WindowsSubstitution 70f6932 [wangfei] Merge branch 'master' of https://github.com/apache/spark into addbatch ecaeafb [wangfei] address yhuai's comments 553005a [wangfei] fix test case 0c54798 [wangfei] address comments 29aaaaf [wangfei] fix compile 1c9a092 [wangfei] added Substitution bastch
Added a new batch named `Substitution` before `Resolution` batch. The motivation for this is there are kind of cases we want to do some substitution on the parsed logical plan before resolve it. Consider this two cases: 1 CTE, for cte we first build a row logical plan ``` 'With Map(q1 -> 'Subquery q1 'Project ['key] 'UnresolvedRelation [src], None) 'Project [*] 'Filter ('key = 5) 'UnresolvedRelation [q1], None ``` In `With` logicalplan here is a map stored the (`q1-> subquery`), we want first take off the with command and substitute the `q1` of `UnresolvedRelation` by the `subquery` 2 Another example is Window function, in window function user may define some windows, we also need substitute the window name of child by the concrete window. this should also done in the Substitution batch. Author: wangfei <wangfei1@huawei.com> Closes apache#5776 from scwf/addbatch and squashes the following commits: d4b962f [wangfei] added WindowsSubstitution 70f6932 [wangfei] Merge branch 'master' of https://github.com/apache/spark into addbatch ecaeafb [wangfei] address yhuai's comments 553005a [wangfei] fix test case 0c54798 [wangfei] address comments 29aaaaf [wangfei] fix compile 1c9a092 [wangfei] added Substitution bastch
Added a new batch named `Substitution` before `Resolution` batch. The motivation for this is there are kind of cases we want to do some substitution on the parsed logical plan before resolve it. Consider this two cases: 1 CTE, for cte we first build a row logical plan ``` 'With Map(q1 -> 'Subquery q1 'Project ['key] 'UnresolvedRelation [src], None) 'Project [*] 'Filter ('key = 5) 'UnresolvedRelation [q1], None ``` In `With` logicalplan here is a map stored the (`q1-> subquery`), we want first take off the with command and substitute the `q1` of `UnresolvedRelation` by the `subquery` 2 Another example is Window function, in window function user may define some windows, we also need substitute the window name of child by the concrete window. this should also done in the Substitution batch. Author: wangfei <wangfei1@huawei.com> Closes apache#5776 from scwf/addbatch and squashes the following commits: d4b962f [wangfei] added WindowsSubstitution 70f6932 [wangfei] Merge branch 'master' of https://github.com/apache/spark into addbatch ecaeafb [wangfei] address yhuai's comments 553005a [wangfei] fix test case 0c54798 [wangfei] address comments 29aaaaf [wangfei] fix compile 1c9a092 [wangfei] added Substitution bastch
Added a new batch named
Substitution
beforeResolution
batch. The motivation for this is there are kind of cases we want to do some substitution on the parsed logical plan before resolve it.Consider this two cases:
1 CTE, for cte we first build a row logical plan
In
With
logicalplan here is a map stored the (q1-> subquery
), we want first take off the with command and substitute theq1
ofUnresolvedRelation
by thesubquery
2 Another example is Window function, in window function user may define some windows, we also need substitute the window name of child by the concrete window. this should also done in the Substitution batch.