-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-5642] [SQL] Apply column pruning on unused aggregation fields #4415
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Test build #26889 has started for PR 4415 at commit
|
Test build #26889 has finished for PR 4415 at commit
|
Test FAILed. |
Test build #26898 has started for PR 4415 at commit
|
Test build #26898 has finished for PR 4415 at commit
|
Test PASSed. |
Is it possible to add a unit test? |
(I understand unit test coverage for the optimizer is pretty low - but that would be great to change & increase) |
Test build #27080 has started for PR 4415 at commit
|
Test build #27080 has finished for PR 4415 at commit
|
Test PASSed. |
@@ -116,6 +116,10 @@ object ColumnPruning extends Rule[LogicalPlan] { | |||
case a @ Aggregate(_, _, child) if (child.outputSet -- a.references).nonEmpty => | |||
a.copy(child = Project(a.references.toSeq, child)) | |||
|
|||
case p @ Project(pl, Aggregate(ge, ec, child)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd follow the convention from the rest of the optimizer and use longer variable names (i.e. projectList
, groupExprs
, aggExprs
)
Test build #27163 has started for PR 4415 at commit
|
2cd88ff
to
2916ec9
Compare
Test build #27165 has started for PR 4415 at commit
|
Test build #27163 has finished for PR 4415 at commit
|
Test PASSed. |
Test build #27165 has finished for PR 4415 at commit
|
Test PASSed. |
2916ec9
to
5d2d8a3
Compare
Test build #27430 has started for PR 4415 at commit
|
Test build #27430 has finished for PR 4415 at commit
|
Test PASSed. |
Thanks! I'm going to go ahead and add one more test case while merging since I wasn't sure if this case was handled properly (it is :) ) Merging to master and 1.3 |
test("column pruning for group with alias") {
val originalQuery =
testRelation
.groupBy('a)('a as 'c, Count('b))
.select('c)
val optimized = Optimize(originalQuery.analyze)
val correctAnswer =
testRelation
.select('a)
.groupBy('a)('a as 'c)
.select('c).analyze
comparePlans(optimized, correctAnswer)
} |
select k from (select key k, max(value) v from src group by k) t Author: Daoyuan Wang <daoyuan.wang@intel.com> Author: Michael Armbrust <michael@databricks.com> Closes #4415 from adrian-wang/groupprune and squashes the following commits: 5d2d8a3 [Daoyuan Wang] address Michael's comments 61f8ef7 [Daoyuan Wang] add a unit test 80ddcc6 [Daoyuan Wang] keep project b69d385 [Daoyuan Wang] add a prune rule for grouping set (cherry picked from commit 2cbb3e4) Signed-off-by: Michael Armbrust <michael@databricks.com>
select k from (select key k, max(value) v from src group by k) t