-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-4937][SQL] Normalizes conjunctions and disjunctions to eliminate common predicates #3784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #24762 has started for PR 3784 at commit
|
|
Test build #24762 has finished for PR 3784 at commit
|
|
Test FAILed. |
|
Test build #24763 has started for PR 3784 at commit
|
|
Test build #24764 has started for PR 3784 at commit
|
|
Test build #24763 has finished for PR 3784 at commit
|
|
Test FAILed. |
|
Test build #24764 has finished for PR 3784 at commit
|
|
Test FAILed. |
|
Test build #24776 has started for PR 3784 at commit
|
3cf7937 to
0e51101
Compare
|
Test build #24777 has started for PR 3784 at commit
|
0e51101 to
4ab3a58
Compare
|
Test build #24779 has started for PR 3784 at commit
|
|
Test build #24776 has finished for PR 3784 at commit
|
|
Test PASSed. |
|
Test build #24777 has finished for PR 3784 at commit
|
|
Test PASSed. |
|
Test build #24779 has finished for PR 3784 at commit
|
|
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f.transformExpressionUp instead? And this rule also can be applied to the normal expressions, not just only for expression in Filter. Move this into BooleanSimplification?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, f.transformExpressionUp is really a good idea! Moving to BooleanSimplification also makes sense. Thanks for the suggestions :)
|
This is really helpful optimization, and the implementation makes sense to me, and it would be great if we apply the rules for the those non |
|
Test build #24815 has started for PR 3784 at commit
|
00aaa63 to
f4d9b8f
Compare
|
Test build #24816 has started for PR 3784 at commit
|
f4d9b8f to
601d5f6
Compare
|
Test build #24817 has started for PR 3784 at commit
|
|
Test build #24815 has finished for PR 3784 at commit
|
|
Test PASSed. |
|
Test build #24816 has finished for PR 3784 at commit
|
|
Test PASSed. |
|
Test build #24817 has finished for PR 3784 at commit
|
|
Test PASSed. |
|
Hi @liancheng, i admit my PR is more complicated, but this only cover three cases, i think we'd better adding a separate rule to optimize And/Or in sql for as many as possible cases, not mix several cases in BooleanSimplification. So I am refactorying my PR to make it more readable and clean. |
601d5f6 to
caca560
Compare
|
Test build #24821 has started for PR 3784 at commit
|
|
Test build #24821 has finished for PR 3784 at commit
|
|
Test PASSed. |
|
Hi, @liancheng, my PR originally also not limited to Filter, i used |
|
Thanks! I've merged this to master. |
This PR is a simplified version of several filter optimization rules introduced in #3778 authored by @scwf. Newly introduced optimizations include:
a && a=>aa || a=>a(a || b || c || ...) && (a || b || d || ...)=>a && b && (c || d || ...)The 3rd rule is particularly useful for optimizing the following query, which is planned into a cartesian product
to the following one, which is planned into an equi-join:
The example above is quite artificial, but common predicates are likely to appear in real life complex queries (like the one mentioned in #3778).
A difference between this PR and #3778 is that these optimizations are not limited to
Filter, but are generalized to all logical plan nodes. Thanks to @scwf for bringing up these optimizations, and @chenghao-intel for the generalization suggestion.