[SPARK-38132][SQL] Remove NotPropagation rule#35428
[SPARK-38132][SQL] Remove NotPropagation rule#35428kazuyukitanimura wants to merge 6 commits intoapache:masterfrom
NotPropagation rule#35428Conversation
|
For unblocking #35395 |
|
cc @sunchao too since he was on the original PR. |
There was a problem hiding this comment.
BTW, @kazuyukitanimura .
- Instead of a pure removal, we need a new test case to prevent future regressions like this.
- If possible, make a new JIRA ID for this PR because the original one is 3 months ago already (although it's not released yet)
viirya
left a comment
There was a problem hiding this comment.
It's okay to remove it completely. I agree with @dongjoon-hyun that we might still need the test case to guard the regression.
|
Thank you @dongjoon-hyun @viirya for the feedback. Added NotInSubqueryEndToEndSuite.scala |
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Thank you for updates, @kazuyukitanimura .
sunchao
left a comment
There was a problem hiding this comment.
Looks reasonable. Maybe we can come back to revisit this when we find more practical use cases for this optimization.
|
|
||
| val t = "test_table" | ||
|
|
||
| test("SPARK-38132: Avoid Optimizing Not(InSubquery)") { |
There was a problem hiding this comment.
nit: "Avoid Optimizing Not(InSubquery)" -> "Avoid optimizing Not IN subquery"?
|
|
||
| import org.apache.spark.sql.test.SharedSparkSession | ||
|
|
||
| class NotInSubqueryEndToEndSuite extends QueryTest with SharedSparkSession { |
There was a problem hiding this comment.
Hmm, I think we already have test suite e.g. SubquerySuite. Can we just put the test into existing test suite?
| assert(!nonDeterministicQueryPlan.deterministic) | ||
| } | ||
|
|
||
| test("SPARK-38132: Avoid optimizing Not IN subquery") { |
There was a problem hiding this comment.
Shall we give a more meaningful name for this test case?
For me, the test case looks like checking the correctness of the queries which is irrelevant to avoid optimizing something.
There was a problem hiding this comment.
Thanks, renamed
NotPropagation rule
dongjoon-hyun
left a comment
There was a problem hiding this comment.
+1, LGTM. Thank you, @kazuyukitanimura , @viirya , @sunchao
Merged to master.
|
Thanks for pinging me. Just checking for my own understanding - so is this technically a revert of SPARK-36665? |
|
Late LGTM. It seems to me that this rule won't make a big deal to the overall performance, so it's better to make it very simple, otherwise it's not worthwhile. |
@HyukjinKwon It is actually not a full revert. SPARK-36665 introduced two classes(objects), and reverted only one of them |
What changes were proposed in this pull request?
This is a follow-up PR to mitigate the bug introduced by SPARK-36665. This PR removes
NotPropagationoptimization for now until we find a better approach.Why are the changes needed?
NotPropagationoptimization previously brokeRewritePredicateSubqueryso that it does not properly rewrite the predicate to a NULL-aware left anti join anymore.Does this PR introduce any user-facing change?
No
How was this patch tested?
Existing tests