-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-23079][SQL]Fix query constraints propagation with aliases #20278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Test build #86181 has started for PR 20278 at commit |
retest this please |
Test build #86216 has finished for PR 20278 at commit
|
@@ -160,64 +161,6 @@ class InferFiltersFromConstraintsSuite extends PlanTest { | |||
comparePlans(optimized, correctAnswer) | |||
} | |||
|
|||
test("inner join with alias: don't generate constraints for recursive functions") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These test cases are invalid after the code changes, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
@@ -34,6 +34,7 @@ class InferFiltersFromConstraintsSuite extends PlanTest { | |||
PushDownPredicate, | |||
InferFiltersFromConstraints, | |||
CombineFilters, | |||
SimplifyBinaryComparison, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this rule, the test cases will fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is just different from the expected result because of this line:
https://github.com/gengliangwang/spark/blob/c02d9b4bdccafcdf4008dcdd4c2ad9509c9acd96/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala#L258
These predicates can be simplified by SimplifyBinaryComparison
. So no need to modified the expected test results .
LGTM, merging to master/2.3! |
## What changes were proposed in this pull request? Previously, PR #19201 fix the problem of non-converging constraints. After that PR #19149 improve the loop and constraints is inferred only once. So the problem of non-converging constraints is gone. However, the case below will fail. ``` spark.range(5).write.saveAsTable("t") val t = spark.read.table("t") val left = t.withColumn("xid", $"id" + lit(1)).as("x") val right = t.withColumnRenamed("id", "xid").as("y") val df = left.join(right, "xid").filter("id = 3").toDF() checkAnswer(df, Row(4, 3)) ``` Because `aliasMap` replace all the aliased child. See the test case in PR for details. This PR is to fix this bug by removing useless code for preventing non-converging constraints. It can be also fixed with #20270, but this is much simpler and clean up the code. ## How was this patch tested? Unit test Author: Wang Gengliang <ltnwgl@gmail.com> Closes #20278 from gengliangwang/FixConstraintSimple. (cherry picked from commit 8598a98) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
Previously, PR #19201 fix the problem of non-converging constraints.
After that PR #19149 improve the loop and constraints is inferred only once.
So the problem of non-converging constraints is gone.
However, the case below will fail.
Because
aliasMap
replace all the aliased child. See the test case in PR for details.This PR is to fix this bug by removing useless code for preventing non-converging constraints.
It can be also fixed with #20270, but this is much simpler and clean up the code.
How was this patch tested?
Unit test