From d90b47918504bf9cec99350b68f62aee9686fa70 Mon Sep 17 00:00:00 2001
From: ulysses-you
Date: Wed, 22 Sep 2021 17:25:56 +0800
Subject: [PATCH] [SPARK-36822][SQL] BroadcastNestedLoopJoinExec should use all condition instead of non-equi condition

### What changes were proposed in this pull request?

Pass the full join condition instead of `nonEquiCond` to `BroadcastNestedLoopJoinExec` in the `ExtractEquiJoinKeys` case of `JoinSelection`.

### Why are the changes needed?

In `JoinSelection`, the `ExtractEquiJoinKeys` case uses `nonEquiCond` as the join condition for `BroadcastNestedLoopJoinExec`. That is wrong, because this pattern only matches when the join also has equi conditions, and those are dropped:

```
Seq(joins.BroadcastNestedLoopJoinExec(
  planLater(left), planLater(right), buildSide, joinType, nonEquiCond))
```

This should not be a bug in practice, though, since we always use sort-merge join (SMJ) as the default join strategy for `ExtractEquiJoinKeys`.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

It is not a bug for now, but we should fix it in case this code path is used in the future.

Closes #34065 from ulysses-you/join-condition.

Authored-by: ulysses-you
Signed-off-by: Wenchen Fan
---
 .../scala/org/apache/spark/sql/execution/SparkStrategies.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
index 307a2bc57cb6f..cdecda4f9e256 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
@@ -262,7 +262,7 @@ abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
               // This join could be very slow or OOM
               val buildSide = getSmallerSide(left, right)
               Seq(joins.BroadcastNestedLoopJoinExec(
-                planLater(left), planLater(right), buildSide, joinType, nonEquiCond))
+                planLater(left), planLater(right), buildSide, joinType, j.condition))
             }
         }
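
For illustration only (not part of the patch): a toy, Spark-free sketch of why handing a nested loop join only the non-equi part of the condition would give wrong results. All names here (`Row`, `nestedLoopJoin`, `equiCond`, `fullCond`) are hypothetical. They only loosely mirror the roles of the extracted join keys, `nonEquiCond`, and `j.condition`, and are not Spark APIs.

```scala
// Hypothetical toy model of a nested loop join; none of this is Spark code.
object NestedLoopJoinSketch {
  final case class Row(id: Int, value: Int)

  // Naive nested loop join: evaluates `condition` on every (left, right) pair.
  // There are no join keys here, so `condition` is the only filter applied.
  def nestedLoopJoin(left: Seq[Row], right: Seq[Row])(
      condition: (Row, Row) => Boolean): Seq[(Row, Row)] =
    for (l <- left; r <- right if condition(l, r)) yield (l, r)

  val left: Seq[Row] = Seq(Row(1, 10), Row(2, 20))
  val right: Seq[Row] = Seq(Row(1, 5), Row(2, 25))

  // Stand-in for the equi part (what would become the join keys).
  val equiCond: (Row, Row) => Boolean = (l, r) => l.id == r.id
  // Stand-in for the remaining non-equi predicate.
  val nonEquiCond: (Row, Row) => Boolean = (l, r) => l.value > r.value
  // Stand-in for the full join condition: equi AND non-equi.
  val fullCond: (Row, Row) => Boolean = (l, r) => equiCond(l, r) && nonEquiCond(l, r)

  def main(args: Array[String]): Unit = {
    // Full condition: only pairs with equal ids can match.
    println(s"full condition: ${nestedLoopJoin(left, right)(fullCond)}")
    // Non-equi only: the id equality is silently dropped.
    println(s"non-equi only:  ${nestedLoopJoin(left, right)(nonEquiCond)}")
  }
}
```

With these sample rows, the full condition keeps one pair, while the non-equi-only condition keeps two, one of which joins rows with different ids.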