NestedLoopJoin Projection Pushdown #14120
Signed-off-by: Jay Zhan <jay.zhan@synnada.ai>
berkaysynnada left a comment
Thanks @jayzhan-synnada. This PR is a clear improvement, but I have a suggestion to optimize projections further. It would be even better if it also handled the cases mentioned in the inline comments.
    try_swapping_with_cross_join(projection, cross_join)?
} else if let Some(nl_join) = input.downcast_ref::<NestedLoopJoinExec>() {
    try_swapping_with_nested_loop_join(projection, nl_join)?
    try_pushdown_through_nested_loop_join(projection, nl_join)?.map_or_else(
I know the same pattern exists in the HashJoin part too, but would it take much effort to unify this "first try pushdown; if that is not possible, then embed" approach? Something like "embed and pushdown", applying whichever is possible.
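For readers unfamiliar with the pattern being discussed, here is a minimal sketch of the "first try pushdown, otherwise embed" fallback. It is not the DataFusion code: `Rewrite`, `try_pushdown`, `try_embed`, `optimize`, and the two boolean conditions are hypothetical stand-ins, and the `Result` layer that the real call unwraps with `?` is left out.

```rust
// Hypothetical stand-in for a rewritten physical plan; not a DataFusion type.
#[derive(Debug, PartialEq)]
enum Rewrite {
    PushedDown,
    Embedded,
}

// Assumed rule for illustration: pushdown applies only when every
// projection expression is a plain column reference.
fn try_pushdown(all_plain_columns: bool) -> Option<Rewrite> {
    all_plain_columns.then_some(Rewrite::PushedDown)
}

// Assumed rule for illustration: embedding is attempted only when the
// projection actually narrows the join output.
fn try_embed(narrows_output: bool) -> Option<Rewrite> {
    narrows_output.then_some(Rewrite::Embedded)
}

// "First try pushdown; if that yields nothing, fall back to embedding."
fn optimize(all_plain_columns: bool, narrows_output: bool) -> Option<Rewrite> {
    try_pushdown(all_plain_columns).map_or_else(|| try_embed(narrows_output), Some)
}
```

With these assumptions, `optimize(false, true)` falls through to the embed branch, while `optimize(true, true)` never reaches it. A unified version would instead compute both rewrites in one pass and apply whichever parts are possible.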
    indices
}

fn try_pushdown_through_nested_loop_join(
What I mean is: can we do the possible pushdown and the embed operation at the same time in this function?
let expected = [
    "NestedLoopJoinExec: join_type=Inner, filter=a@0 < b@1, projection=[c@2]",
    " CsvExec: file_groups={1 group: [[x]]}, projection=[a, b, c, d, e], has_header=false",
    " CsvExec: file_groups={1 group: [[x]]}, projection=[a, b, c, d, e], has_header=false",
A concrete example of what I am looking for: in these plans, the first CsvExec should project only a and c, and the second CsvExec only b. Embedding the projection into CsvExec is already done; we just need to push the projection down below NestedLoopJoinExec.
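To make the example concrete, here is a rough sketch of how the columns each child must keep could be derived from the join's embedded projection and its filter. The types `Side` and `FilterColumn` and the function `required_child_columns` are hypothetical and only loosely modeled on how a join filter records which side a column comes from. The sketch also assumes an inner join whose output schema is the left columns followed by the right columns.

```rust
use std::collections::BTreeSet;

// Hypothetical: which join input a filter column comes from.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Side {
    Left,
    Right,
}

// Hypothetical: a filter column, identified by its index within its own child.
#[derive(Clone, Copy, Debug)]
struct FilterColumn {
    index: usize,
    side: Side,
}

// Returns the sorted child-local column indices that each input must keep.
// `projection` holds indices into the join output (left columns first).
fn required_child_columns(
    left_width: usize,
    projection: &[usize],
    filter_columns: &[FilterColumn],
) -> (Vec<usize>, Vec<usize>) {
    let mut left = BTreeSet::new();
    let mut right = BTreeSet::new();

    // Columns the join must output.
    for &i in projection {
        if i < left_width {
            left.insert(i);
        } else {
            right.insert(i - left_width);
        }
    }

    // Columns the join filter reads.
    for c in filter_columns {
        match c.side {
            Side::Left => left.insert(c.index),
            Side::Right => right.insert(c.index),
        };
    }

    (left.into_iter().collect(), right.into_iter().collect())
}
```

For the plan above (five columns per input, `projection=[c@2]`, filter on left column 0 and right column 1), this yields `[0, 2]` for the left child (a and c) and `[1]` for the right child (b), which matches the projections described in the comment.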
I'll merge this once the conflicts are resolved, and @jayzhan-synnada will create a ticket for the follow-up work that takes the projection optimizations further.
Thanks @jayzhan-synnada
Which issue does this PR close?
Closes #.
Rationale for this change
Similar to HashJoin, nested loop join also needs projection pushdown.
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?