-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Fix dynamic filter pushdown in HashJoinExec #17201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
2f1ebca
Fix dynamic filter pushdown in HashJoinExec::swap_inputs
adriangb 53f7b72
add slt test
adriangb 41636f1
alternative approach: move dynamic filter creation to later in optimi…
adriangb b51ecff
fix Debug implementation
adriangb db71eea
fix lints
adriangb File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the
DynamicFilterPhysicalExprbe applied to the left table here, i.e.,small_table?Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that at runtime (i.e. when
ExecutionPlan::executeis called and work actually begins) the left side is always the build side and the right side is always the probe side. During optimizer passes which one is left and right may be swapped / re-arranged but all of the dynamic filter stuff happens after this so we can always push the filters into the right side. At least for inner joins, other join types may be more complex and I haven't even begun to wrap my head around those cases.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tree looks like this:
This makes sense: you always want the small table to be the build side and the large table to be the probe side.
If I change the query to:
Then we'll make a different plan but we swap the join around so that the large table continues to be the probe side:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But in this case
large_tableis filtered by bothv >= 50and the dynamic filter. Shouldn't the dynamic filter be applied tosmall_tableinstead?If
where v = 50is executed instead, the dynamic filter now appears in thesmall_tabletable:Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dynamic filter is always applied to the probe side. In those examples above it seems the join side changes, probably because
v = 50is highly selective so the large table effectively becomes ~ 1 row and thus can be used for the build side. You can see that even without dynamic filters changing the query like that flips the build/probe sides:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, so the filter is always applied from the build side to the probe side, independently of the query. Maybe I misunderstood but I though that was the original issue (#17196), that the filter was always applied to the same side, even when it made sense to do the opposite.
For example:
In this example
t2is filtered byvand by a dynamic filter onk, while in theory it would be faster to apply a dynamic filter fromt2tot1(maybe we would need some heuristic to determine which would be the best approach?).