Make benefits_from_input_partitioning Default in SHJ #8801
Merged
Which issue does this PR close?
Improves the condition on #8794 (comment).
Rationale for this change
The previous SymmetricHashJoinExec implementation did not require input ordering, which makes SymmetricHashJoinExec suboptimal when target_partitions is higher than one.
What changes are included in this PR?
If a child node (the left or right side of the join) already has a defined order and the columns used in the filter predicate are ordered, the order of that side is kept. The identified order is then used in the SymmetricHashJoinExec to maintain bounded memory during join operations. However, if a child node has no inherent order, or if the filter columns are unordered, no specific order is required for the SymmetricHashJoinExec. This approach ensures that the symmetric hash join only imposes ordering constraints when necessary, based on the properties of the child nodes and the filter condition.
Also, proto files are changed, which increases the changed line count.
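The per-side rule described above can be sketched as follows. This is a minimal illustration in plain Rust, not the code in this PR: the `ChildOrdering` type and the `required_ordering_for_side` function are hypothetical names. The sketch also assumes that a filter column counts as "ordered" when it is the leading sort column of the child, which is a simplification.

```rust
/// Hypothetical, simplified stand-in for a child plan's output ordering:
/// the column names the child's output is sorted by, in sort order.
#[derive(Debug, Clone, PartialEq)]
struct ChildOrdering {
    sorted_columns: Vec<String>,
}

/// Decide which ordering (if any) the join should require from one side.
///
/// Assumption for this sketch: the side's order is kept only when the
/// child's leading sort column is one of the columns used in the filter.
/// Otherwise no ordering is required from that side.
fn required_ordering_for_side(
    child_ordering: Option<&ChildOrdering>,
    filter_columns: &[&str],
) -> Option<ChildOrdering> {
    // No inherent order on the child: nothing to require.
    let ordering = child_ordering?;
    // An empty ordering is treated the same as no ordering.
    let leading = ordering.sorted_columns.first()?;
    if filter_columns.iter().any(|c| *c == leading.as_str()) {
        // The filter column is ordered: keep this side's order.
        Some(ordering.clone())
    } else {
        // The filter columns are unordered: impose no constraint.
        None
    }
}
```

Calling the function once for the left child and once for the right child gives the two per-side requirements. A side that returns `None` imposes no ordering constraint.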
Are these changes tested?
Yes
Are there any user-facing changes?
No.