Make benefits_from_input_partitioning Default in SHJ #8801

Merged
merged 3 commits into from
Jan 10, 2024

metesynnada
Contributor

Which issue does this PR close?

Improves the condition on #8794 (comment).

Rationale for this change

The previous SymmetricHashJoinExec implementation did not require input ordering, which makes SymmetricHashJoinExec suboptimal when target_partitions is greater than one.

What changes are included in this PR?

If the child nodes (left or right side of the join) already have a defined order and the columns used in the filter predicate are ordered, the order of that side is kept. The identified order is then used in the SymmetricHashJoinExec to maintain bounded memory during join operations. However, if the child nodes do not have an inherent order, or if the filter columns are unordered, no specific order is required for the SymmetricHashJoinExec. This approach ensures that the symmetric hash join operation only imposes ordering constraints when necessary, based on the properties of the child nodes and the filter condition.
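
The per-side decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SortKey` and `required_order_for_side` are hypothetical names, and "the filter columns are ordered" is simplified here to "the leading sort key of the child is one of the filter columns".

```rust
/// Hypothetical, simplified stand-in for a sort expression:
/// a column name plus a direction. Not a DataFusion type.
#[derive(Debug, Clone, PartialEq)]
pub struct SortKey {
    pub column: String,
    pub descending: bool,
}

/// Returns the ordering to require from one join input, or `None`
/// when no ordering should be imposed on that side.
///
/// `child_order` is the ordering the child already provides (if any);
/// `filter_columns` are the columns of this side used in the join filter.
pub fn required_order_for_side(
    child_order: Option<&[SortKey]>,
    filter_columns: &[&str],
) -> Option<Vec<SortKey>> {
    // The child has no inherent order: nothing to preserve.
    let order = child_order?;
    // Simplifying assumption: only the leading sort key is checked.
    let leading = order.first()?;
    if filter_columns.contains(&leading.column.as_str()) {
        // The filter column is ordered: keep this side's existing order.
        Some(order.to_vec())
    } else {
        // The filter columns are unordered: impose no requirement.
        None
    }
}
```

Under these assumptions, a side sorted on `ts` with a filter on `ts` keeps its order, while a filter on an unrelated column, or a child with no ordering at all, yields no requirement.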

Also, proto files are changed, which increases the changed line count.

Are these changes tested?

Yes

Are there any user-facing changes?

No.

@github-actions github-actions bot added the core Core DataFusion crate label Jan 9, 2024
Contributor

@mustafasrepo mustafasrepo left a comment

LGTM! Thanks for this support.

@alamb alamb merged commit 78d3314 into apache:main Jan 10, 2024
22 checks passed
Labels
core Core DataFusion crate

3 participants