executor: optimize (left outer) (anti) semi join which has no other condition #47764
Signed-off-by: gengliqi <gengliqiii@gmail.com>
/cc @windtalker
/check-issue-triage-complete
/retest
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## master #47764 +/- ##
================================================
+ Coverage 71.9012% 72.7815% +0.8802%
================================================
Files 1398 1422 +24
Lines 405200 411908 +6708
================================================
+ Hits 291344 299793 +8449
+ Misses 94244 93239 -1005
+ Partials 19612 18876 -736
/cc @XuHuaiyu
@@ -490,6 +499,10 @@ func (naaj *nullAwareAntiSemiJoiner) onMissMatch(_ bool, outer chunk.Row, chk *c
 	chk.AppendRowByColIdxs(outer, naaj.lUsed)
 }

+func (naaj *nullAwareAntiSemiJoiner) isSemiJoinWithoutCondition() bool {
+	return len(naaj.conditions) == 0
Does this optimization work for null-aware join? Since only joinMatchedProbeSideRow2Chunk checks isSemiJoinWithoutCondition, null-aware join is not optimized even if it returns true here.
No. This function is useless for null-aware semi join.
Signed-off-by: gengliqi <gengliqiii@gmail.com>
LGTM
@@ -565,7 +611,7 @@ func (es *entryStore) GetStore() (e *entry, memDelta int64) {

 type baseHashTable interface {
 	Put(hashKey uint64, rowPtr chunk.RowPtr)
-	Get(hashKey uint64) (rowPtrs []chunk.RowPtr)
+	Get(hashKey uint64) *entry
Would it be better to add an example in a comment here showing how to iterate over the stored ptrs?
Added. Nice suggestion.
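For readers following the thread, here is a minimal sketch of what iterating over the result of the new Get could look like. It assumes that entry is a singly linked list node holding one row pointer and a next link; the RowPtr type, the toyHashTable type, and the collectPtrs helper are hypothetical stand-ins for illustration and are not taken from the PR.

```go
package main

import "fmt"

// RowPtr is a hypothetical stand-in for chunk.RowPtr.
type RowPtr struct {
	ChkIdx uint32
	RowIdx uint32
}

// entry is ASSUMED to be a singly linked list node: one row pointer plus a
// link to the next entry stored under the same hash key.
type entry struct {
	ptr  RowPtr
	next *entry
}

// toyHashTable is a hypothetical map-backed table with the same method
// shapes as the baseHashTable interface shown in the diff.
type toyHashTable struct {
	m map[uint64]*entry
}

func newToyHashTable() *toyHashTable {
	return &toyHashTable{m: make(map[uint64]*entry)}
}

// Put prepends the row pointer to the list stored under hashKey.
func (t *toyHashTable) Put(hashKey uint64, rowPtr RowPtr) {
	t.m[hashKey] = &entry{ptr: rowPtr, next: t.m[hashKey]}
}

// Get returns the head of the list for hashKey, or nil if the key is absent.
func (t *toyHashTable) Get(hashKey uint64) *entry {
	return t.m[hashKey]
}

// collectPtrs walks the list from head and gathers every stored pointer.
func collectPtrs(head *entry) []RowPtr {
	var out []RowPtr
	for e := head; e != nil; e = e.next {
		out = append(out, e.ptr)
	}
	return out
}

func main() {
	ht := newToyHashTable()
	ht.Put(7, RowPtr{ChkIdx: 0, RowIdx: 1})
	ht.Put(7, RowPtr{ChkIdx: 0, RowIdx: 2})
	for e := ht.Get(7); e != nil; e = e.next {
		fmt.Println(e.ptr.ChkIdx, e.ptr.RowIdx)
	}
}
```

Returning the list head instead of a slice lets a caller stop walking as soon as it has what it needs, without first copying every pointer into a new slice.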
Signed-off-by: gengliqi <gengliqiii@gmail.com>
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: windtalker, XuHuaiyu
What problem does this PR solve?
Issue Number: close #47424
Problem Summary:
The SQL can be very slow in a (left outer) (anti) semi join when there are many matched rows for each row from the build side.
What is changed and how it works?
Recognize the case where the join has no other condition, then optimize it.
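The idea can be illustrated with a small sketch. This is not the code in this PR; it only assumes that, when a semi join has no other condition, a single row with an equal join key already decides the result, so the remaining matched rows need not be visited. The function name semiJoinMatched and its parameters are hypothetical.

```go
package main

import "fmt"

// semiJoinMatched is a hypothetical helper. candidates holds the rows that
// share the join key with the probed row; otherCond is nil when the join has
// no other condition. It returns whether the row matched and how many
// candidates were visited to decide that.
func semiJoinMatched(probe int, candidates []int, otherCond func(probe, candidate int) bool) (matched bool, visited int) {
	if len(candidates) == 0 {
		return false, 0
	}
	if otherCond == nil {
		// No other condition: the first key match is enough, so the rest
		// of the candidate list is skipped entirely.
		return true, 1
	}
	for _, c := range candidates {
		visited++
		if otherCond(probe, c) {
			return true, visited
		}
	}
	return false, visited
}

func main() {
	candidates := make([]int, 1000)
	matched, visited := semiJoinMatched(1, candidates, nil)
	fmt.Println(matched, visited)

	never := func(probe, candidate int) bool { return false }
	matched, visited = semiJoinMatched(1, candidates, never)
	fmt.Println(matched, visited)
}
```

For an anti semi join the emitted result would simply be the negation of matched, so the same shortcut applies under the same assumption.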
Check List
Tests
Before
After
Side effects
Documentation
Release note