planner: support expand IN expr #35699
Welcome @Yriuns!
/cc @winoros
835b5ff to a228b6c
Co-authored-by: Chengpeng Yan <41809508+Reminiscent@users.noreply.github.com>
/cc @time-and-fate
What problem does this PR solve?
Issue Number: close #34882
Problem Summary:
Provide a hint that allows the optimizer to expand `IN` expressions.
A selection like `a = 0 AND b = 1 AND c IN (2, 3)` will be converted into a `UNION ALL` of selections, one per value in the `IN` list, so that we can utilize the index better, e.g., a `TopN` push-down can be optimized as `LimitN`.
What is changed and how it works?
For a `WHERE` clause with `IN` expressions, we divide the expression list into a list of non-`IN` expressions and a list of `IN` expressions. For the `IN` expression lists, we compute a Cartesian product. Each element of the Cartesian product is merged with the list of non-`IN` expressions to obtain a new `WHERE` clause. Finally, we use a `UNION ALL` to merge all these new `Selection` operators.
An example:
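The steps described above can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the planner code from this PR: `inExpr` and `expandIN` are hypothetical names, and predicates are modeled as plain strings rather than TiDB expression nodes.

```go
package main

import (
	"fmt"
	"strings"
)

// inExpr is a hypothetical stand-in for an IN predicate: column IN (values...).
type inExpr struct {
	column string
	values []string
}

// expandIN returns one conjunction per element of the Cartesian product of
// the IN lists. Every conjunction starts with the non-IN predicates.
func expandIN(nonIN []string, ins []inExpr) []string {
	// Start with a single combination that holds only the non-IN predicates.
	combos := [][]string{append([]string(nil), nonIN...)}
	for _, in := range ins {
		next := make([][]string, 0, len(combos)*len(in.values))
		for _, c := range combos {
			for _, v := range in.values {
				// Copy so that combinations never share a backing array.
				branch := append(append([]string(nil), c...), in.column+" = "+v)
				next = append(next, branch)
			}
		}
		combos = next
	}
	out := make([]string, len(combos))
	for i, c := range combos {
		out[i] = strings.Join(c, " AND ")
	}
	return out
}

func main() {
	branches := expandIN(
		[]string{"a = 0", "b = 1"},
		[]inExpr{{column: "c", values: []string{"2", "3"}}},
	)
	// Each branch would become its own Selection under a UNION ALL.
	fmt.Println(strings.Join(branches, "\nUNION ALL\n"))
	// Prints:
	// a = 0 AND b = 1 AND c = 2
	// UNION ALL
	// a = 0 AND b = 1 AND c = 3
}
```

The sketch assumes the values in each `IN` list are distinct. With duplicates, a plain `UNION ALL` of the branches would return the matching rows more than once.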
Check List
Tests
As we can see, after we use the `IN_EXPANSION()` hint, the `TopN_19` pushed down to TiKV is converted into 2 `Limit` operators pushed down to TiKV. The with-hint plan only needs to scan at most 10 + 10 records, but the without-hint plan needs to scan and sort all the records in ranges `[0 1 2,0 1 2], [0 1 3,0 1 3]`. If there are a lot of records in these ranges, we can save a lot of unnecessary cost.

I also conducted a micro-benchmark (client concurrency of 1) on my own server. Here is the result:
Without hint
With hint
The latency is quite stable, and the more records, the more time we save.
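The "10 + 10 records" bound mentioned above can be illustrated with a small Go sketch. It assumes that, once the `IN` column is fixed to a single value, each range is already returned in the requested order by the index, so a per-branch `Limit` is enough. The names `firstN` and `topNWithExpansion` are hypothetical and rows are modeled as plain integers.

```go
package main

import (
	"fmt"
	"sort"
)

// firstN simulates a pushed-down Limit: it reads at most n rows from a
// range that the index already returns in sorted order.
func firstN(sortedRange []int, n int) []int {
	if len(sortedRange) < n {
		n = len(sortedRange)
	}
	return sortedRange[:n]
}

// topNWithExpansion takes the first n rows of every branch and then sorts
// only those candidates. It also reports how many rows were read in total.
func topNWithExpansion(branches [][]int, n int) (result []int, scanned int) {
	var candidates []int
	for _, b := range branches {
		rows := firstN(b, n)
		scanned += len(rows)
		candidates = append(candidates, rows...)
	}
	sort.Ints(candidates)
	return firstN(candidates, n), scanned
}

func main() {
	c2 := []int{1, 4, 6, 9, 12} // rows with c = 2, already in index order
	c3 := []int{2, 3, 7, 8, 20} // rows with c = 3, already in index order
	top, scanned := topNWithExpansion([][]int{c2, c3}, 3)
	fmt.Println(top, scanned) // [1 2 3] 6
}
```

With k branches and a limit of n, at most k × n rows are read regardless of how many rows each range contains, which is consistent with the observation that the saving grows with the number of records.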
Side effects
Consumes more CPU of `tidb`, but saves CPU of `tikv`
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.