statistics: improve estimation for index equal condition #17366

Merged: 4 commits merged into pingcap:master on Jun 3, 2020


eurekaka
Contributor

What problem does this PR solve?

Issue Number: close #17364

Problem Summary:

A wrong plan is chosen when an index equal condition contains a value that is outside the range of the histogram.

What is changed and how it works?

What's Changed:

Compute the NDV (number of distinct values) for the prefix columns of the index that are used in the equal condition, and use it to guess the row count.

How it Works:

Guessing the row count from the NDV is more reasonable than using the hard-coded constant outOfRangeBetweenRate.
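
As a rough illustration of the idea (not the code in this PR), the guess can be thought of as spreading the table's rows evenly over the distinct values of the prefix columns. The function name, its parameters, and the fallback for a missing NDV are all hypothetical choices made for this sketch.

```go
package main

import "fmt"

// estimateOutOfRangeEqualRows is a hypothetical helper, not a TiDB function.
// It guesses how many rows match an equal condition whose value falls outside
// the histogram, assuming rows are spread evenly over the distinct values.
func estimateOutOfRangeEqualRows(totalRows, ndv int64) float64 {
	if totalRows <= 0 {
		return 0
	}
	if ndv <= 0 {
		// Assumption for this sketch only: with no NDV information,
		// treat every row as a possible match.
		return float64(totalRows)
	}
	return float64(totalRows) / float64(ndv)
}

func main() {
	// 10000 rows over 500 distinct prefix values gives a guess of 20 rows.
	fmt.Println(estimateOutOfRangeEqualRows(10000, 500)) // prints 20
}
```

Unlike a fixed rate, this guess shrinks as the prefix columns become more selective.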

Related changes

N/A

Check List

Tests

  • Unit test

Side effects

N/A

Release note

  • improve row count estimation for index equal condition

@eurekaka eurekaka added type/enhancement, sig/planner, component/statistics labels May 22, 2020
@eurekaka eurekaka requested a review from a team as a code owner May 22, 2020 10:12
@ghost ghost requested review from winoros and removed request for a team May 22, 2020 10:13
@eurekaka
Contributor Author

/run-unit-test

planner/core/testdata/analyze_suite_out.json (outdated, resolved)
	if i >= usedColsLen {
		break
	}
	ndv = mathutil.MaxInt64(ndv, coll.Columns[colID].NDV)
Contributor

Why not use col1.ndv * col2.ndv * ... instead of the max of them?

Contributor Author
@eurekaka eurekaka May 25, 2020

If the two rows are (1,1) and (2,2), the NDV is 2, not 4. The max is a lower bound and ndv * ndv is an upper bound; using max is consistent with getCardinality().
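
The bounds in this reply can be checked with a small standalone sketch. The helper names below are hypothetical and only illustrate the argument; they are not part of TiDB. For the rows (1,1) and (2,2), the true number of distinct tuples equals the max of the per-column NDVs, while the product overestimates it.

```go
package main

import "fmt"

// distinctInColumn counts the distinct values in one column of the rows.
func distinctInColumn(rows [][2]int64, col int) int64 {
	seen := make(map[int64]struct{})
	for _, r := range rows {
		seen[r[col]] = struct{}{}
	}
	return int64(len(seen))
}

// distinctTuples counts the distinct (col1, col2) combinations.
func distinctTuples(rows [][2]int64) int64 {
	seen := make(map[[2]int64]struct{})
	for _, r := range rows {
		seen[r] = struct{}{}
	}
	return int64(len(seen))
}

// ndvBounds returns the max of the per-column NDVs (lower bound) and their
// product (upper bound). Overflow of the product is ignored in this sketch.
func ndvBounds(colNDVs []int64) (lower, upper int64) {
	upper = 1
	for _, n := range colNDVs {
		if n > lower {
			lower = n
		}
		upper *= n
	}
	return lower, upper
}

func main() {
	rows := [][2]int64{{1, 1}, {2, 2}}
	ndvs := []int64{distinctInColumn(rows, 0), distinctInColumn(rows, 1)}
	lower, upper := ndvBounds(ndvs)
	fmt.Println(lower, distinctTuples(rows), upper) // prints 2 2 4
}
```

With fully correlated columns the lower bound is exact; with independent columns the true value moves toward the product.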

@codecov

codecov bot commented May 25, 2020

Codecov Report

Merging #17366 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #17366   +/-   ##
===========================================
  Coverage   80.0259%   80.0259%           
===========================================
  Files           521        521           
  Lines        141614     141614           
===========================================
  Hits         113328     113328           
  Misses        19266      19266           
  Partials       9020       9020           

@eurekaka eurekaka requested a review from lzmhhh123 May 25, 2020 09:13
Contributor

@lzmhhh123 lzmhhh123 left a comment

LGTM

@lzmhhh123 lzmhhh123 added the status/LGT1 label May 25, 2020
Member

@zz-jason zz-jason left a comment

LGTM

@zz-jason zz-jason added status/can-merge, status/LGT2 and removed status/LGT1 labels Jun 2, 2020
@sre-bot
Contributor

sre-bot commented Jun 2, 2020

Your auto merge job has been accepted, waiting for:

  • 17516
  • 17451

@sre-bot
Contributor

sre-bot commented Jun 2, 2020

/run-all-tests

@sre-bot
Contributor

sre-bot commented Jun 2, 2020

@eurekaka merge failed.

@zz-jason zz-jason merged commit 94a722e into pingcap:master Jun 3, 2020
sre-bot pushed a commit to sre-bot/tidb that referenced this pull request Jun 3, 2020
Signed-off-by: sre-bot <sre-bot@pingcap.com>
@sre-bot
Contributor

sre-bot commented Jun 3, 2020

cherry pick to release-3.0 in PR #17609

sre-bot pushed a commit to sre-bot/tidb that referenced this pull request Jun 3, 2020
Signed-off-by: sre-bot <sre-bot@pingcap.com>
@sre-bot
Contributor

sre-bot commented Jun 3, 2020

cherry pick to release-3.1 in PR #17610

sre-bot pushed a commit to sre-bot/tidb that referenced this pull request Jun 3, 2020
Signed-off-by: sre-bot <sre-bot@pingcap.com>
@sre-bot
Contributor

sre-bot commented Jun 3, 2020

cherry pick to release-4.0 in PR #17611

ti-srebot pushed a commit that referenced this pull request Jul 7, 2020
ti-srebot pushed a commit that referenced this pull request Jul 24, 2020
@eurekaka eurekaka deleted the index_equal branch July 24, 2020 09:01
Labels
component/statistics, sig/planner, status/can-merge, status/LGT2, type/enhancement
Development

Successfully merging this pull request may close these issues.

improve row count estimation for index equal condition
4 participants