Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: fix the estimation error on normal column when collation enabled #18104

Merged
merged 4 commits into from
Jul 1, 2020

Conversation

winoros
Copy link
Member

@winoros winoros commented Jun 17, 2020

What problem does this PR solve?

Issue Number: close #14689

Problem Summary:

For index, its key is already generated by the sort key by the collation information. And when we query the index estimation, we also use EncodeKey which will first convert the column value to sort key then encode it. So it's automatically correct for index without any additional change.

But for a column, when sampling we use its original value thus when query it with count-min sketch or histogram. We'll get a wrong answer since the order information is lost.

What is changed and how it works?

What's Changed:

For tikv part, use the sort key generated by the collation as the sampling data. tikv/tikv#8105
And when querying, convert to sort key first.

How it Works:

Related changes

  • Need to cherry-pick to the release branch

Check List

Tests

  • Unit test
  • Integration test

Side effects

  • Performance regression
    • Consumes more CPU

Release note

  • Fix the row count estimation error for a non-index column with collation enabled.

@winoros
Copy link
Member Author

winoros commented Jun 17, 2020

/rebuild

@codecov
Copy link

codecov bot commented Jun 17, 2020

Codecov Report

Merging #18104 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #18104   +/-   ##
===========================================
  Coverage   79.5315%   79.5315%           
===========================================
  Files           535        535           
  Lines        144725     144725           
===========================================
  Hits         115102     115102           
  Misses        20353      20353           
  Partials       9270       9270           

@winoros
Copy link
Member Author

winoros commented Jun 17, 2020

/run-integration-common-test tikv=pr/8105

@winoros
Copy link
Member Author

winoros commented Jun 17, 2020

/run-common-test

@winoros
Copy link
Member Author

winoros commented Jun 18, 2020

/run-integration-common-test tikv=pr/8105

Copy link
Member

@wjhuang2016 wjhuang2016 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@qw4990 qw4990 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-srebot
Copy link
Contributor

@qw4990, Thanks for your review, however we are sorry that your vote won't be count. You are not a reviewer or committer or co-leader or leader for the related sigs:planner(slack).

@qw4990
Copy link
Contributor

qw4990 commented Jul 1, 2020

/merge

@ti-srebot
Copy link
Contributor

Sorry @qw4990, you don't have permission to trigger auto merge event on this branch. You are not a committer or co-leader or leader for the related sigs:planner(slack).

@winoros
Copy link
Member Author

winoros commented Jul 1, 2020

/run-all-tests tikv=pr/8105

@winoros
Copy link
Member Author

winoros commented Jul 1, 2020

/run-integration-br-tests tikv=pr/8105
/run-integration-copr-tests tikv=pr/8105

@winoros
Copy link
Member Author

winoros commented Jul 1, 2020

/run-integration-br-tests tikv=pr/8105

@winoros
Copy link
Member Author

winoros commented Jul 1, 2020

/run-integration-br-tests tikv=pr/8105 tiflash=master

@winoros
Copy link
Member Author

winoros commented Jul 1, 2020

/run-unit-test

@winoros
Copy link
Member Author

winoros commented Jul 1, 2020

/run-unit-test

1 similar comment
@winoros
Copy link
Member Author

winoros commented Jul 1, 2020

/run-unit-test

@winoros winoros merged commit ceec9d9 into pingcap:master Jul 1, 2020
@winoros winoros deleted the analyze-index branch July 1, 2020 09:11
ti-srebot pushed a commit to ti-srebot/tidb that referenced this pull request Jul 1, 2020
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-srebot
Copy link
Contributor

cherry pick to release-4.0 in PR #18311

winoros added a commit that referenced this pull request Sep 8, 2020
…18104) (#18311)

Co-authored-by: Yiding Cui <winoros@gmail.com>
Co-authored-by: Arenatlx <ailinsilence4@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/statistics sig/planner SIG: Planner type/bugfix This PR fixes a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

statistics: update histogram to support collations in TiDB.
4 participants