statistics: add some doc for the exp feature #9891

Merged
merged 27 commits into from
Sep 23, 2022

Conversation

winoros
Member

@winoros winoros commented Aug 9, 2022

First-time contributors' checklist

What is changed, added or deleted? (Required)

Add docs for the missing part of the optimizer statistics.
close part of #3155

Which TiDB version(s) do your changes apply to? (Required)

Tips for choosing the affected version(s):

By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.

For details, see tips for choosing the affected versions.

  • master (the latest development version)
  • v6.2 (TiDB 6.2 versions)
  • v6.1 (TiDB 6.1 versions)
  • v5.4 (TiDB 5.4 versions)
  • v5.3 (TiDB 5.3 versions)
  • v5.2 (TiDB 5.2 versions)
  • v5.1 (TiDB 5.1 versions)
  • v5.0 (TiDB 5.0 versions)

What is the related PR or file link(s)?

  • This PR is translated from:
  • Other reference link(s):

Do your changes match any of the following descriptions?

  • Delete files
  • Change aliases
  • Need modification after applied to another branch
  • Might cause conflicts after applied to another branch

@ti-chi-bot
Member

ti-chi-bot commented Aug 9, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • lilin90

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot requested a review from shichun-0415 August 9, 2022 06:58
@ti-chi-bot ti-chi-bot added missing-translation-status This PR does not have translation status info. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Aug 9, 2022
@ti-chi-bot ti-chi-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Aug 9, 2022
@TomShawn TomShawn added area/planner Indicates that the Issue or PR belongs to the area of SQL planner or optimizer. add-missing-docs Add missing system variables to documentation translation/doing This PR's assignee is translating this PR. status/PTAL This PR is ready for reviewing. and removed missing-translation-status This PR does not have translation status info. labels Aug 9, 2022
@TomShawn TomShawn self-assigned this Aug 9, 2022
@TomShawn TomShawn self-requested a review August 9, 2022 07:23
@TomShawn TomShawn added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 9, 2022
@TomShawn TomShawn requested review from time-and-fate and removed request for shichun-0415 August 10, 2022 06:27
@TomShawn TomShawn removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 10, 2022
Member

@time-and-fate time-and-fate left a comment

We need to specify that this feature is experimental and is not a GA feature.

```sql
CREATE TABLE t(col1 INT, col2 INT, KEY(col1), KEY(col2));
```

Suppose that `col1` and `col2` of the table `t` both obey monotonically increasing constraints in row order, that is, the values of `col1` and `col2` are strictly correlated in order (correlation value of 1):
Member

(correlation value of 1) seems a bit confusing.
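
One way to make "(correlation value of 1)" concrete is a small worked example. The sketch below assumes that "correlated in order" means a rank (Spearman) correlation; the quoted text does not say which formula TiDB uses, so `ranks` and `rank_correlation` are hypothetical helpers for illustration only.

```python
def ranks(values):
    """Return the 0-based rank of each value (assumes no duplicate values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def rank_correlation(xs, ys):
    """Spearman rank correlation for tie-free data of equal length."""
    n = len(xs)
    rx, ry = ranks(xs), ranks(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Both columns increase in row order, although col2 is not linear in col1.
col1 = [1, 2, 3, 4, 5]
col2 = [10, 40, 90, 160, 250]

increasing = rank_correlation(col1, col2)                  # same ordering
decreasing = rank_correlation(col1, list(reversed(col2)))  # opposite ordering
print(increasing, decreasing)
```

Under this assumption, two columns that both increase in row order get the maximum value of 1 even when their relationship is not linear, which may be what the parenthetical is trying to say.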

- `stats_type` is the type of the extended statistics. Currently, the only possible value is `correlation`.
- `column_name` specifies the column group, which can contain multiple columns. For the `correlation` type, exactly two columns must be specified.

If the table `mysql.stats_extended` contains the corresponding record when the `ANALYZE` command is run, the extended statistics are collected. The `status` column of that record is then set to `1`, and the `version` column is set to the new timestamp.
Member

I'm afraid that we are exposing too many implementation details in the doc.

Contributor

+1. We can have a separate paragraph that shows how to check valid extended statistics, in which `mysql.stats_extended` is described.

Comment on lines 79 to 90
## The cache of the Extended Statistics

Each TiDB node maintains a cache of the extended statistics to make access to them more efficient. TiDB periodically loads the table `mysql.stats_extended` to keep the cache consistent with the data in the table. Each row in `mysql.stats_extended` has a `version` column. Whenever a row is updated, its `version` value is increased, so TiDB can load only the changed rows into memory instead of reloading the whole table.
To delete a record of the extended statistics, TiDB provides the following command:

{{< copyable "sql" >}}

```sql
ALTER TABLE table_name DROP STATS_EXTENDED stats_name;
```

This command sets the `status` column of the corresponding record in the table `mysql.stats_extended` to `2` (meaning that the record is deleted) instead of deleting the record directly. Other TiDB nodes read this change and delete the record from their in-memory caches. Background garbage collection eventually deletes the record.
Member

ditto.
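
For readers following the thread, the version-based incremental loading described in the quoted lines can be sketched as below. This is a minimal model, not TiDB's implementation: `Row`, `ExtendedStatsCache`, and `refresh` are hypothetical names, and the only facts taken from the quoted text are that `status` `1` marks analyzed statistics, `status` `2` marks a deleted record, and `version` increases on every update.

```python
from dataclasses import dataclass

STATUS_ANALYZED = 1  # from the quoted text: set once ANALYZE has collected the stats
STATUS_DELETED = 2   # from the quoted text: set by DROP STATS_EXTENDED


@dataclass
class Row:
    """Hypothetical stand-in for one row of mysql.stats_extended."""
    name: str
    status: int
    version: int


class ExtendedStatsCache:
    """Hypothetical per-node cache that only applies rows newer than the last load."""

    def __init__(self):
        self.items = {}
        self.last_version = 0

    def refresh(self, table_rows):
        # Incremental load: skip every row the cache has already seen.
        changed = [r for r in table_rows if r.version > self.last_version]
        for row in sorted(changed, key=lambda r: r.version):
            if row.status == STATUS_DELETED:
                self.items.pop(row.name, None)  # drop the tombstoned entry
            elif row.status == STATUS_ANALYZED:
                self.items[row.name] = row
            # Assumption: rows with any other status are not cached.
            self.last_version = row.version
        return len(changed)


table = [Row("s1", STATUS_ANALYZED, 10), Row("s2", STATUS_ANALYZED, 11)]
cache = ExtendedStatsCache()
first_load = cache.refresh(table)               # both rows are new

table[0] = Row("s1", STATUS_DELETED, 12)        # models DROP STATS_EXTENDED on s1
second_load = cache.refresh(table)              # only the changed row is applied
third_load = cache.refresh(table)               # nothing changed since the last load
print(first_load, second_load, third_load, sorted(cache.items))
```

Marking the record as deleted rather than removing it gives every node a newer `version` to observe, which is why the sketch can evict `s1` without rescanning the whole table.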


The method described in [Introduction to Statistics](/statistics.md) also applies to extended statistics. The dump result is stored in the same JSON file as the regular statistics.

## The switch
Contributor

Since extended statistics is disabled by default, it makes more sense to put "how to enable extended statistics" at the beginning of this page.

Contributor

+1

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Sep 22, 2022
@TomShawn TomShawn added translation/done This PR has been translated from English into Chinese and updated to pingcap/docs-cn in a PR. and removed translation/doing This PR's assignee is translating this PR. labels Sep 22, 2022
@TomShawn
Contributor

/remove-status LGT1
/status LGT2

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 23, 2022
@TomShawn
Contributor

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 820992a

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Sep 23, 2022
@TomShawn TomShawn added the needs-cherry-pick-release-6.1 Should cherry pick this PR to release-6.1 branch. label Sep 23, 2022
@ti-chi-bot ti-chi-bot merged commit fad6434 into pingcap:master Sep 23, 2022
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created: #10533.

ti-chi-bot pushed a commit to ti-chi-bot/docs that referenced this pull request Sep 23, 2022
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created: #10534.

ti-chi-bot pushed a commit to ti-chi-bot/docs that referenced this pull request Sep 23, 2022
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
TomShawn added a commit that referenced this pull request Sep 26, 2022
* This is an automated cherry-pick of #10534

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>

* Apply suggestions from code review

* resolve

* Update extended-statistics.md

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
Co-authored-by: TomShawn <41534398+TomShawn@users.noreply.github.com>
@shichun-0415 shichun-0415 mentioned this pull request Feb 2, 2023
22 tasks
@lilin90 lilin90 assigned lilin90 and unassigned TomShawn May 13, 2024
Labels
add-missing-docs Add missing system variables to documentation area/planner Indicates that the Issue or PR belongs to the area of SQL planner or optimizer. needs-cherry-pick-release-6.1 Should cherry pick this PR to release-6.1 branch. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. status/PTAL This PR is ready for reviewing. translation/done This PR has been translated from English into Chinese and updated to pingcap/docs-cn in a PR.

7 participants