statistics: add some doc for the exp feature #9891
We need to specify that this feature is experimental and is not a GA feature.
extended-statistics.md
Outdated
```sql
CREATE TABLE t(col1 INT, col2 INT, KEY(col1), KEY(col2));
```

Suppose that the `col1` and `col2` of the table `t` both obey monotonically increasing constraints in row order, i.e., the values of `col1` and `col2` are strictly correlated in order (correlation value of 1):
`(correlation value of 1)` seems a bit confusing.
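For readers following this thread, one common way to quantify "correlated in order" is Spearman's rank correlation, which equals 1 when two columns increase together in row order. The sketch below is only an illustration: the function names are hypothetical, ties are ignored for simplicity, and it is an assumption (not something stated in this PR) that TiDB's `correlation` statistic uses this exact formula.

```python
def ranks(values):
    """Return the 0-based rank of each value (assumes no duplicate values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def order_correlation(col1, col2):
    """Spearman rank correlation of two equal-length columns without ties."""
    n = len(col1)
    if n != len(col2) or n < 2:
        raise ValueError("columns must have the same length (at least 2)")
    r1, r2 = ranks(col1), ranks(col2)
    # Sum of squared rank differences between the two columns.
    d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Both columns increase in row order, so their ranks match exactly.
print(order_correlation([1, 2, 3, 4, 5], [10, 20, 35, 70, 90]))  # 1.0
# One column decreases while the other increases.
print(order_correlation([1, 2, 3, 4, 5], [90, 70, 35, 20, 10]))  # -1.0
```

Under this reading, "correlation value of 1" means that sorting the rows by `col1` yields the same row order as sorting them by `col2`.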
extended-statistics.md
Outdated
- `stats_type` is the type of the extended statistics. Now it only has one possible value `correlation`.
- `column_name` specifies the column group. It can be multiple columns. For `correlation` type, there should be and only be two columns.

The extended statistics will be collected if the `mysql.stats_extended` has the corresponding record when we run the `ANALYZE` command. And the `status` column will be set to `1`, and the `version` column will be set to the new timestamp.
I'm afraid we are exposing too many implementation details in the doc.
+1. We can have a separate paragraph that shows how to check valid extended statistics and describes `mysql.stats_extended` there.
extended-statistics.md
Outdated
## The cache of the Extended Statistics

Each TiDB node will maintain a cache for the extended statistics to improve the efficiency of visiting the extended statistics. TiDB will load the table `mysql.stats_extended` periodically to ensure that the cache is kept the same as the data in the table. Each row in the table `mysql.stats_extended` records a column `version`. Once the row is updated, the value of the column `version` will be increased so that we can load the table into the memory incrementally instead of a full loading.
To delete a record of the extended statistics, TiDB provides the following command:

{{< copyable "sql" >}}

```sql
ALTER TABLE table_name DROP STATS_EXTENDED stats_name;
```

This command will mark the value of the corresponding record in the table `mysql.stats_extended`'s column `status` to `2`(meaning that the record is deleted) instead of deleting the record directly. Other TiDBs will read this change and delete the record in their memory cache. The background garbage collection will delete the record eventually.
ditto.
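As context for the implementation details under discussion, the caching behavior described in the hunk above (incremental loading by `version`, and deletion signaled by `status` `2`) can be sketched as a small in-memory simulation. This is a hedged illustration, not TiDB's code: `Row`, `ExtStatsCache`, and `refresh` are hypothetical names, and the list standing in for `mysql.stats_extended` is an assumption made for the example.

```python
from dataclasses import dataclass

STATUS_ANALYZED = 1  # per the quoted text: set when ANALYZE collects the statistics
STATUS_DELETED = 2   # per the quoted text: marks the record as deleted


@dataclass
class Row:
    """Hypothetical stand-in for one row of mysql.stats_extended."""
    name: str
    status: int
    version: int


class ExtStatsCache:
    """Hypothetical per-node cache that is refreshed incrementally."""

    def __init__(self):
        self.items = {}
        self.last_version = 0

    def refresh(self, table):
        """Apply only rows newer than the last seen version; return how many."""
        changed = [r for r in table if r.version > self.last_version]
        for row in sorted(changed, key=lambda r: r.version):
            if row.status == STATUS_DELETED:
                # A deletion mark evicts the entry from this node's cache.
                self.items.pop(row.name, None)
            elif row.status == STATUS_ANALYZED:
                self.items[row.name] = row
            self.last_version = max(self.last_version, row.version)
        return len(changed)


table = [Row("s1", STATUS_ANALYZED, 10), Row("s2", STATUS_ANALYZED, 11)]
cache = ExtStatsCache()
cache.refresh(table)                     # first load picks up both rows
table[0] = Row("s1", STATUS_DELETED, 12) # dropping s1 marks it and bumps version
cache.refresh(table)                     # only the changed row is applied
print(sorted(cache.items))               # ['s2']
```

The point of the `version` comparison is that a refresh touches only rows updated since the previous refresh, which is why a deletion has to be written as a status change rather than as an immediate row removal.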
extended-statistics.md
Outdated
The way mentioned in the chapter [Introduction to Statistics](/statistics.md) is also suitable for extended statistics. The dump result is in the same JSON file as the normal statistics.

## The switch
Since extended statistics is disabled by default, it makes more sense to put "how to enable extended statistics" at the beginning of this page.
+1
/remove-status LGT1
/merge
This pull request has been accepted and is ready to merge. Commit hash: 820992a
In response to a cherrypick label: new pull request created: #10533.
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
In response to a cherrypick label: new pull request created: #10534.
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
* This is an automated cherry-pick of #10534

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>

* Apply suggestions from code review
* resolve
* Update extended-statistics.md

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
Co-authored-by: TomShawn <41534398+TomShawn@users.noreply.github.com>
First-time contributors' checklist
What is changed, added or deleted? (Required)
add docs for the missing part of statistics of the optimizer
close part of #3155
Which TiDB version(s) do your changes apply to? (Required)
Tips for choosing the affected version(s):
By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.
For details, see tips for choosing the affected versions.
What is the related PR or file link(s)?
Do your changes match any of the following descriptions?