
ease of use: The stats_meta data table adds two fields related to statistics collection. #49594

Open
wentaojin opened this issue Dec 19, 2023 · 5 comments
Labels
component/statistics sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.

Comments

@wentaojin

Feature Request

Is your feature request related to a problem? Please describe:
Ease of use: add two statistics-collection fields to the stats_meta table. One field records the time of the last ANALYZE on the table; the other records the source of that last ANALYZE (internal automatic collection or external manual collection).

Describe the feature you'd like:
Ease of use: add two statistics-collection fields to the stats_meta table. One field records the time of the last ANALYZE on the table; the other records the source of that last ANALYZE (internal automatic collection or external manual collection).

Describe alternatives you've considered:
NO

Teachability, Documentation, Adoption, Migration Strategy:
Scenes:

  1. Distinguish and identify external statistics-collection scripts.
  2. Check a table's most recent statistics collection time and collection method.
@wentaojin wentaojin added the type/feature-request Categorizes issue or PR as related to a new feature. label Dec 19, 2023
@qw4990 qw4990 added type/enhancement The issue or PR belongs to an enhancement. sig/planner SIG: Planner component/statistics and removed type/feature-request Categorizes issue or PR as related to a new feature. labels Dec 20, 2023
@Rustin170506
Member

You can get this information from the mysql.analyze_jobs table.

For instance:

+---------+---------------------+--------------+--------------+----------------+-------------------------------------------------------------------------+----------------+---------------------+---------------------+----------+--------------------------------------------------+-------------------------------------------------------------------------+------------+
| id      | update_time         | table_schema | table_name   | partition_name | job_info                                                                | processed_rows | start_time          | end_time            | state    | fail_reason                                      | instance                                                                | process_id |
+---------+---------------------+--------------+--------------+----------------+-------------------------------------------------------------------------+----------------+---------------------+---------------------+----------+--------------------------------------------------+-------------------------------------------------------------------------+------------+
| 2670123 | 2023-12-27 10:38:34 | co01db33     | co01table86  |                | auto analyze table all columns with 256 buckets, 500 topn, 1 samplerate |         150000 | 2023-12-27 10:38:31 | 2023-12-27 10:38:34 | finished | NULL                                             | tc-tidb-0.tc-tidb-peer.partition-analyze-test-tps-3180202-1-85.svc:4000 |       NULL |
| 2670124 | 2023-12-27 10:49:00 | co01db33     | co01table125 |                | auto analyze table all columns with 256 buckets, 500 topn, 1 samplerate |              0 | 2023-12-27 10:38:34 | NULL                | failed   | TiDB Server is down when running the analyze job | tc-tidb-0.tc-tidb-peer.partition-analyze-test-tps-3180202-1-85.svc:4000 |       NULL |
+---------+---------------------+--------------+--------------+----------------+-------------------------------------------------------------------------+----------------+---------------------+---------------------+----------+--------------------------------------------------+-------------------------------------------------------------------------+------------+

@Rustin170506
Member

The table definition is:

CREATE TABLE IF NOT EXISTS mysql.analyze_jobs (
		id BIGINT(64) UNSIGNED NOT NULL AUTO_INCREMENT,
		update_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
		table_schema CHAR(64) NOT NULL DEFAULT '',
		table_name CHAR(64) NOT NULL DEFAULT '',
		partition_name CHAR(64) NOT NULL DEFAULT '',
		job_info TEXT NOT NULL,
		processed_rows BIGINT(64) UNSIGNED NOT NULL DEFAULT 0,
		start_time TIMESTAMP,
		end_time TIMESTAMP,
		state ENUM('pending', 'running', 'finished', 'failed') NOT NULL,
		fail_reason TEXT,
		instance VARCHAR(512) NOT NULL comment 'address of the TiDB instance executing the analyze job',
		process_id BIGINT(64) UNSIGNED comment 'ID of the process executing the analyze job',
		PRIMARY KEY (id),
		KEY (update_time)

If the job_info starts with auto analyze, then it is triggered by TiDB itself.
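As a minimal sketch of that approach (column names are taken from the table definition above; the query shape is my own, not part of the original comment), the most recent analyze job per table, with its trigger classified by the `job_info` prefix, could be fetched like this:

```sql
-- Latest analyze job per table in mysql.analyze_jobs, classified as
-- 'auto' (triggered by TiDB) or 'manual' based on the job_info prefix.
SELECT j.table_schema,
       j.table_name,
       j.start_time AS last_analyze_time,
       j.state,
       CASE WHEN j.job_info LIKE 'auto analyze%' THEN 'auto'
            ELSE 'manual'
       END AS source
FROM mysql.analyze_jobs AS j
JOIN (
    -- Pick out each table's most recent job by start_time.
    SELECT table_schema, table_name, MAX(start_time) AS max_start
    FROM mysql.analyze_jobs
    GROUP BY table_schema, table_name
) AS latest
  ON j.table_schema = latest.table_schema
 AND j.table_name   = latest.table_name
 AND j.start_time   = latest.max_start;
```

This runs against a live TiDB cluster; partitioned tables would additionally need `partition_name` in the grouping.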

@wentaojin
Author

@hi-rustin The ease of use of the mysql.analyze_jobs table is poor; what is needed is a final, intuitive piece of information, not process state. First, the table accumulates historical records incrementally. Second, it may require manual sorting to remove duplicates and find the maximum time, because a table gets a new record after every collection. Third, one has to inspect job_info to judge whether a job was an auto analyze.

@Rustin170506
Member

I acknowledge that it has some usability issues. However, recording only the last analysis time is also quite unreasonable. If the purpose of this requirement is to audit and analyze historical data, then analyze_jobs is the best source. stats_meta does not record the time of the last analysis because an analysis can either succeed or fail. Do we really need to store this information again in stats_meta? In that case, stats_meta would essentially become the same as the analyze_jobs table.

@wentaojin
Author

wentaojin commented Dec 29, 2023

@hi-rustin

  1. The requirement is to quickly identify the time and method of the last statistics collection, not to analyze the audit history.
  2. It does not matter whether the collection fails; even if it fails, the previous state will still be displayed. As mentioned above, what is needed is the final state, visualized. Of course, analyze_jobs can be used as in your example above, but it requires a lot of preprocessing, such as sorting to find a table's most recent collection time and then filtering by whether it was an auto collection.
  3. Whether the fields are added to the stats_meta table or the tables table, being able to see them intuitively and retrieve them immediately would be better.
  4. The analyze_jobs table's data retention time is limited. In extreme cases, the collection history of a particular table may be flushed out.
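For concreteness, the manual preprocessing described in point 2 might look like this for a single table (the schema and table names here are just the example values from earlier in the thread, and the query is an illustrative sketch, not from the original comment):

```sql
-- Most recent analyze job for one specific table: sort by start_time,
-- take the newest row, then classify the trigger from job_info.
SELECT start_time,
       state,
       CASE WHEN job_info LIKE 'auto analyze%' THEN 'auto'
            ELSE 'manual'
       END AS source
FROM mysql.analyze_jobs
WHERE table_schema = 'co01db33'
  AND table_name   = 'co01table86'
ORDER BY start_time DESC
LIMIT 1;
```

Repeating this per table (or joining against a grouped subquery) is exactly the extra work that storing the two fields directly in stats_meta would avoid.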
