Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

analyze large table failed,and tidb-server will do analysis until success,when gc life time is short #33276

Closed
lddlww opened this issue Mar 21, 2022 · 13 comments
Assignees
Labels
may-affects-4.0 This bug maybe affects 4.0.x versions. may-affects-5.0 This bug maybe affects 5.0.x versions. may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. severity/major sig/planner SIG: Planner type/bug The issue is confirmed as a bug.

Comments

@lddlww
Copy link

lddlww commented Mar 21, 2022

Bug Report

tidb-server analyze one table which size is large,and analyze op will failed,then tidb-server will do analysis until success,this will result in large receive traffic in tidb-server

1. Minimal reproduce step (Required)

no

2. What did you expect to see? (Required)

tidb-server always do analysis until success

3. What did you see instead (Required)

tidb-server will auto adjust gc life time to large table when analyze table

4. What is your TiDB version? (Required)

v5.3.0

tidb-server.log:

[2022/03/12 09:12:53.878 +08:00] [INFO] [update.go:992] ["[stats] auto analyze triggered"] [sql="analyze table `db`.`detail`] [reason="too many modifications(5079512/1.015878e+07>0.5)"]
[2022/03/12 09:24:55.637 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze columns"] ["start time"=2022/03/12 09:12:53.878 +08:00] ["end time"=2022/03/12 09:24:55.637 +08:00] [cost=12m1.758415741s]
[2022/03/12 09:24:56.292 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_order_serial_number"] ["start time"=2022/03/12 09:24:55.636 +08:00] ["end time"=2022/03/12 09:24:56.292 +08:00] [cost=656.073854ms]
[2022/03/12 09:24:56.503 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_md5"] ["start time"=2022/03/12 09:24:56.292 +08:00] ["end time"=2022/03/12 09:24:56.503 +08:00] [cost=210.846921ms]
[2022/03/12 09:24:57.650 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_md5_create"] ["start time"=2022/03/12 09:24:56.503 +08:00] ["end time"=2022/03/12 09:24:57.650 +08:00] [cost=1.146822065s]
[2022/03/12 09:24:57.650 +08:00] [ERROR] [update.go:1091] ["[stats] auto analyze failed"] [sql="analyze table `db`.`detail`] [cost_time=12m3.772639707s] [error="[tikv:9006]GC life time is shorter than transaction duration, transaction starts at 2022-03-12 09:12:53.866 +0800 CST, GC safe point is 2022-03-12 09:14:47.866 +0800 CST"]
[2022/03/12 09:24:57.654 +08:00] [INFO] [update.go:992] ["[stats] auto analyze triggered"] [sql="analyze table `db`.`detail`] [reason="too many modifications(5079512/1.015878e+07>0.5)"]
[2022/03/12 09:44:57.647 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze columns"] ["start time"=2022/03/12 09:24:57.654 +08:00] ["end time"=2022/03/12 09:44:57.647 +08:00] [cost=19m59.992030597s]
[2022/03/12 09:44:58.287 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_order_serial_number"] ["start time"=2022/03/12 09:44:57.646 +08:00] ["end time"=2022/03/12 09:44:58.287 +08:00] [cost=640.718599ms]
[2022/03/12 09:44:58.490 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_md5"] ["start time"=2022/03/12 09:44:58.287 +08:00] ["end time"=2022/03/12 09:44:58.490 +08:00] [cost=203.045258ms]
[2022/03/12 09:44:59.623 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_md5_create"] ["start time"=2022/03/12 09:44:58.490 +08:00] ["end time"=2022/03/12 09:44:59.623 +08:00] [cost=1.132833191s]
[2022/03/12 09:44:59.623 +08:00] [ERROR] [update.go:1091] ["[stats] auto analyze failed"] [sql="analyze table `db`.`detail`] [cost_time=20m1.969094961s] [error="[tikv:9006]GC life time is shorter than transaction duration, transaction starts at 2022-03-12 09:24:57.615 +0800 CST, GC safe point is 2022-03-12 09:34:47.866 +0800 CST"]
[2022/03/12 09:44:59.627 +08:00] [INFO] [update.go:992] ["[stats] auto analyze triggered"] [sql="analyze table `db`.`detail`] [reason="too many modifications(5081252/1.015878e+07>0.5)"]
[2022/03/12 10:04:59.013 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze columns"] ["start time"=2022/03/12 09:44:59.627 +08:00] ["end time"=2022/03/12 10:04:59.013 +08:00] [cost=19m59.385525484s]
[2022/03/12 10:04:59.673 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_order_serial_number"] ["start time"=2022/03/12 10:04:59.012 +08:00] ["end time"=2022/03/12 10:04:59.673 +08:00] [cost=660.161927ms]
[2022/03/12 10:04:59.876 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_md5"] ["start time"=2022/03/12 10:04:59.673 +08:00] ["end time"=2022/03/12 10:04:59.876 +08:00] [cost=203.35878ms]
[2022/03/12 10:05:01.034 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_md5_create"] ["start time"=2022/03/12 10:04:59.876 +08:00] ["end time"=2022/03/12 10:05:01.034 +08:00] [cost=1.157870254s]
[2022/03/12 10:05:01.034 +08:00] [ERROR] [update.go:1091] ["[stats] auto analyze failed"] [sql="analyze table `db`.`detail`] [cost_time=20m1.40748911s] [error="[tikv:9006]GC life time is shorter than transaction duration, transaction starts at 2022-03-12 09:44:59.616 +0800 CST, GC safe point is 2022-03-12 09:54:47.865 +0800 CST"]
[2022/03/12 10:05:01.038 +08:00] [INFO] [update.go:992] ["[stats] auto analyze triggered"] [sql="analyze table `db`.`detail`] [reason="too many modifications(5082988/1.015878e+07>0.5)"]
[2022/03/12 10:25:00.350 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze columns"] ["start time"=2022/03/12 10:05:01.039 +08:00] ["end time"=2022/03/12 10:25:00.350 +08:00] [cost=19m59.311183017s]
[2022/03/12 10:25:01.075 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_order_serial_number"] ["start time"=2022/03/12 10:25:00.350 +08:00] ["end time"=2022/03/12 10:25:01.075 +08:00] [cost=724.820285ms]
[2022/03/12 10:25:01.270 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_md5"] ["start time"=2022/03/12 10:25:01.075 +08:00] ["end time"=2022/03/12 10:25:01.270 +08:00] [cost=195.312692ms]
[2022/03/12 10:25:02.456 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_md5_create"] ["start time"=2022/03/12 10:25:01.270 +08:00] ["end time"=2022/03/12 10:25:02.456 +08:00] [cost=1.186300445s]
[2022/03/12 10:25:02.457 +08:00] [ERROR] [update.go:1091] ["[stats] auto analyze failed"] [sql="analyze table `db`.`detail`] [cost_time=20m1.418291177s] [error="[tikv:9006]GC life time is shorter than transaction duration, transaction starts at 2022-03-12 10:05:01.015 +0800 CST, GC safe point is 2022-03-12 10:14:47.866 +0800 CST"]
[2022/03/12 10:25:02.461 +08:00] [INFO] [update.go:992] ["[stats] auto analyze triggered"] [sql="analyze table `db`.`detail`] [reason="too many modifications(5082988/1.015878e+07>0.5)"]
[2022/03/12 10:45:01.950 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze columns"] ["start time"=2022/03/12 10:25:02.462 +08:00] ["end time"=2022/03/12 10:45:01.950 +08:00] [cost=19m59.487746958s]
[2022/03/12 10:45:02.631 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_order_serial_number"] ["start time"=2022/03/12 10:45:01.950 +08:00] ["end time"=2022/03/12 10:45:02.631 +08:00] [cost=681.3097ms]
[2022/03/12 10:45:02.837 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_md5"] ["start time"=2022/03/12 10:45:02.631 +08:00] ["end time"=2022/03/12 10:45:02.837 +08:00] [cost=205.913852ms]
[2022/03/12 10:45:03.966 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_md5_create"] ["start time"=2022/03/12 10:45:02.837 +08:00] ["end time"=2022/03/12 10:45:03.966 +08:00] [cost=1.129559762s]
[2022/03/12 10:45:03.966 +08:00] [ERROR] [update.go:1091] ["[stats] auto analyze failed"] [sql="analyze table `db`.`detail`] [cost_time=20m1.505077085s] [error="[tikv:9006]GC life time is shorter than transaction duration, transaction starts at 2022-03-12 10:25:02.415 +0800 CST, GC safe point is 2022-03-12 10:34:47.866 +0800 CST"]
[2022/03/12 10:45:03.971 +08:00] [INFO] [update.go:992] ["[stats] auto analyze triggered"] [sql="analyze table `db`.`detail`] [reason="too many modifications(5082988/1.015878e+07>0.5)"]
[2022/03/12 11:05:03.416 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze columns"] ["start time"=2022/03/12 10:45:03.971 +08:00] ["end time"=2022/03/12 11:05:03.416 +08:00] [cost=19m59.444767033s]
[2022/03/12 11:05:04.132 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_order_serial_number"] ["start time"=2022/03/12 11:05:03.416 +08:00] ["end time"=2022/03/12 11:05:04.132 +08:00] [cost=716.319883ms]
[2022/03/12 11:05:04.319 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_md5"] ["start time"=2022/03/12 11:05:04.132 +08:00] ["end time"=2022/03/12 11:05:04.319 +08:00] [cost=186.763278ms]
[2022/03/12 11:05:05.457 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_md5_create"] ["start time"=2022/03/12 11:05:04.319 +08:00] ["end time"=2022/03/12 11:05:05.457 +08:00] [cost=1.138190954s]
[2022/03/12 11:05:05.457 +08:00] [ERROR] [update.go:1091] ["[stats] auto analyze failed"] [sql="analyze table `db`.`detail`] [cost_time=20m1.486454878s] [error="[tikv:9006]GC life time is shorter than transaction duration, transaction starts at 2022-03-12 10:45:03.965 +0800 CST, GC safe point is 2022-03-12 10:54:47.866 +0800 CST"]
[2022/03/12 11:05:05.877 +08:00] [INFO] [update.go:992] ["[stats] auto analyze triggered"] [sql="analyze table `db`.`detail`] [reason="too many modifications(5084719/1.015878e+07>0.5)"]
[2022/03/12 11:24:55.267 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze columns"] ["start time"=2022/03/12 11:05:05.878 +08:00] ["end time"=2022/03/12 11:24:55.267 +08:00] [cost=19m49.389153925s]
[2022/03/12 11:24:55.950 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_order_serial_number"] ["start time"=2022/03/12 11:24:55.267 +08:00] ["end time"=2022/03/12 11:24:55.950 +08:00] [cost=683.710463ms]
[2022/03/12 11:24:56.167 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_md5"] ["start time"=2022/03/12 11:24:55.950 +08:00] ["end time"=2022/03/12 11:24:56.167 +08:00] [cost=216.737852ms]
[2022/03/12 11:24:57.338 +08:00] [INFO] [analyze.go:129] ["analyze table `db`.`detail` has failed"] [partition=] ["job info"="auto analyze index idx_md5_create"] ["start time"=2022/03/12 11:24:56.167 +08:00] ["end time"=2022/03/12 11:24:57.338 +08:00] [cost=1.170859545s]
[2022/03/12 11:24:57.338 +08:00] [ERROR] [update.go:1091] ["[stats] auto analyze failed"] [sql="analyze table `db`.`detail`] [cost_time=19m51.460990431s] [error="[tikv:9006]GC life time is shorter than transaction duration, transaction starts at 2022-03-12 11:05:05.866 +0800 CST, GC safe point is 2022-03-12 11:14:47.866 +0800 CST"]
[2022/03/12 11:24:57.343 +08:00] [INFO] [update.go:992] ["[stats] auto analyze triggered"] [sql="analyze table `db`.`detail`] [reason="too many modifications(5086443/1.015878e+07>0.5)"]
@lddlww lddlww added the type/bug The issue is confirmed as a bug. label Mar 21, 2022
@lddlww lddlww changed the title analyze table failed,but always analyze large table failed,and tidb-server will do analysis until success,this will result large network receive traffice in tidb-server Mar 21, 2022
@lddlww lddlww changed the title analyze large table failed,and tidb-server will do analysis until success,this will result large network receive traffice in tidb-server analyze large table failed,and tidb-server will do analysis until success,this will result large network receive traffic in tidb-server Mar 21, 2022
@lddlww lddlww changed the title analyze large table failed,and tidb-server will do analysis until success,this will result large network receive traffic in tidb-server analyze large table failed,and tidb-server will do analysis until success,when gc Mar 21, 2022
@lddlww lddlww changed the title analyze large table failed,and tidb-server will do analysis until success,when gc analyze large table failed,and tidb-server will do analysis until success,when gc life time is short Mar 21, 2022
@aytrack aytrack added the sig/planner SIG: Planner label Mar 21, 2022
@chrysan
Copy link
Contributor

chrysan commented Mar 21, 2022

Two issues behind this one:

  1. internal jobs' startTs are not tracked for the usage of GC safepoint. They will be tracked after v6.0.
  2. The order of table analyze jobs need to be optimized, we are working on it.

@lddlww
Copy link
Author

lddlww commented Mar 21, 2022

@chrysan thanks for reply

and i looking forward to tidb v6.0

@ti-chi-bot ti-chi-bot added may-affects-4.0 This bug maybe affects 4.0.x versions. may-affects-5.0 This bug maybe affects 5.0.x versions. may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.0 labels Mar 23, 2022
@kiakaku
Copy link

kiakaku commented Mar 24, 2022

@chrysan thanks for reply

and i looking forward to tidb v6.0

hi @lddlww any suggestion for stop auto-analyze process which failed

@chrysan
Copy link
Contributor

chrysan commented Mar 27, 2022

@chrysan thanks for reply
and i looking forward to tidb v6.0

hi @lddlww any suggestion for stop auto-analyze process which failed

You can try "drop stats".

@kiakaku
Copy link

kiakaku commented Mar 28, 2022

@chrysan thanks for reply
and i looking forward to tidb v6.0

hi @lddlww any suggestion for stop auto-analyze process which failed

You can try "drop stats".

OMG, "drop stats" meaning delete statistic, but i want to stop running analyze process

@glkappe
Copy link

glkappe commented Apr 6, 2022

@chrysan thanks for reply
and i looking forward to tidb v6.0

hi @lddlww any suggestion for stop auto-analyze process which failed

You can try "drop stats".

OMG, "drop stats" meaning delete statistic, but i want to stop running analyze process

internal sql is not support stop, but you can use set a configuration to stop auto analyze.

https://docs.pingcap.com/tidb/stable/tidb-configuration-file#run-auto-analyze

@chrysan
Copy link
Contributor

chrysan commented Apr 6, 2022

@chrysan thanks for reply
and i looking forward to tidb v6.0

hi @lddlww any suggestion for stop auto-analyze process which failed

You can try "drop stats".

OMG, "drop stats" meaning delete statistic, but i want to stop running analyze process

internal sql is not support stop, but you can use set a configuration to stop auto analyze.

docs.pingcap.com/tidb/stable/tidb-configuration-file#run-auto-analyze

run-auto-analyze config stops all tables' auto-analyze, "drop stats [tablename]" just drops stats and stops auto-analyze of one table. We will provide a more friendly way to stop auto-analyze of just one table later. Sorry for the inconvenience.

@kiakaku
Copy link

kiakaku commented Apr 12, 2022

thank @chrysan

@chrysan
Copy link
Contributor

chrysan commented Apr 20, 2022

Since the error will not be thrown since v6.0 and further optimization is in developing, so close this issue.

@chrysan chrysan closed this as completed Apr 20, 2022
@coderplay
Copy link
Contributor

@chrysan Could you please share the link here for the further optimization?

@chrysan
Copy link
Contributor

chrysan commented May 6, 2022

@chrysan Could you please share the link here for the further optimization?

It's WIP, will paste the link later.

@chrysan
Copy link
Contributor

chrysan commented Sep 14, 2022

After #35232, analyze large table will not throw gc life time error.

@Robin222222
Copy link

the samething happen in v6.1.0, that is no resolved:
tidb报错

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
may-affects-4.0 This bug maybe affects 4.0.x versions. may-affects-5.0 This bug maybe affects 5.0.x versions. may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. severity/major sig/planner SIG: Planner type/bug The issue is confirmed as a bug.
Projects
None yet
Development

No branches or pull requests

10 participants