Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

statistics: remove lockedTable cache #46555

Merged
merged 12 commits into from
Sep 2, 2023

Conversation

Rustin170506
Copy link
Member

@Rustin170506 Rustin170506 commented Aug 31, 2023

What problem does this PR solve?

Issue Number: ref #46351

Problem Summary:

What changed and how it works?

  • Removed the lockedTable cache. The reason we're removing it is because it doesn't help. It's locked and unlocked so infrequently that it doesn't matter if we query it every time. This way we don't need to load it as often.

  • Removed LoadLockedTables background task

  • Use the read lock to protect the SQLExecutor

  • Move all code to lockstats pkg

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 31, 2023
@codecov
Copy link

codecov bot commented Aug 31, 2023

Codecov Report

Merging #46555 (a0276db) into master (80da849) will decrease coverage by 0.8824%.
Report is 10 commits behind head on master.
The diff coverage is 72.1925%.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #46555        +/-   ##
================================================
- Coverage   73.3411%   72.4588%   -0.8824%     
================================================
  Files          1309       1331        +22     
  Lines        394560     403346      +8786     
================================================
+ Hits         289375     292260      +2885     
- Misses        86769      92556      +5787     
- Partials      18416      18530       +114     
Flag Coverage Δ
integration 25.6415% <19.7861%> (?)
unit 73.3467% <72.1925%> (+0.0055%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 54.0444% <ø> (ø)
parser 84.9628% <ø> (-0.0145%) ⬇️
br 47.4689% <ø> (-4.8818%) ⬇️

@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 31, 2023
@Rustin170506
Copy link
Member Author

  1. Start the TiDB cluster: tiup playground nightly --tiflash 0 --db.binpath /Users/hi-rustin/GolandProjects/tidb/bin/tidb-server
  2. Create table and analyze it:
create table t(a int, b int);
insert into t values (1,2), (3,4), (5,6), (7,8);
analyze table t;
show warnings;
  1. Lock table: lock stats t;
  2. Analyze it again: analyze table t;
  3. Check the warnings: show warnings;
mysql> show warnings;
+---------+------+-----------------------------------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                                                 |
+---------+------+-----------------------------------------------------------------------------------------------------------------------------------------+
| Note    | 1105 | Analyze use auto adjusted sample rate 1.000000 for table test.t, reason to use this rate is "use min(1, 110000/4) as the sample-rate=1" |
| Warning | 1105 | skip analyze locked table: t                                                                                                            |
+---------+------+-----------------------------------------------------------------------------------------------------------------------------------------+
  1. Check the mysql.stats_table_locked table:
mysql> use mysql;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> select * from stats_table_locked;
+----------+--------------+-------+--------------------+
| table_id | modify_count | count | version            |
+----------+--------------+-------+--------------------+
|       98 |            4 |     4 | 443863948666601483 |
+----------+--------------+-------+--------------------+
1 row in set (0.00 sec)
  1. Try to lock it again: lock stats t;
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> lock stats t;
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> show warnings;
+---------+------+-----------------------------------+
| Level   | Code | Message                           |
+---------+------+-----------------------------------+
| Warning | 1105 | skip locking locked table: test.t |
+---------+------+-----------------------------------+
1 row in set (0.00 sec)
  1. Show locked stats: show stats_locked;.
mysql> show stats_locked;
+---------+------------+----------------+--------+
| Db_name | Table_name | Partition_name | Status |
+---------+------------+----------------+--------+
| test    | t          |                | locked |
+---------+------------+----------------+--------+
1 row in set (0.00 sec)
  1. Unlock the table: unlock stats t;
  2. Check the mysql.stats_table_locked table again:
mysql> select * from stats_table_locked;
Empty set (0.00 sec)
  1. Show locked stats: show stats_locked;
mysql> show stats_locked;
Empty set (0.00 sec)
  1. Unlock the table again and check warnings: unlock stats t;
mysql> unlock stats t;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+---------------------------------------+
| Level   | Code | Message                               |
+---------+------+---------------------------------------+
| Warning | 1105 | skip unlocking unlocked table: test.t |
+---------+------+---------------------------------------+
1 row in set (0.00 sec)

@ti-chi-bot ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 31, 2023
}

// IsTableLocked check whether table is locked in handle with Handle.Mutex
func (h *Handle) IsTableLocked(tableID int64) bool {
func (h *Handle) IsTableLocked(tableID int64) (bool, error) {
Copy link
Contributor

@qw4990 qw4990 Aug 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we allow this method to accept multiple table IDs like IsTableLocked(tableIDs ...int64) ([]bool, error) to get a better performance? Otherwise when executing show stats locked and there a N tables, we have to execute SELECT table_id FROM mysql.stats_table_locked N times, which is a little slow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I tried it this morning. It makes the code complicated. I will try it again.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in a6a0bf3 (#46555)

@Rustin170506 Rustin170506 force-pushed the rustin-patch-unlock-cache branch 3 times, most recently from fe795da to a6a0bf3 Compare September 1, 2023 06:01
Comment on lines 728 to 739
checkedTableIDs, err := h.QueryTablesLockedStatuses(tids...)
if err != nil {
logutil.BgLogger().Error("check table lock failed",
zap.String("category", "stats"), zap.Error(err))
continue
}

for _, tbl := range tbls {
//if table locked, skip analyze
if h.IsTableLocked(tbl.Meta().ID) {
// If table locked, skip analyze.
if checkedTableIDs[tbl.Meta().ID] {
logutil.BgLogger().Info("skip analyze locked table", zap.String("category", "stats"),
zap.String("db", db), zap.String("table", tbl.Meta().Name.O))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some comments here to notice this check is not complete. For example, some other TiDB node may lock or unlock this table between this check and the analyze command.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to enhance or remove this logic in the future.

@ti-chi-bot ti-chi-bot bot added approved needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Sep 1, 2023
statistics: rename

statistics: rename

statistics: rename

statistics: rename
@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Sep 1, 2023
@ti-chi-bot
Copy link

ti-chi-bot bot commented Sep 1, 2023

[LGTM Timeline notifier]

Timeline:

  • 2023-09-01 06:29:09.957485443 +0000 UTC m=+2081314.506501415: ☑️ agreed by qw4990.
  • 2023-09-01 07:48:16.953042852 +0000 UTC m=+2086061.502058823: ☑️ agreed by hawkingrei.

@Rustin170506
Copy link
Member Author

/retest

1 similar comment
@Rustin170506
Copy link
Member Author

/retest

Copy link
Member

@time-and-fate time-and-fate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we switching from Lock to RLock? Would this possibly cause problems when there are concurrent operations?

@Rustin170506
Copy link
Member Author

Why are we switching from Lock to RLock? Would this possibly cause problems when there are concurrent operations?

It was a mistake. I reverted it.

Copy link
Member

@time-and-fate time-and-fate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. A typo in filename statistics/handle/handletest/statslock/stats_lcok_test.go. Seems it has been there for a long time.
  2. Personally, I think it'll be better if we extract something like query_lock.go from lock_stats.go. Feel free to update it or keep it unchanged.

} else {
tbl, ok := infoSchema.TableByID(tid)
if !ok {
// Ignore this table because it may have been dropped.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line can be removed because dropping this table also wouldn't make this case happen.

@ti-chi-bot
Copy link

ti-chi-bot bot commented Sep 1, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hawkingrei, qw4990, time-and-fate

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [hawkingrei,qw4990,time-and-fate]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Rustin170506
Copy link
Member Author

/retest

3 similar comments
@Rustin170506
Copy link
Member Author

/retest

@Rustin170506
Copy link
Member Author

/retest

@Rustin170506
Copy link
Member Author

/retest

@hawkingrei
Copy link
Member

/test all

@ti-chi-bot ti-chi-bot bot merged commit 93dce26 into pingcap:master Sep 2, 2023
4 of 9 checks passed
@Rustin170506
Copy link
Member Author

  1. A typo in filename statistics/handle/handletest/statslock/stats_lcok_test.go. Seems it has been there for a long time.
  2. Personally, I think it'll be better if we extract something like query_lock.go from lock_stats.go. Feel free to update it or keep it unchanged.

I will deal with them in my next PR.
Thanks for your review! 💚 💙 💜 💛

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved component/statistics lgtm release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants