Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ddl: fix panic tidb-server doesn't release table lock (#19586) #20020

Merged
merged 5 commits into from
Sep 21, 2020

Conversation

ti-srebot
Copy link
Contributor

cherry-pick #19586 to release-4.0


What problem does this PR solve?

Close #19240

When tidb-server been killed -9(such as oom that killed by the system), the tidb-server won't release the held table locks, and this dead table lock will never be released unless user use admin cleanup table lock statement.

What is changed and how it works?

Add a background goroutine to periodic inspection the dead lock.

If the lock is held by the server that doesn't exist meta key (such as DDLAllSchemaVersions meta key) in ETCD, we think the server is dead, so we can clean the locks that hold by the dead tidb-server.

Why use DDLAllSchemaVersions meta key instead of ServerInformationPath meta key?

  1. Table lock implementation with DDL, and DDLAllSchemaVersions meta key is a related DDL meta key.
  2. DDLAllSchemaVersions TTL is 90s, but ServerInformationPath TTL is 10 min. The TTL90s is much better for usage.

Related changes

  • Need to cherry-pick to the release branch

Check List

Tests

  • Manual test

Side effects

  • Performance regression
    • Consumes more CPU
    • Consumes more MEM

Release note

  • fix panic tidb-server doesn't release table lock

Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-srebot
Copy link
Contributor Author

/run-all-tests

ddl/ddl.go Outdated Show resolved Hide resolved
Copy link
Contributor

@tangenta tangenta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-srebot ti-srebot added the status/LGT1 Indicates that a PR has LGTM 1. label Sep 18, 2020
Copy link
Contributor

@AilinKid AilinKid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-srebot ti-srebot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 19, 2020
@AilinKid
Copy link
Contributor

/run-all-tests

@ti-srebot
Copy link
Contributor Author

@lzmhhh123,Thanks for your review. The bot only counts LGTMs from Reviewers and higher roles, but you're still welcome to leave your comments.See the corresponding SIG page for more information. Related SIG: ddl(slack).

@bb7133
Copy link
Member

bb7133 commented Sep 21, 2020

/merge

@ti-srebot
Copy link
Contributor Author

Your auto merge job has been accepted, waiting for:

  • 20076
  • 20074
  • 20062

@ti-srebot ti-srebot added the status/can-merge Indicates a PR has been approved by a committer. label Sep 21, 2020
@ti-srebot
Copy link
Contributor Author

/run-all-tests

@ti-srebot ti-srebot merged commit c2dc0c3 into pingcap:release-4.0 Sep 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/sql-infra SIG: SQL Infra status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/4.0-cherry-pick
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants