v5.4.1: PD tso client Panic after inject 2s network loss to pd leader #4884

Closed
mayjiang0203 opened this issue May 5, 2022 · 3 comments · Fixed by #4885


mayjiang0203 commented May 5, 2022

Bug Report

tcms plan: [plan/784399] [endless-oltp-sybench-write-only-rel@v5.4.1]
tcms case: oltp-pd-rel-leader-network-loss

What did you do?

Inject a 2s network loss on the PD leader:

[2022/05/02 18:41:19.672 +08:00] [INFO] [chaos.go:93] ["Run chaos"] [name="pd network loss"] [selectors="[endless-oltp-tps-784399-1-760/tc-pd-0]"] [SelectorsRetainPolicy(selectors)="[endless-oltp-tps-784399-1-760/tc-pd-0]"] [targetSelectors="[nil]"] [TargetSelectorsRetainPolicy(targetSelectors)="[nil]"] [experimentSpec="NetworkLossSpec{Duration: "2s", Scheduler: ExperimentScheduler{Cron: "@every 36s"}, Loss: "100", Correlation: "70"}"]

What did you expect to see?

No panic in any component.

What did you see instead?

A panic happened in one TiDB instance.

What version of PD are you using (pd-server -V)?

UTC Build Time: 2022-04-29 06:35:41
Rust Version: rustc 1.56.0-nightly (2faabf579 2021-07-27)
Enable Features: jemalloc mem-profiling portable sse test-engines-rocksdb cloud-aws cloud-gcp cloud-azure
Profile: dist_release
2022-05-02T17:49:18.363+0800 INFO k8s/client.go:107 it should be noted that a long-running command will not be interrupted even the use case has ended. For more information, please refer to https://github.com/pingcap/test-infra/discussions/129
./pd-server -V
Release Version: v5.4.1
Edition: Community
Git Commit Hash: 18098e9
Git Branch: heads/refs/tags/v5.4.1
UTC Build Time: 2022-04-29 01:18:56
2022-05-02T17:49:18.595+0800 INFO k8s/client.go:107 it should be noted that a long-running command will not be interrupted even the use case has ended. For more information, please refer to https://github.com/pingcap/test-infra/discussions/129
./tidb-server -V
Release Version: v5.4.1
Edition: Community
Git Commit Hash: cd60925897337d790469cc8293f0cbb3a2bdcb36
Git Branch: heads/refs/tags/v5.4.1
UTC Build Time: 2022-04-29 06:38:01
GoVersion: go1.16.4
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false

@mayjiang0203 mayjiang0203 added the type/bug The issue is confirmed as a bug. label May 5, 2022
@mayjiang0203
Author

/assign JmPotato
/severity Major

@mayjiang0203
Author

/label affects-5.0
/label affects-5.1
/label affects-5.2
/label affects-5.3
/label affects-5.4
/label affects-6.0

@JmPotato
Member

JmPotato commented May 6, 2022

This bug occurs when all of the following happen in order:

  1. PD-0 resigns its leadership.
  2. A concurrent TSO update on PD-0 lands right after the TSO is reset, which leaves a non-empty TSO in memory.
  3. PD-0 becomes the leader again.
  4. PD-0 receives a TSO request after its leadership becomes available but before the TSO synchronization is done.
  5. This request is served with a TSO smaller than one generated earlier.
  6. BOOM!!!
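
The sequence above can be replayed deterministically in a toy model. The sketch below is not PD's code: the `timestamp` and `allocator` types, the `replayFallback` function, and the concrete physical values (2000 and 1000) are all hypothetical and chosen only for illustration. It also assumes that the late update writes a value older than the last timestamp already handed out, and that "empty" memory is represented by the zero value.

```go
package main

import (
	"errors"
	"fmt"
)

// timestamp is a simplified TSO: a physical part plus a logical counter.
type timestamp struct {
	physical int64
	logical  int64
}

// less reports whether t is ordered strictly before o.
func (t timestamp) less(o timestamp) bool {
	if t.physical != o.physical {
		return t.physical < o.physical
	}
	return t.logical < o.logical
}

// allocator is a hypothetical stand-in for one PD member's in-memory TSO state.
// A zero-valued mem means "empty, not synchronized yet".
type allocator struct {
	leader bool
	mem    timestamp
}

// generate serves one TSO request from memory.
// It refuses to serve when the member is not leader or the memory is empty.
func (a *allocator) generate() (timestamp, error) {
	if !a.leader {
		return timestamp{}, errors.New("not leader")
	}
	if a.mem == (timestamp{}) {
		return timestamp{}, errors.New("TSO not synchronized yet")
	}
	a.mem.logical++
	return a.mem, nil
}

// replayFallback walks through steps 1 to 5 sequentially and returns the last
// timestamp issued before the resignation and the first one issued after it.
func replayFallback() (before, after timestamp, err error) {
	pd0 := &allocator{leader: true, mem: timestamp{physical: 2000}}
	if before, err = pd0.generate(); err != nil {
		return before, after, err
	}

	// An update that was computed earlier but has not been written yet.
	// Assumption: its value is older than the last issued timestamp.
	pending := timestamp{physical: 1000}

	// Step 1: PD-0 resigns, and the in-memory TSO is reset to empty.
	pd0.leader = false
	pd0.mem = timestamp{}

	// Step 2: the concurrent update lands right after the reset,
	// so the memory is non-empty again.
	pd0.mem = pending

	// Step 3: PD-0 becomes the leader again.
	pd0.leader = true

	// Steps 4 and 5: a request arrives before synchronization; the
	// "empty memory" guard does not trigger because of the stale value.
	after, err = pd0.generate()
	return before, after, err
}

func main() {
	before, after, err := replayFallback()
	if err != nil {
		fmt.Println("request rejected:", err)
		return
	}
	fmt.Printf("before=%+v after=%+v fallback=%v\n", before, after, after.less(before))
}
```

In this model the only protection against serving unsynchronized timestamps is the empty-memory check in `generate`, so a write that slips in after the reset defeats it. If step 2 is removed from `replayFallback`, the request in step 4 is rejected instead of being served with a smaller timestamp.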

ti-chi-bot added a commit that referenced this issue May 6, 2022
close #4884

tso: fix the corner case that may cause TSO fallback

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
ti-chi-bot added a commit that referenced this issue Jun 13, 2022
close #4884, ref #4885

tso: fix the corner case that may cause TSO fallback

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: JmPotato <ghzpotato@gmail.com>
ti-chi-bot added a commit that referenced this issue Jun 14, 2022
close #4884, ref #4885

tso: fix the corner case that may cause TSO fallback

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: JmPotato <ghzpotato@gmail.com>