Changefeed gets stuck when there are 100 changefeeds and PD is restarted #10798

Closed
fubinzh opened this issue Mar 17, 2024 · 5 comments

Comments

fubinzh commented Mar 17, 2024

What did you do?

  1. Deployed a TiDB cluster in a GCP GKE environment with 24 TiKV nodes (16c64g) and 9 CDC nodes (16c64g); cluster size is ~40 TB. Three workloads were running, with row widths of ~1 MB, 9 KB, and 1.7 KB respectively.
  2. Created 100 changefeeds, each covering 40 tables.
  3. Updated the PD configuration to trigger a rolling restart.

What did you expect to see?

CDC lag should be less than 10s

What did you see instead?

The CDC changefeeds got stuck.

image
image

Versions of the cluster

cdc version:
Release Version: v8.0.0
Git Commit Hash: 130403f
Git Branch: heads/refs/tags/v8.0.0
UTC Build Time: 2024-03-15 13:58:58
Go Version: go version go1.21.6 linux/amd64
Failpoint Build: false

fubinzh added the area/ticdc (Issues or PRs related to TiCDC) and type/bug (The issue is confirmed as a bug) labels Mar 17, 2024

fubinzh commented Mar 17, 2024

/severity major

sdojjy (Member) commented Mar 19, 2024

The test environment is not stable. I retested this case with 50 changefeeds, and the maximum changefeed lag was less than 5s.

fubinzh commented Mar 20, 2024

After the PD rolling restart at 3/17 11:44, we can see that the workload was not balanced: cdc-8 had 4k tables, with CPU usage almost at full capacity and disk usage that kept increasing until the disk was full.

image

image
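
For reference, a rough back-of-the-envelope check of what an even distribution would look like. This is only a sketch: it assumes every changefeed-table pair counts as one table on a CDC node and that the scheduler aims for an even split across nodes. Neither assumption is confirmed in this thread, and the helper name `even_share` is hypothetical.

```python
def even_share(changefeeds: int, tables_per_changefeed: int, nodes: int) -> tuple[int, float]:
    """Return (total tables, tables per node) under a perfectly even split.

    Assumption: each changefeed-table pair is scheduled as one table.
    """
    total = changefeeds * tables_per_changefeed
    return total, total / nodes


# Numbers from the report: 100 changefeeds, 40 tables each, 9 CDC nodes.
total, per_node = even_share(100, 40, 9)
print(f"total tables: {total}, even share per node: {per_node:.0f}")
```

Under those assumptions, the 4k tables seen on cdc-8 would be close to the whole load rather than roughly one ninth of it.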

fubinzh commented Mar 25, 2024

PD side issue: tikv/pd#7973

fubinzh commented May 6, 2024

Closing this issue; the PD issue will be tracked separately.

fubinzh closed this as completed May 6, 2024