Checkpoint stuck for over 1.5h, resulting in changefeed failure #11335

Open
fubinzh opened this issue Jun 21, 2024 · 1 comment
Labels
area/ticdc (Issues or PRs related to TiCDC.), severity/moderate, type/bug (The issue is confirmed as a bug.)

Comments

fubinzh commented Jun 21, 2024

What did you do?

  1. Five changefeeds (Kafka sink, simple protocol) are running, each replicating a subset of the tables.
  2. At first, a workload runs against all the changefeeds, and CDC lag is not stable.
  3. Later, a workload runs only for the changefeeds 1k-odd and 1k-even.
  4. Wait and check the lag status.

What did you expect to see?

  1. Lag for all changefeeds should be normal. At the very least, for changefeeds that still have a workload running, the lag should return to normal after the incremental scan finishes.

What did you see instead?

Changefeed xxx5k did not return to normal. Its checkpoint was stuck for more than 1.5h, and the changefeed finally entered a failure state.

Versions of the cluster

Release Version: v8.2.0-alpha
Git Commit Hash: 90da67d
Git Branch: heads/refs/tags/v8.2.0-alpha
UTC Build Time: 2024-06-14 11:38:03
Go Version: go version go1.21.10 linux/amd64
Failpoint Build: false

fubinzh added the area/ticdc and type/bug labels on Jun 21, 2024

fubinzh commented Jun 28, 2024

/severity moderate
Throughput was up to 300 MB/s when the issue occurred.
