disk_snapshot_backup: init pod may get stuck due to concurrency call to Send #52049

Closed
YuJuncen opened this issue Mar 25, 2024 · 0 comments · Fixed by #52051
Labels
affects-6.5, affects-7.1, affects-7.5, component/br (This issue is related to BR of TiDB.), severity/major, type/bug (The issue is confirmed as a bug.)

Comments

@YuJuncen
Contributor

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

Run a disk snapshot backup in a sufficiently large cluster.
(Or inject failpoints so that renewing the lease and sending wait apply sometimes happen concurrently.)

2. What did you expect to see? (Required)

The prepare phase should succeed, with nothing going wrong.

3. What did you see instead (Required)

We were stuck at sending wait apply, while applying for quotas.
[Screenshot: CleanShot 2024-03-25 at 11 13 01@2x]

4. What is your TiDB version? (Required)

v6.5.x
This may still happen on master as well.

NOTE

We call Send concurrently on the same stream, which isn't safe according to the documentation of ClientStream:

// It is safe to have a goroutine calling SendMsg and another goroutine
// calling RecvMsg on the same stream at the same time, but it is not safe
// to call SendMsg on the same stream in different goroutines. It is also
// not safe to call CloseSend concurrently with SendMsg.
//
// It is not safe to modify the message after calling SendMsg. Tracing
// libraries and stats handlers may use the message lazily.
SendMsg(m any) error
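One common way to satisfy that constraint is to serialize every Send on the stream behind a mutex. The following is only a minimal sketch under that assumption, not necessarily what the fix in #52051 does. The names `sender`, `lockedSender`, `recordingStream`, and `sendConcurrently` are hypothetical and stand in for the real stream type so the example runs without gRPC.

```go
package main

import (
	"fmt"
	"sync"
)

// sender is a hypothetical stand-in for the Send half of a gRPC client stream.
type sender interface {
	Send(msg string) error
}

// lockedSender wraps a sender so that only one goroutine is inside Send at a time.
type lockedSender struct {
	mu    sync.Mutex
	inner sender
}

func (s *lockedSender) Send(msg string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.inner.Send(msg)
}

// recordingStream is a fake stream that is deliberately NOT goroutine-safe:
// it appends to a slice with no synchronization of its own.
type recordingStream struct {
	sent []string
}

func (r *recordingStream) Send(msg string) error {
	r.sent = append(r.sent, msg)
	return nil
}

// sendConcurrently issues n sends from n goroutines, alternating between two
// message kinds to mimic lease renewal racing with wait apply requests.
func sendConcurrently(s sender, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			kind := "wait-apply"
			if i%2 == 0 {
				kind = "renew-lease"
			}
			_ = s.Send(fmt.Sprintf("%s-%d", kind, i))
		}(i)
	}
	wg.Wait()
}

func main() {
	fake := &recordingStream{}
	sendConcurrently(&lockedSender{inner: fake}, 100)
	fmt.Println(len(fake.sent)) // prints 100
}
```

An alternative design is to funnel all outgoing messages through a channel drained by a single sender goroutine, which avoids holding a lock across a potentially blocking Send.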
@YuJuncen YuJuncen added the type/bug The issue is confirmed as a bug. label Mar 25, 2024
ti-chi-bot bot pushed a commit that referenced this issue Mar 25, 2024
@jebter jebter added the component/br This issue is related to BR of TiDB. label Apr 11, 2024
YuJuncen added a commit to YuJuncen/tidb that referenced this issue May 6, 2024
mittalrishabh pushed a commit to mittalrishabh/tidb that referenced this issue May 6, 2024
…ingcap#64)

close pingcap#52049

Co-authored-by: 山岚 <36239017+YuJuncen@users.noreply.github.com>
mittalrishabh pushed a commit to mittalrishabh/tidb that referenced this issue May 30, 2024
close pingcap#52049

Co-authored-by: 山岚 <36239017+YuJuncen@users.noreply.github.com>