-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[Fix][Broker] Fix race condition between timeout and completion in OpAddEntry
#15233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Good catch! btw. previous revisit to the solution was in #10740 |
Jason918
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
michaeljmarshall
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, and thank you for the great analysis @mattisonchao!
(cherry picked from commit b083e9a)
(cherry picked from commit b083e9a)
(cherry picked from commit b083e9a)
|
@lhotari this pull must be ported to branch-2.10 also, thanks |
(cherry picked from commit b083e9a)
(cherry picked from commit b083e9a)
(cherry picked from commit b083e9a)
(cherry picked from commit b083e9a)
Motivation
In #4455 and #10740, The
OpAddEntrytry to usingopAddEntry.addOpCountto ensure theopAddEntrywe got is what we want.(avoid recycle cause race condition)It seems exists a case causing another race condition. consider the case as follow:
Precondition
OpAddEntrythat name is A.OpAddEntry.pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java
Lines 3825 to 3840 in caf7648
Race condition process
OpAddEntryA and continues to run until line 3835.OpAddEntryA, completes and recycles.OpAddEntryA is taken out of the recycling pool by other threads asOpAddEntryB. But the timeout thread has passed the timeout check(line 3834), andaddOpCountis also obtained from OpAddEntry B, which will cause thecompareAndSetcheck to pass.Affect
OpAddEntrycomplete timeout.Modifications
opAddCountbefore timeout judgment.Verifying this change
Documentation
no-need-doc