-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-25903][Core] TimerTask should be synchronized on ContextBarrierState #25897
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
cc @cloud-fan |
This comment has been minimized.
This comment has been minimized.
|
retest this please. |
|
Test build #111198 has finished for PR 25897 at commit
|
|
Test build #111197 has finished for PR 25897 at commit
|
|
cc @jiangxb1987 |
|
good catch! LGTM. Why do we need [test] in the title? |
|
Because SPARK-25903 is about one flaky test in BarrierTaskContextSuite. I found this issue when looking into the reason. Should I create another JIRA? |
|
The [test] tag indicates the PR only changes code in tests, instead of indicating this PR targets a flaky test JIRA ticket. Let's remove the test tag. |
|
Sure. Removed. Thanks. |
|
thanks, merging to master/2.4! |
…rState ### What changes were proposed in this pull request? BarrierCoordinator sets up a TimerTask for a round of global sync. Currently the run method is synchronized on the created TimerTask. But to be synchronized with handleRequest, it should be synchronized on the ContextBarrierState object, not TimerTask object. ### Why are the changes needed? ContextBarrierState.handleRequest and TimerTask.run both access the internal status of a ContextBarrierState object. If TimerTask doesn't be synchronized on the same ContextBarrierState object, when the timer task is triggered, handleRequest still accepts new request and modify requesters field in the ContextBarrierState object. It makes the behavior inconsistency. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Test locally Closes #25897 from viirya/SPARK-25903. Authored-by: Liang-Chi Hsieh <viirya@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit d50f6e6) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
|
Thank you, @viirya and @cloud-fan ! |
What changes were proposed in this pull request?
BarrierCoordinator sets up a TimerTask for a round of global sync. Currently the run method is synchronized on the created TimerTask. But to be synchronized with handleRequest, it should be synchronized on the ContextBarrierState object, not TimerTask object.
Why are the changes needed?
ContextBarrierState.handleRequest and TimerTask.run both access the internal status of a ContextBarrierState object. If TimerTask doesn't be synchronized on the same ContextBarrierState object, when the timer task is triggered, handleRequest still accepts new request and modify requesters field in the ContextBarrierState object. It makes the behavior inconsistency.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Test locally