Skip to content

Conversation

@viirya
Copy link
Member

@viirya viirya commented Sep 23, 2019

What changes were proposed in this pull request?

BarrierCoordinator sets up a TimerTask for a round of global sync. Currently the run method is synchronized on the created TimerTask. But to be synchronized with handleRequest, it should be synchronized on the ContextBarrierState object, not TimerTask object.

Why are the changes needed?

ContextBarrierState.handleRequest and TimerTask.run both access the internal status of a ContextBarrierState object. If TimerTask doesn't be synchronized on the same ContextBarrierState object, when the timer task is triggered, handleRequest still accepts new request and modify requesters field in the ContextBarrierState object. It makes the behavior inconsistency.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Test locally

@viirya
Copy link
Member Author

viirya commented Sep 23, 2019

cc @cloud-fan

@SparkQA

This comment has been minimized.

@viirya
Copy link
Member Author

viirya commented Sep 23, 2019

retest this please.

@SparkQA
Copy link

SparkQA commented Sep 23, 2019

Test build #111198 has finished for PR 25897 at commit a5b5f70.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 23, 2019

Test build #111197 has finished for PR 25897 at commit a5b5f70.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member Author

viirya commented Sep 23, 2019

cc @jiangxb1987

@cloud-fan
Copy link
Contributor

good catch! LGTM. Why do we need [test] in the title?

@viirya
Copy link
Member Author

viirya commented Sep 23, 2019

Because SPARK-25903 is about one flaky test in BarrierTaskContextSuite. I found this issue when looking into the reason. Should I create another JIRA?

@cloud-fan
Copy link
Contributor

The [test] tag indicates the PR only changes code in tests, instead of indicating this PR targets a flaky test JIRA ticket.

Let's remove the test tag.

@viirya viirya changed the title [SPARK-25903][Core][Test] TimerTask should be synchronized on ContextBarrierState [SPARK-25903][Core] TimerTask should be synchronized on ContextBarrierState Sep 23, 2019
@viirya
Copy link
Member Author

viirya commented Sep 23, 2019

Sure. Removed. Thanks.

@cloud-fan cloud-fan closed this in d50f6e6 Sep 23, 2019
@cloud-fan
Copy link
Contributor

thanks, merging to master/2.4!

cloud-fan pushed a commit that referenced this pull request Sep 23, 2019
…rState

### What changes were proposed in this pull request?

BarrierCoordinator sets up a TimerTask for a round of global sync. Currently the run method is synchronized on the created TimerTask. But to be synchronized with handleRequest, it should be synchronized on the ContextBarrierState object, not TimerTask object.

### Why are the changes needed?

ContextBarrierState.handleRequest and TimerTask.run both access the internal status of a ContextBarrierState object. If TimerTask doesn't be synchronized on the same ContextBarrierState object, when the timer task is triggered, handleRequest still accepts new request and modify  requesters field in the ContextBarrierState object. It makes the behavior inconsistency.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Test locally

Closes #25897 from viirya/SPARK-25903.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit d50f6e6)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@dongjoon-hyun
Copy link
Member

Thank you, @viirya and @cloud-fan !

@viirya viirya deleted the SPARK-25903 branch December 27, 2023 18:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants