Skip to content

[SPARK-3116] Remove the excessive lockings in TorrentBroadcast #2028

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

rxin
Copy link
Contributor

@rxin rxin commented Aug 19, 2014

No description provided.

@rxin
Copy link
Contributor Author

rxin commented Aug 19, 2014

cc @mosharaf @shivaram

@SparkQA
Copy link

SparkQA commented Aug 19, 2014

QA tests have started for PR 2028 at commit 92c62a5.

  • This patch merges cleanly.

@shivaram
Copy link
Contributor

Can you clarify what is the guarantee that we are trying to maintain in this class ? Is it that if two executor threads try to read the same broadcast variable we want it to be safe ?

@rxin
Copy link
Contributor Author

rxin commented Aug 19, 2014

Yes.

@shivaram
Copy link
Contributor

LGTM.Alright I looked through the code and tried to verify this change was safe. Just summarizing what I saw while walking through the code (in case it helps in the future)

  1. The first synchronized is in the constructor which two threads cannot call at the same time
  2. sendBroadcast is only called from constructor, so that is safe to remove
  3. receiveBroadcast is only called from a synchronized block in readObject, so that should be safe to remove (however this should be free in java, to re-acquire an already held lock)
  4. unpersist doesn't touch the class, so it is thread-safe (assuming BlockManager's readBroadcast is thread-safe -- I didn't check that)

@SparkQA
Copy link

SparkQA commented Aug 19, 2014

QA tests have finished for PR 2028 at commit 92c62a5.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Params(input: String = "data/mllib/sample_linear_regression_data.txt")
    • case class Params(input: String = "data/mllib/sample_linear_regression_data.txt")
    • case class Params(input: String = "data/mllib/sample_binary_classification_data.txt")

@rxin
Copy link
Contributor Author

rxin commented Aug 19, 2014

Ok I'm merging this in master & branch-1.1.

asfgit pushed a commit that referenced this pull request Aug 19, 2014
Author: Reynold Xin <rxin@apache.org>

Closes #2028 from rxin/torrentBroadcast and squashes the following commits:

92c62a5 [Reynold Xin] Revert the MEMORY_AND_DISK_SER changes.
03a5221 [Reynold Xin] [SPARK-3116] Remove the excessive lockings in TorrentBroadcast

(cherry picked from commit 8257733)
Signed-off-by: Reynold Xin <rxin@apache.org>
@asfgit asfgit closed this in 8257733 Aug 19, 2014
@JoshRosen
Copy link
Contributor

The first synchronized is in the constructor which two threads cannot call at the same time

Actually, I think two threads could get into the constructor simultaneously, but it wouldn't be a problem because the calls would be for different broadcast ids. ( new TorrentBroadcast is only called from TorrentBroadcastFactory.newBroadcast, which is only called from BroadcastManager.newBroadcast. This, in turn, is only called from SparkContext.broadcast, which isn't synchronized.)

@rxin rxin deleted the torrentBroadcast branch August 19, 2014 04:08
@shivaram
Copy link
Contributor

Hmm -- according to JLS [1], objects are only made available to other threads after the constructor finishes. So technically only one thread can call an object's constructor. However constructors are not thread safe if we expose a reference to this to other classes / threads [2].

In this case I didn't see the reference leaking, but I guess it doesn't hurt to track down the callers to verify things.

[1] http://docs.oracle.com/javase/specs/jls/se5.0/html/classes.html#8.8.3
[2] http://stackoverflow.com/questions/4880168/why-cant-java-constructors-be-synchronized

@JoshRosen
Copy link
Contributor

Regarding the first synchronized in that constructor, my understanding is that the original TorrentBroadcast.synchronized was guarding the writing of the broadcast value to the BlockManager. I think that this synchronization is unnecessary not because we don't leak this, but because BlockManager is thread-safe and the uniqueness of broadcast ids means that we'll never have parallel attempts to write the same block.

My point was that multiple TorrentBroadcast objects can be simultaneously under construction, but that it's safe because they'll have distinct ids; I think preventing leakage of a partially-constructed object is a separate issue that doesn't come into play here.

xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
Author: Reynold Xin <rxin@apache.org>

Closes apache#2028 from rxin/torrentBroadcast and squashes the following commits:

92c62a5 [Reynold Xin] Revert the MEMORY_AND_DISK_SER changes.
03a5221 [Reynold Xin] [SPARK-3116] Remove the excessive lockings in TorrentBroadcast
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants