
Conversation

tomwilkie
Contributor

This explains #881: if we try to take a write lock during this window, we lock up the reporter thread, and that causes a bunch of data structures to grow linearly.

We should also stop these data structures from growing linearly.

@2opremio
Contributor

2opremio commented Feb 1, 2016

Looking at the sources of RLock, I can confirm that a goroutine holding the read lock will deadlock if it calls RLock again while another goroutine has called Lock between the initial read-lock acquisition and the second RLock call. This is because pending write locks have higher priority than read locks (even for a goroutine that already holds the read lock).

This is not explicit in the documentation and is not intuitive either; in fact, the docs originally made me think that this was OK:

The lock can be held by an arbitrary number of readers or a single writer.

so I will create a PR upstream.
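
For reference, here is a minimal sketch (not code from this PR) of the pattern described above: goroutine A holds the read lock, goroutine B queues a write lock, and A's second RLock then blocks behind B while B waits for A's first RUnlock.

```go
package main

import (
	"sync"
	"time"
)

func main() {
	var mu sync.RWMutex
	done := make(chan struct{})

	go func() {
		mu.RLock() // first read lock
		defer mu.RUnlock()

		// Give the writer time to queue up behind us.
		time.Sleep(100 * time.Millisecond)

		mu.RLock() // blocks: a pending Lock() stops new readers, even in this goroutine
		mu.RUnlock()
		close(done)
	}()

	go func() {
		time.Sleep(50 * time.Millisecond)
		mu.Lock() // queued writer; waits for the first RLock to be released
		mu.Unlock()
	}()

	select {
	case <-done:
	case <-time.After(time.Second):
		println("deadlock: recursive RLock blocked behind a pending writer")
	}
}
```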

But first, I will deploy this fix to ECS to see whether the problem goes away.

@tomwilkie tomwilkie mentioned this pull request Feb 1, 2016
@2opremio
Contributor

2opremio commented Feb 1, 2016

This seems to fix #881, since I can no longer reproduce the deadlock (or the probe's high memory consumption). However, although there is no deadlock, the behaviour of Scope is still pretty erratic in the ECS demo: containers appear and disappear all the time.

2opremio pushed a commit that referenced this pull request Feb 1, 2016
Taking a read lock twice only works most of the time.
@2opremio 2opremio merged commit ed56261 into master Feb 1, 2016
@2opremio 2opremio deleted the 881-probe-locking branch February 1, 2016 20:41