-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix deadlock when ryuk does not acknowledge filters #843
Fix deadlock when ryuk does not acknowledge filters #843
Conversation
3cf99b9
to
63293a9
Compare
@thammerl the change looks ok in general, thanks! |
@bsideup I added unit tests for the filter registration logic in 8e71ac6. In order to be able to implement unit tests, I had to extract a With 8e71ac6 the Travis CI build began to fail due to a failing Couchbase integration test. I'm not sure whether this is really related to my change. The gradle task UPDATE: The build became stable again with 7881d05. So I guess the tests are just flaky. |
@thammerl the failure was unrelated to the change and got fixed after I re-triggered the tests :) |
We have this out in a Release Candidate build (1.9.0-rc1) for anyone who is keen to try it! |
I've tried it on Codeship, but the error is still present
|
I'm still seeing the error locally when I run tests too. It's intermittent (sometimes it does work) but it's failing probably more than 50% of the time. I'm running Ubuntu 18.04 and when it fails it just spams
and a stacktrace in the output for a while before shutting down. The ryuk container lasts a while longer (up to a minute) and the only logs it outputs are
|
I am seeing the same issue as @toadzky with this error repeating over and over again when running a postgres container with testcontainers 1.14.3
|
@cmbernard333 That's an unrelated issue to this one. Please see #3166 |
This PR should fix the issues #581 and #780.
In case ryuk cannot parse the request, no acknowledgement is sent resulting in a deadlock as
ResourceReaper
ignores the end of the stream and will continue waiting for one. This can happen under heavy load.See relevant code in ryuk. From the documentation of
reader.readString
:The thread in
ResourceReaper
needs to check for the return valuenull
when callingBufferedReader#readLine
in order to check for the closed stream. From the official documentation of that method:I could reproduce the issue by building a patched version of testcontainers 1.8.3 release with extended logging resulting in the following log output (I truncated the file):
ryuk_endless_loop.txt
As you can see, the thread gets stuck trying to read from the ryuk socket waiting for an acknowledgement.
Can someone think of a way on how to implement an automated test for that?