bazel 5.0.0rc2 hangs when both --bes_backend and --build_event_binary_file specified #14363
Comments
cc: @coeuvre
@meteorcloudy probably a release blocker?
Just found that part of my report was caused by a misbehaving BES.
Can you confirm this doesn't happen with Bazel 4.2.1?
Yes, it works with 4.2.1.
/cc @coeuvre Any guess on how this is caused and how to fix it?
I didn't see any suspicious commits between 5.0.0rc1 and 5.0.0rc2. Can you please share a minimal repro?
After some time I have found:
Steps to reproduce:
wget 'https://plugins.jetbrains.com/plugin/download?rel=true&updateId=147375' -O intellij_bazel.zip
unzip intellij_bazel.zip
after some time:
USE_BAZEL_VERSION=5.0.0rc2 bazel build //src/... //scala/... //private/... //scala_proto/... //junit/... //jmh/... --remote_cache=grpc://localhost:9092 --bes_backend=grpc://localhost:1985 '--override_repository=intellij_aspect=$HOME/ijwb/aspect' --output_groups=intellij-resolve-java-direct-deps,intellij-info-generic,intellij-info-java-direct-deps --aspects=@intellij_aspect//:intellij_info_bundled.bzl%intellij_info_aspect --remote_max_connections=5 --build_event_binary_file=build.log
And it hangs.
Thanks for the repro! I am looking into the fix.
Found the root cause. Working on the fix.
With the recent change to limit the maximum number of gRPC connections by default, acquiring a connection can suspend a thread if no connection is available. gRPC calls are scheduled on a dedicated background thread pool, and the workers in that pool are responsible for acquiring a connection before starting the RPC call. This allows a race condition: a worker thread handles some gRPC calls and then switches to a new call, which tries to acquire a new connection. If the number of connections has reached the maximum, the worker thread is suspended and never gets a chance to switch back to the previous calls, so the connections held by those calls are never released. This PR stops using a blocking get when acquiring gRPC connections.
Fixes bazelbuild#14363. Closes bazelbuild#14416. PiperOrigin-RevId: 416282883
Same commit message as above (cherry picked from commit ad663a7)
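The race described in the commit message can be illustrated with a small sketch. This is not Bazel's actual code: `NonBlockingPoolSketch`, `Pool`, `Lease`, and `runDemo` are hypothetical names, and the pool is a toy stand-in for a capped set of gRPC connections. It assumes one worker thread and a limit of one connection. Call A takes the only connection and finishes later on the same worker; call B is started before A finishes. Because `acquire()` returns a `CompletableFuture` instead of blocking, the worker stays free to finish A, which releases the connection and lets B start.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class NonBlockingPoolSketch {

  /** Hypothetical bounded pool: at most {@code max} leases are out at a time. */
  static final class Pool {
    private final int max;
    private int inUse = 0;
    private final Queue<CompletableFuture<Lease>> waiters = new ArrayDeque<>();

    Pool(int max) {
      this.max = max;
    }

    /** Stand-in for a connection that must be handed back exactly once. */
    final class Lease {
      private final AtomicBoolean released = new AtomicBoolean(false);

      void release() {
        if (released.compareAndSet(false, true)) {
          handBack();
        }
      }
    }

    /** Never blocks: returns a completed future, or a pending one if the pool is full. */
    synchronized CompletableFuture<Lease> acquire() {
      if (inUse < max) {
        inUse++;
        return CompletableFuture.completedFuture(new Lease());
      }
      CompletableFuture<Lease> pending = new CompletableFuture<>();
      waiters.add(pending);
      return pending;
    }

    private void handBack() {
      CompletableFuture<Lease> next;
      synchronized (this) {
        next = waiters.poll();
        if (next == null) {
          inUse--;
          return;
        }
      }
      // Pass the slot to the next waiter outside the lock.
      next.complete(new Lease());
    }
  }

  static List<String> runDemo() throws Exception {
    ExecutorService worker = Executors.newSingleThreadExecutor();
    Pool pool = new Pool(1);
    List<String> log = new CopyOnWriteArrayList<>();
    CompletableFuture<Void> done = new CompletableFuture<>();

    worker.submit(() -> {
      // Call A gets the only connection and finishes later on the same worker.
      pool.acquire().thenAccept(leaseA -> {
        log.add("A started");
        worker.submit(() -> {
          log.add("A finished");
          leaseA.release();
        });
      });
      // The worker switches to call B before A has finished. A blocking
      // pool.acquire().get() here would park the only worker forever,
      // because A could then never run its release.
      pool.acquire().thenAccept(leaseB -> {
        log.add("B started");
        leaseB.release();
        done.complete(null);
      });
    });

    done.get(5, TimeUnit.SECONDS);
    worker.shutdown();
    return log;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(runDemo()); // prints [A started, A finished, B started]
  }
}
```

Replacing the second `thenAccept` with a blocking `get()` reproduces the shape of the hang in this toy setting: the single worker waits for a connection that only it could release.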
Description of the problem / feature request:
--bes_backend is specified in our .bazelrc, and --build_event_binary_file is added by the IntelliJ plugin. When both are specified, Bazel sometimes hangs.
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
What operating system are you running Bazel on?
MacOS, Linux
What's the output of bazel info release?
release 5.0.0rc2
Have you found anything relevant by searching the web?
Nothing similar
Any other information, logs, or outputs that you want to share?
Thread dump of bazel when it hangs:
https://gist.github.com/darl/94141514150e09b2460030f283cc9f21