Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RBE actions not properly canceled when BES upload times out #19496

Open
minor-fixes opened this issue Sep 12, 2023 · 5 comments
Open

RBE actions not properly canceled when BES upload times out #19496

minor-fixes opened this issue Sep 12, 2023 · 5 comments
Assignees
Labels
stale Issues or PRs that are stale (no activity for 30 days) team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug

Comments

@minor-fixes
Copy link
Contributor

minor-fixes commented Sep 12, 2023

Description of the bug:

In (at least some) cases, BES upload timing out does not seem to trigger proper cancellation of actions issued over RBE. In my case, I'm seeing:

FAILED: Build did NOT complete successfully
WARNING: Uploading BEP referenced local file: UNAVAILABLE: io exception
WARNING: Uploading BEP referenced local file /builder/home/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/command.profile.gz hash: "2aa3acf372336923a8ed69b9f0878489764153c0de2ac3039b10c92296bcdf2b"
size_bytes: 3461595
: UNAVAILABLE: io exception
FAILED: Build did NOT complete successfully
ERROR: The Build Event Protocol upload timed out. com.google.common.util.concurrent.TimeoutFuture$TimeoutFutureException: Timed out: NonCancellationPropagatingFuture@2eac302a[status=PENDING, info=[delegate=[SettableFuture@6baebb9b[status=PENDING]]]]

and noticing that actions continue to run in Buildbarn despite the bazel client (and indeed the VM it was running on) go away. From a cursory look at the BES code, it would seem that abruptly exiting on BES timeout is the intent.

It's possible that only RBE actions that required a retry are not properly canceled - this seems to be a pattern, and I don't yet have a counterexample.

Since bazel is exiting semi-cleanly (i.e. not getting SIGKILL) it should cancel actions executing via RBE, in order to free up remote resources.

Which category does this issue belong to?

No response

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Unfortunately, I'm unable to reproduce this issue outside of a nightly long-running build; any tips on gathering more debugging info would be greatly appreciated!

Which operating system are you running Bazel on?

linux

What is the output of bazel info release?

release 6.3.2

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

(private repository)

Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.

No response

Have you found anything relevant by searching the web?

No; no ideas from those on Slack yet either

Any other information, logs, or outputs that you want to share?

No response

@tjgq
Copy link
Contributor

tjgq commented Sep 12, 2023

Running with --verbose_failures should reveal the strack trace associated with the UNAVAILABLE: io exception and might provide additional clues.

You also mention that RBE action retries might be a factor - how did you determine that an action was retried?

Regarding cancellation: my understanding is that, upon a build failure (which an abrupt exit qualifies as), Bazel cancels any pending gRPC requests to the remote execution server. But it's up to the remote side whether to actually abort the execution (aborting is not necessarily the best option; letting the action finish could still be advantageous to populate the remote cache for a subsequent build). So it would also help to know what's the expected behavior for Buildbarn in this situation.

@minor-fixes
Copy link
Contributor Author

minor-fixes commented Sep 12, 2023

Running with --verbose_failures should reveal the strack trace associated with the UNAVAILABLE: io exception and might provide additional clues.

Thanks! I'll give this a shot and report back tomorrow

You also mention that RBE action retries might be a factor - how did you determine that an action was retried?

By looking at RPC traces - I can see that a particular action was started via Execute at time X, trace finishes at time Y, and a second Execute trace for the same target + mnemonic starts at time Y; at the same time, I can see from remote worker pod metrics that a worker pod went away at time Y, and can infer that the pod went away unexpectedly (OoM, cloud spot instance reclaim, etc.), buildbarn reported the failure back to the bazel client, and bazel issued a retry (not conclusive admittedly; but this seems highly probable?)

it would also help to know what's the expected behavior for Buildbarn in this situation

From what I understand, Buildbarn will continue executing actions as long as a client that is executing it has an open session. Buildbarn will deduplicate identical actions, but that is not a factor here, as:

  • these actions are unlikely to be deduplicated (they come from targets which are particularly heavy and usually only are run in this build)
  • I can see from Buildbarn UIs that the actions are still associated with the invocation ID of the build that has gone away

If no clients care about the action, Buildbarn will time out the action in relatively short time (currently 60s for us; the action will otherwise continue to run for 1+ hours)

Additionally, when I Ctrl-C an RBE build, typically I can see RPCs error with a deadline exceeded-type error; when this long-running build gets SIGINT and exits, I don't see cancellation reflected in RPC traces in this way.

We have previously run builds in an environment where on build timeout the VM is abruptly killed; this would cause actions to continue running on buildbarn, because the TCP sessions weren't properly closed (might be using the wrong words here) - I may be able to fix this with TCP keepalive configuration on buildbarn, but it seems that maybe bazel is not shutting down connections as expected?

@iancha1992 iancha1992 added the team-Remote-Exec Issues and PRs for the Execution (Remote) team label Sep 12, 2023
@minor-fixes
Copy link
Contributor Author

Here's the full stacktrace:

FAILED: Build did NOT complete successfully
WARNING: Uploading BEP referenced local file: UNAVAILABLE: io exception
java.io.IOException: findMissingBlobs(1) for bes-upload: UNAVAILABLE: io exception
	at com.google.devtools.build.lib.remote.GrpcCacheClient.lambda$findMissingDigests$1(GrpcCacheClient.java:222)
	at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.doFallback(AbstractCatchingFuture.java:203)
	at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.doFallback(AbstractCatchingFuture.java:190)
	at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:133)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
	at com.google.common.util.concurrent.AggregateFuture.handleException(AggregateFuture.java:200)
	at com.google.common.util.concurrent.AggregateFuture.collectValueFromNonCancelledFuture(AggregateFuture.java:270)
	at com.google.common.util.concurrent.AggregateFuture.lambda$init$0(AggregateFuture.java:149)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at com.google.common.util.concurrent.AbstractFuture.setFuture(AbstractFuture.java:851)
	at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:214)
	at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:190)
	at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:142)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at com.google.common.util.concurrent.AbstractFuture.setFuture(AbstractFuture.java:851)
	at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:214)
	at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:190)
	at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:142)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
	at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:104)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
	at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:55)
	at com.google.devtools.build.lib.remote.util.RxFutures$2.onError(RxFutures.java:257)
	at io.reactivex.rxjava3.internal.operators.single.SingleFlatMap$SingleFlatMapCallback$FlatMapSingleObserver.onError(SingleFlatMap.java:117)
	at io.reactivex.rxjava3.internal.operators.single.SingleUsing$UsingSingleObserver.onError(SingleUsing.java:180)
	at io.reactivex.rxjava3.internal.operators.single.SingleCreate$Emitter.tryOnError(SingleCreate.java:95)
	at io.reactivex.rxjava3.internal.operators.single.SingleCreate$Emitter.onError(SingleCreate.java:81)
	at com.google.devtools.build.lib.remote.util.RxFutures$OnceSingleOnSubscribe$1.onFailure(RxFutures.java:172)
	at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1132)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
	at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:563)
	at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:533)
	at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
	at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
	at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
	at com.google.devtools.build.lib.remote.NetworkTimeInterceptor$NetworkTimeCall$1.onClose(NetworkTimeInterceptor.java:81)
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:562)
	at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:743)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:722)
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
	at io.grpc.Status.asRuntimeException(Status.java:535)
	... 14 more
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: buildbarn.local.enfabrica.net/127.0.0.1:8002
Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused
	at io.netty.channel.unix.Errors.newConnectException0(Errors.java:155)
	at io.netty.channel.unix.Errors.handleConnectErrno(Errors.java:128)
	at io.netty.channel.unix.Socket.finishConnect(Socket.java:320)
	at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:710)
	at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:687)
	at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:567)
	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:470)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Unknown Source)
WARNING: Uploading BEP referenced local file /builder/home/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/command.profile.gz hash: "5d21225aa1501a67f7de355070ff27823546449a8a3cc3343ad4defc38d961e9"
size_bytes: 3319577
: UNAVAILABLE: io exception
java.io.IOException: Error while uploading artifact with digest '5d21225aa1501a67f7de355070ff27823546449a8a3cc3343ad4defc38d961e9/3319577'
	at com.google.devtools.build.lib.remote.ByteStreamUploader.lambda$uploadBlobAsync$0(ByteStreamUploader.java:171)
	at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.doFallback(AbstractCatchingFuture.java:203)
	at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.doFallback(AbstractCatchingFuture.java:190)
	at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:133)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at com.google.common.util.concurrent.AbstractFuture.setFuture(AbstractFuture.java:851)
	at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:124)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
	at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:104)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at com.google.common.util.concurrent.AbstractFuture.setFuture(AbstractFuture.java:851)
	at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:214)
	at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:190)
	at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:142)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at com.google.common.util.concurrent.AbstractFuture.setFuture(AbstractFuture.java:851)
	at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:214)
	at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:190)
	at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:142)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
	at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:104)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
	at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:104)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at com.google.common.util.concurrent.AbstractFuture.setFuture(AbstractFuture.java:851)
	at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:214)
	at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:190)
	at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:142)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
	at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:104)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
	at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:55)
	at com.google.devtools.build.lib.remote.util.RxFutures$2.onError(RxFutures.java:257)
	at io.reactivex.rxjava3.internal.operators.single.SingleFlatMap$SingleFlatMapCallback$FlatMapSingleObserver.onError(SingleFlatMap.java:117)
	at io.reactivex.rxjava3.internal.operators.single.SingleUsing$UsingSingleObserver.onError(SingleUsing.java:180)
	at io.reactivex.rxjava3.internal.operators.single.SingleCreate$Emitter.tryOnError(SingleCreate.java:95)
	at io.reactivex.rxjava3.internal.operators.single.SingleCreate$Emitter.onError(SingleCreate.java:81)
	at com.google.devtools.build.lib.remote.util.RxFutures$OnceSingleOnSubscribe$1.onFailure(RxFutures.java:172)
	at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1132)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
	at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:563)
	at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:533)
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:562)
	at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:743)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:722)
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
	at io.grpc.Status.asRuntimeException(Status.java:535)
	... 10 more
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: buildbarn.local.enfabrica.net/127.0.0.1:8002
Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused
	at io.netty.channel.unix.Errors.newConnectException0(Errors.java:155)
	at io.netty.channel.unix.Errors.handleConnectErrno(Errors.java:128)
	at io.netty.channel.unix.Socket.finishConnect(Socket.java:320)
	at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:710)
	at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:687)
	at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:567)
	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:470)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Unknown Source)
FAILED: Build did NOT complete successfully
ERROR: The Build Event Protocol upload timed out. com.google.common.util.concurrent.TimeoutFuture$TimeoutFutureException: Timed out: NonCancellationPropagatingFuture@327e51a7[status=PENDING, info=[delegate=[SettableFuture@777cc98c[status=PENDING]]]]

The portion findMissingBlobs(1) for bes-upload is strange to me - is it calling FindMissingBlobs on the RBE endpoint, or the BES endpoint?

@coeuvre
Copy link
Member

coeuvre commented Sep 14, 2023

It's calling FindMissingBlobs on RBE endpoint because it needs to upload BEP referenced artifacts to CAS.

Copy link

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs. If you think this issue is still relevant and should stay open, please post any comment here and the issue will no longer be marked as stale.

@github-actions github-actions bot added the stale Issues or PRs that are stale (no activity for 30 days) label Nov 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale Issues or PRs that are stale (no activity for 30 days) team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug
Projects
None yet
Development

No branches or pull requests

7 participants