Large actions with BwtB + dynamic execution never converge on local builds #23201
Labels
P3: We're not considering working on this, but happy to review a PR. (No assignee)
team-Local-Exec: Issues and PRs for the Execution (Local) team
type: bug
Description of the bug:
The scenario is like this:
When BwtB is disabled, dynamic execution in this scenario offers a massive speedup for step 3 and our developers really want to have quick turnaround under those types of C++ edits.
However, when we enable BwtB, the build of X never converges on local execution when the network is relatively slow. The problem goes like this:
When running this build multiple times, one would expect the chain of actions to happen purely locally at some point. But that's never the case: the build of X is always remote.
The problem here is triggered by the network being relatively slow: because the downloads of some of the large inputs to X never complete, X never has a chance to even start running locally. Thus, even if running X locally would be faster overall, the remote build always finishes before Bazel has downloaded all inputs.
I think Bazel should either keep the partial downloads on disk and try to resume them on a subsequent run, or continue the downloads in the background even if the local action is cancelled. I'm not sure which is preferable, though. The former seems hard to implement, and the latter can cause problems in large builds, with Bazel ending up with a long queue of downloads to process that may ultimately be useless.
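To make the convergence argument concrete, here is a toy model of the race, not Bazel's actual scheduler. All names and numbers are hypothetical, and it assumes the inputs to X are unchanged between runs, that the remote branch always takes the same time, and that a cancelled local branch either discards or keeps whatever it had downloaded so far.

```python
def runs_until_local_wins(input_mb, bandwidth_mb_s, remote_s, local_s,
                          keep_partial, max_runs=50):
    """Return the 1-based run in which the local branch first wins, or None.

    Hypothetical model: the local branch must download the rest of X's
    inputs and then run X (local_s seconds); the remote branch always
    takes remote_s seconds. Whichever finishes first wins the race.
    """
    downloaded = 0.0  # MB of X's inputs already on disk
    for run in range(1, max_runs + 1):
        remaining = input_mb - downloaded
        local_total = remaining / bandwidth_mb_s + local_s
        if local_total < remote_s:
            return run
        # Remote wins, so the local branch is cancelled after remote_s seconds.
        if keep_partial:
            # Keep what was fetched before cancellation and resume next run.
            downloaded = min(input_mb, downloaded + bandwidth_mb_s * remote_s)
        # Otherwise the partial download is discarded and progress stays at 0.
    return None


# 2000 MB of inputs over a 10 MB/s link, remote takes 60 s, local takes 20 s.
never = runs_until_local_wins(2000, 10, 60, 20, keep_partial=False)
resumed = runs_until_local_wins(2000, 10, 60, 20, keep_partial=True)
```

Under these made-up numbers, the discard policy never lets the local branch win within the run limit, while the resume policy lets it win once enough of the inputs have accumulated on disk. The model ignores the download-queue growth concern raised for the background-download alternative.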
Which category does this issue belong to?
Local Execution
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
No response
Which operating system are you running Bazel on?
N/A
What is the output of bazel info release?
release 6.5.0
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
No response
What's the output of git remote get-url origin; git rev-parse HEAD?
No response
If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
No response