Description
Today, remote executors expect tool inputs to be linux_x86 (sometimes linux_arm64) and run them in the container image given by the action's exec_properties (requiring that the image was registered in the Terraform config ahead of time).
As an example, an action could specify constraints that match an exec platform triple like:

```starlark
constraint_values = [
    "@platforms//os:linux",
    "@platforms//cpu:aarch64",
    "@//platforms/linkers:musl",
],
exec_properties = dict(
    {
        "OSFamily": "Linux",
        # Must match the playwright entry within deployments/prod/aws/silo/workflows.tf & deployments/prod/gcp/silo/workflows.tf
        "container-image": "docker://mcr.microsoft.com/playwright:v1.50.1-jammy@sha256:8845c40cdade98fd7a6cd32df75bfc234cd52b3278f9cd1f9fe8d6291e48ea03",
    },
),
```
Proposal: WASM executors
If an action is constrained to match a platform with constraint_values=["@platforms//os:wasi", "@platforms//cpu:wasm64"] then it should be scheduled on an executor that provides:
- a WASM runtime. https://wazero.io/ would be one choice: it's pure Go, matching the language bb-remote-execution is already written in
- performance benefits. The executor pool could keep caches for whatever artifacts the chosen runtime produces to optimize execution. For example, https://wasmtime.dev/ can compile WASM down to native machine code, which means an executor can run such an action nearly as fast as a native binary
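To make the proposal concrete, here is a sketch of what such an executor-pool platform definition might look like. The platform name and the `wasm-runtime` exec_property key/value are assumptions for illustration; only the `@platforms` constraint labels are real.

```starlark
# Hypothetical platform for routing actions to WASM-capable executors.
# The "wasm-runtime" exec_property is an assumed scheduler hint, not an
# existing convention; @platforms//os:wasi and @platforms//cpu:wasm64 are
# real constraint values from the platforms repo.
platform(
    name = "wasm_executor",
    constraint_values = [
        "@platforms//os:wasi",
        "@platforms//cpu:wasm64",
    ],
    exec_properties = {
        # Replaces "container-image": no container is needed, just a runtime.
        "wasm-runtime": "wazero",
    },
)
```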
Why this is desirable
Rule authors have to find tools they can spawn. In many cases, the ideal tool does not exist - rules_img is a recent example where authors decided to publish their own.
While Bazel could compile these tools in the end-user's build, that's often undesirable because of the extra time taken on a cache miss (which happens very often; see my rants about protoc, for example).
Publishing pre-built binaries is our current answer. This requires a binary for "each platform", but in practice we only support Linux/macOS on two CPU architectures for most rules. Creating these is difficult: the Rust cross-compile story on LLVM is complex, and we sometimes give up and run releases on multiple GitHub Actions OSes. It also requires the toolchain dance in the end-user's build to download only the tools needed for the execution platform.
Targeting only wasm64 is easier. In addition, it's more likely that upstream tool maintainers would accept a small patch to support Bazel natively, so rule authors aren't in the position of maintaining pre-builds of someone else's code. Many JS and Python tools are being rewritten in Rust but only provide a CLI in the user's language, and rulesets don't want to spawn a NodeJS or Python interpreter just to run a few lines of shim code before reaching across a WASM runtime to the actual implementation in a .wasm file.
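For contrast with the per-platform toolchain dance, here is a sketch of what shipping a single wasm64 artifact could look like. The repository name, URL, and file name are placeholders, not a real release:

```starlark
# One fetch serves every host and execution platform, replacing the usual
# N-way select() over os/cpu pairs. URL and name are illustrative only.
http_file(
    name = "my_tool_wasm",
    url = "https://example.com/releases/my-tool.wasm",  # placeholder
)
```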
Cache benefit
Cache keys (digestKey) contain the tools. Despite the existence of https://registry.build/flag/bazel/?flag=experimental_remote_scrubbing_config, we haven't done the "science experiment" of getting macOS developers to speed up builds with remote cache hits produced on their Linux CI workers.
We could do so, but the Bazel team advises strongly against this mechanism (which they invented for javac) because all correctness guarantees are lost.
A WASM tool is of course portable to local (aka host) execution as well, provided a runtime is available (I'm prototyping this in https://registry.bazel.build/docs/bazel_lib/3.0.0#lib-run_binary-bzl right now). That means we can finally give Python and Java developers the fast local builds powered by the remote cache, which many of them expect Bazel to provide, only to be disappointed later (pretty frequently by me, on a sales call).
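A sketch of what local execution of a WASM tool through run_binary might look like. The load path matches bazel_lib's documented run_binary rule, but the `//tools:my_tool` target (assumed to wrap a .wasm file plus a runtime launcher) is hypothetical:

```starlark
load("@aspect_bazel_lib//lib:run_binary.bzl", "run_binary")

# Hypothetical: my_tool wraps a .wasm module and invokes a locally
# available runtime, so the same action runs identically on the host
# and on a WASM-capable remote executor.
run_binary(
    name = "gen",
    tool = "//tools:my_tool",
    srcs = ["input.txt"],
    outs = ["output.txt"],
    args = [
        "$(location input.txt)",
        "$(location output.txt)",
    ],
)
```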