Go rules run cpp toolchain from cgo_context_data on an incompatible execution platform #4127

Open
jesses-canva opened this issue Oct 2, 2024 · 0 comments · May be fixed by #4128

What version of rules_go are you using?

v0.47

What version of gazelle are you using?

v0.36.0

What version of Bazel are you using?

7.3.0

Does this issue reproduce with the latest releases of all the above?

Untested

What operating system and processor architecture are you using?

Ubuntu 22.04 x86_64

Any other potentially useful information about your toolchain?

Remote execution

What did you do?

We have a remote execution setup with two execution platforms (registered with --host_platform and --extra_execution_platforms). A Go toolchain is registered and compatible with both execution platforms, but only the second execution platform has a compatible cpp toolchain registered.

Execution platform    | Available toolchains
----------------------+----------------------------------------------------------------------------
1. A                  | @io_bazel_rules_go//go:toolchain
2. B                  | @io_bazel_rules_go//go:toolchain, @bazel_tools//tools/cpp:toolchain_type

We observed that rules_go tries to use the cpp toolchain, which is only compatible with the second execution platform, in an action running on the first one. The action fails because the cpp toolchain isn't installed on that platform.

Note that the execution platform ordering is important: the error occurs because Bazel prefers the first platform if it considers it compatible with the action.
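
To illustrate why the ordering matters, here is a minimal Python sketch of platform selection. It is a simplified model, not Bazel's actual resolution code: it assumes Bazel picks the first registered execution platform that can satisfy every toolchain type a rule declares. The function name `select_execution_platform` and the data layout are hypothetical; the platform names and toolchain labels come from the table above.

```python
# Simplified model of execution platform selection; not Bazel's real implementation.
GO = "@io_bazel_rules_go//go:toolchain"
CC = "@bazel_tools//tools/cpp:toolchain_type"

# Ordered as registered: earlier entries are preferred.
EXECUTION_PLATFORMS = [
    ("A", {GO}),
    ("B", {GO, CC}),
]


def select_execution_platform(required_types, platforms=EXECUTION_PLATFORMS):
    """Return the first platform that has a toolchain for every required type."""
    for name, available in platforms:
        if set(required_types) <= available:
            return name
    raise LookupError("no execution platform satisfies %r" % sorted(required_types))


if __name__ == "__main__":
    # A rule that only declares the Go toolchain lands on the first platform.
    print(select_execution_platform({GO}))
    # A rule that also declares the cpp toolchain type is pushed to B.
    print(select_execution_platform({GO, CC}))
    # With the registration order reversed, the Go-only rule lands on B.
    print(select_execution_platform({GO}, list(reversed(EXECUTION_PLATFORMS))))
```

In this model a rule that declares only the Go toolchain is always placed on A, which matches the behaviour described in this report.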

What did you expect to see?

Successful build

What did you see instead?

ERROR: /var/lib/blah/bazel/a8584ebfb3d6ff0dfe61abfbfa5bb4d3/external/io_bazel_rules_go/BUILD.bazel:42:7: GoStdlib external/io_bazel_rules_go/stdlib_/pkg failed: (Exit 1): builder failed: error executing GoStdlib command (from target @@io_bazel_rules_go//:stdlib)
...
cgo: C compiler "/usr/bin/clang-13" not found: exec: "/usr/bin/clang-13": stat /usr/bin/clang-13: no such file or directory

Discussion

The underlying cause is that the stdlib target depends on the cgo_context_data target, and cgo_context_data depends on the cpp toolchain, so its execution platform is constrained to the platforms compatible with the selected toolchain. However, cgo_context_data does not execute the compiler itself; it only returns the path to the compiler in its provider.

The rule that actually executes the compiler is stdlib, but it has no dependency on the cpp toolchain, so Bazel doesn't know that it has to run on a platform compatible with the cpp toolchain. Bazel therefore defaults to the first platform, and the action fails because /usr/bin/clang-13 doesn't exist there.

The patch we have applied to rules_go adds the cpp toolchain dependency to all the rules that depend on cgo_context_data, so that their execution platform is also constrained to the platforms compatible with the selected cpp toolchain. (#4128)
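
The failure and the effect of the patch can be sketched with a small Python model. This is an illustration under stated assumptions, not rules_go or Bazel code: `exec_platform`, `run_stdlib`, and the `INSTALLED` map are hypothetical, and the model assumes Bazel picks the first registered platform that satisfies every toolchain type a rule declares.

```python
# Simplified model of the failure and the patch; not Bazel's real implementation.
GO = "@io_bazel_rules_go//go:toolchain"
CC = "@bazel_tools//tools/cpp:toolchain_type"

# Registered execution platforms in priority order, with the toolchain
# types each one can satisfy (taken from the table in the report).
PLATFORMS = [("A", {GO}), ("B", {GO, CC})]

# Assumption for illustration: which compiler binaries exist on each platform.
INSTALLED = {"A": set(), "B": {"/usr/bin/clang-13"}}

# Path that cgo_context_data stores in its provider after resolving the
# cpp toolchain (only platform B is compatible with that toolchain).
COMPILER_PATH = "/usr/bin/clang-13"


def exec_platform(required_types):
    """First registered platform that satisfies every required toolchain type."""
    return next(name for name, types in PLATFORMS if required_types <= types)


def run_stdlib(required_types):
    """Return (platform, compiler_found) for the GoStdlib action."""
    platform = exec_platform(required_types)
    return platform, COMPILER_PATH in INSTALLED[platform]


if __name__ == "__main__":
    # Before the patch: stdlib only declares the Go toolchain.
    print(run_stdlib({GO}))      # ('A', False)
    # After the patch: stdlib also declares the cpp toolchain type.
    print(run_stdlib({GO, CC}))  # ('B', True)
```

Before the patch the model places the action on A, where the compiler path recorded by cgo_context_data does not exist. Once the rule declares the cpp toolchain type as well, only B qualifies and the path resolves.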

An even better fix would be to make cgo_context_data a toolchain itself so that it influences the execution platform of the rules that depend on it, but that would be a bigger diff.
