Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

module.is_root is turning bzlmod into yet another WORKSPACE #21914

Closed
EdSchouten opened this issue Apr 6, 2024 · 4 comments
Closed

module.is_root is turning bzlmod into yet another WORKSPACE #21914

EdSchouten opened this issue Apr 6, 2024 · 4 comments
Labels
team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug under investigation

Comments

@EdSchouten
Copy link
Contributor

EdSchouten commented Apr 6, 2024

Description of the bug:

Some time last week I migrated Buildbarn from WORKSPACE to MODULE.bazel. This process went smoother than I had expected up front, so kudos to the folks implementing this.

Unfortunately, I came to the conclusion that the MODULE.bazel approach still isn't as 'modular' as I had hoped.

Which category does this issue belong to?

No response

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

For example, let's create a toy project that uses rules_rust:

$ cat > MODULE.bazel << 'EOF'
module(name = "foo")
bazel_dep(name = "rules_rust", version = "0.41.1")
rust = use_extension("@rules_rust//rust:extensions.bzl", "rust")
rust.toolchain(edition = "2021")
EOF
$ cat > BUILD.bazel << 'EOF'
load("@rules_rust//rust:defs.bzl", "rust_binary")
rust_binary(
    name = "hello",
    srcs = ["hello.rs"],
)
EOF
cat > hello.rs << 'EOF'
fn main() {
    println!("Hello World!");
}
EOF
$ bazel run //:hello
...
Hello World!

Now let's consider that some user wants to integrate this project into their organisation in the form of a tarball, so they create a Bazel project that looks like this:

cat > MODULE.bazel << 'EOF'
module(name = "bar")
bazel_dep(name = "foo", version = "0.0.0")
bazel_dep(name = "rules_pkg", version = "0.10.1")
# git_override, archive_override, or local_path_override for "foo" goes here.
EOF
cat > BUILD.bazel << 'EOF'
load("@rules_pkg//:pkg.bzl", "pkg_tar")
pkg_tar(
    name = "hello_tar",
    srcs = ["@foo//:hello"],
)
EOF

When they try to build their tarball, they get presented with the following error message:

$ bazel build //...  
...
ERROR: Traceback (most recent call last):
...
Error in fail: 
Your transitive dependency foo is using rules_rust, so you need to define a rust toolchain.
To do so, you will need to add the following to your root MODULE.bazel. For example:

bazel_dep(name = "rules_rust", version = "<rules_rust version>")

rust = use_extension("@rules_rust//rust:extensions.bzl", "rust")
rust.toolchain(
    edition = "2021",
    versions = ["1.70.2"],
)
use_repo(rust, "rust_toolchains")
register_toolchains("@rust_toolchains//:all")
...

So now the user, who couldn't care less that my tool is written in Rust, needs to copy-paste code from the original MODULE.bazel file into their own project, and make sure that that code remains up to date. In my opinion this completely undermines the problem that MODULE.bazel was trying to solve. It adds unnecessary complexity when people try to use Bazel as a meta-build system for tying multiple projects together.

rules_rust isn't the only set of rules that has such weird restrictions. I have also observed this to be an issue with toolchains_llvm.

Which operating system are you running Bazel on?

All of them

What is the output of bazel info release?

7.1.1

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.

is_root was introduced here: #15815

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

In my opinion either one of the two following things needs to be done:

  1. Non-intrusive: change is_root to return True if and only if it is the root of all modules that depend on a given module instead of the literal root of the project.
  2. Deprecate is_root and introduce a way where rules authors can compare modules along a partial order based on parent/child relationships.

In my opinion archive_override and git_override should be given a similar treatment. Namely, it's completely valid for them to get ignored when declared in a child module when the parent also has a dependency on it. But ignoring it if none of the parents/siblings/... depend on it either? That's just cruel.

@fmeum
Copy link
Collaborator

fmeum commented Apr 6, 2024

Thanks for raising this discussion. I do see great potential for Bazel working well as a meta build system in this way, but I think that this kind of modularity can only be realized if we properly design the domain-specific interactions.

I think that the ultimate root cause of the type of problem you are seeing is that "toolchains are like WORKSPACE": Most toolchains are highly configurable, making different versions hard to reconcile, and registrations have "first registered wins" semantics. Many module extensions, apparently including rules_rust and also toolchains_llvm, have taken the if not module.is_root: fail or ignore approach - most likely because they don't want to commit to any particular alternative approach for toolchain management yet. This seems prudent to me since it allows for improving modularity later without potentially backwards incompatible changes.

A pattern that I could see working here is to:

  • define and register a "good enough" default toolchain in rules_rust that is updated with rules_rust versions and can be relied upon by all modules without them having to register any toolchain of their own and
  • allow the root module to register and finetune rust toolchains as needed.

This sidesteps the issue of reconciling toolchain requirements by multiple transitive dependencies while still allowing rulesets to use Rust tools without manual setup required from the root module. Would that work in your case? @illicitonion @UebelAndre in case they want to provide their input on this.

Non-intrusive: change is_root to return True if and only if it is the root of all modules that depend on a given module instead of the literal root of the project.

While that works in your particular example, it looks to me as if this would stop working in the typical diamond situation with dependency chains A -> B -> rules_rust and A -> C -> rules_rust. This is an example of what I described above: If B and C have opinions about Rust toolchains, it's essentially impossible to reconcile them automatically. Instead, either A is tasked with defining a toolchain (the current situation) or rules_rust provides a toolchain good enough for B and C and tunable by A if desired (the "improved" approach described above).

Deprecate is_root and introduce a way where rules authors can compare modules along a partial order based on parent/child relationships.

While it's true that module extensions can currently only query the BFS order of modules using them and not the dependency edges between them, I'm not sure whether this extra information is what extensions are missing to become more modular: It already seems difficult to reconcile the fully featured toolchain requirements of two module that aren't comparable with respect to this partial order. I would rather think that domain-specific information or restrictions (say, editions/language versions/etc. instead of arbitrary toolchain features) could help do better here (say use Minimum Version Selection on the toolchain versions).

@satyanandak satyanandak added the team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. label Apr 8, 2024
@illicitonion
Copy link
Contributor

I think that the ultimate root cause of the type of problem you are seeing is that "toolchains are like WORKSPACE": Most toolchains are highly configurable, making different versions hard to reconcile, and registrations have "first registered wins" semantics. Many module extensions, apparently including rules_rust and also toolchains_llvm, have taken the if not module.is_root: fail or ignore approach - most likely because they don't want to commit to any particular alternative approach for toolchain management yet. This seems prudent to me since it allows for improving modularity later without potentially backwards incompatible changes.

A pattern that I could see working here is to:

  • define and register a "good enough" default toolchain in rules_rust that is updated with rules_rust versions and can be relied upon by all modules without them having to register any toolchain of their own and
  • allow the root module to register and finetune rust toolchains as needed.

This sidesteps the issue of reconciling toolchain requirements by multiple transitive dependencies while still allowing rulesets to use Rust tools without manual setup required from the root module. Would that work in your case? @illicitonion @UebelAndre in case they want to provide their input on this.

cc @matts1 who probably has the most context here, and introduced this requirement in bazelbuild/rules_rust#2021 including this snippet:

https://github.com/bazelbuild/rules_rust/blob/1a053cfbf9a28ef09a5594bf69a9360c8e59bc61/rust/extensions.bzl#L42-L49

    # Toolchain configuration is only allowed in the root module.
    # It would be very confusing (and a security concern) if I was using the
    # default rust toolchains, then when I added a module built on rust, I was
    # suddenly using a custom rustc.
    root = _find_root(module_ctx)

    if not root:
        fail(_TRANSITIVE_DEP_ERR % module_ctx.modules[0].name)

@matts1
Copy link
Contributor

matts1 commented Apr 8, 2024

I definitely understand your issues with it. I do also take issue with it, but I don't currently have any ideas that can improve on it - I'm open to suggestions.

I believe that the biggest problem with WORKSPACE.bazel is that it didn't work very well when you had collisions, and implementing your idea as-is would simply make those collisions come back.

To me, there were pros and cons of doing it either way. I'll outline a few of them:

Pros of explicitly doing it in root workspace:

  • Can easily make implicit default toolchains implicit later down the line if we decide it's better
  • Completely unambiguous - if you have multiple packages depending on rules_rust, you know that there is still only a single registered toolchain.

Pros of implicitly using the toolchain from the modules that you depend on:

  • Simpler to write / less verbose
  • Don't have to think about your transitive dependencies

The minimum version selection on the toolchain versions is an interesting idea, and might work for simpler use cases, it starts failing when you do anything more complex.

  • We have a few rust toolchains that we use, some of which define custom sets of rustc flags, and so merging the toolchains would be bad, and having them fight and have one win would result in spooky action at a distance based on the order in which you declared your bazel deps.
  • If we have the diamond dependency root -> foo, bar -> rules_rust, I could see a world where we want to have a transition to foo's rust toolchain whenever we depend on a target in foo, and a transition to bar's rust toolchain whenever we depend on a target in bar.

I think my conclusion is that what you're suggesting would indeed make things slightly simpler for the simple use cases, but it'd make things much harder for the more complex use cases. I personally don't think the tradeoff is worth it, but if anyone has suggestions that make this easier.

FWIW, in rules_rust, we only allow the root module to configure the toolchain, but IIRC there is a default configuration. So you should just need to register that default toolchain. I haven't ever tried your use case, so I'm not actually too familiar with how it currently works.

@Kevintjeb
Copy link

We hit this issue in rules_jsonnet in the following PR.

Our goal is to allow users to use either a Go, Cpp or Rust based Jsonnet
compiler. Depending on which compiler the user picks, Bazel will then correctly
compile it for them. This has been working nicely while rules-jsonnet only
supported Go and Cpp implementations.

The PR introduced Rust, and thus also rules_rust to be able to compile it.
To my surprise I observed the following behaviour for the users of this patch:

  • Users of the module rules_jsonnet must now always include a rules_rust
    toolchain. Regardless if they're using Rust or not.
  • Including a rules_rust dependency and toolchain in rules_jsonnet is not
    enough, as the rules_rust implementation requires the root module to define the
    toolchain.
  • Not using anything related to Rust requires a toolchain to be specified. A
    bazel_dep(name = "rules_rust") is enough to trigger the error.
  • A great error message from rules_rust on what to add and where. That was nice!

Unfortunately this behaviour blocks us from merging the PR, because we don't
want to introduce a breaking change / make users aware of our dependency
(transitive for them) on rules_rust if they only use the defaults of
rules_jsonnet with the Go compiler.

I understand that the current setup in rules_rust is a sane default, as this
allows for improvements down the line without breaking backwards compatibility.
However, I wonder what we can do to improve the setup for e.g. rules_jsonnet,
where it is similar to rules_go for example:

  • rules_jsonnet has a dependency on rules_go, because there is a Go Jsonnet
    compiler. If the user registers the intent to use the Go compiler, then
    rules_go will use the default toolchain (unless specified otherwise by the
    user) to compile it. When the user does not use Go but the Cpp implementation,
    the rules_go module does not throw an error.

We're open to a temporary solution that allows us to move forward with
rules_jsonnet and not put pressure on rules_rust to redefine how toolchains are
supposed to work with Bzlmod. I'd also be keen to chat with the rules_rust
maintainers as a user and see how we can improve the Bzlmod setup as a whole.

@bazelbuild bazelbuild locked and limited conversation to collaborators Apr 16, 2024
@meteorcloudy meteorcloudy converted this issue into discussion #22024 Apr 16, 2024

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug under investigation
Projects
None yet
Development

No branches or pull requests

9 participants