Can't use include_str!() macro with Bazel-generated data files #459

Closed
dwtj opened this issue Oct 21, 2020 · 6 comments · Fixed by #468

Comments

dwtj commented Oct 21, 2020

The Problem

The data attribute appears on each of the core Rust rules (e.g. rust_binary). The documentation for data says

This attribute can be used to specify any data files that are embedded into the library, such as via the include_str! macro.

I have found that include_str!() works only for "source" files (i.e. files checked into the Bazel source tree). It does not work if the data files are generated by some other Bazel target.

Example

To demonstrate this, I've added a small test to my rules_rust fork.

In this test, one data file is a source file and one data file is generated by a genrule. The contents of both files are embedded in the rust_test via the include_str!() macro.

On my linux system, this rust_test fails to compile with the following rustc error:

error: couldn't read test/generated_data/include_str/generated_data.txt: No such file or directory (os error 2)
 --> test/generated_data/include_str/include_str.rs:9:13
  |
9 |             include_str!("generated_data.txt").trim()
  |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: aborting due to previous error

Notice that rustc has no problem finding source_data.txt.

I believe that the problem can be seen by inspecting this rustc action's sandbox. On my Linux system, if I run the test with bazel test --sandbox_debug, and then change directory to <bazel_workspace_output_base>/sandbox/linux-sandbox/<action_id>/execroot/io_bazel_rules_rust, the root of the problem becomes clear: the library and the source data are sibling files

./test/generated_data/include_str/include_str.rs
./test/generated_data/include_str/source_data.txt

but the generated file is not

./bazel-out/k8-fastbuild/bin/test/generated_data/include_str/generated_data.txt

So, the generated data file is an input to the rustc Bazel action, but it isn't anywhere that the rustc executable can find it. Thus, a compiler error occurs.
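The mismatch can be reproduced without Bazel. Below is a minimal sketch, assuming include_str!() resolves its argument against the directory of the file that invokes it, as the Rust documentation describes. `resolve_include` is a hypothetical helper written for illustration; it is not part of rustc or rules_rust. The paths are the ones from the sandbox listing above.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: mimics how `include_str!` locates its argument,
/// i.e. relative to the directory of the file that invokes the macro.
fn resolve_include(source_file: &Path, arg: &str) -> PathBuf {
    source_file
        .parent()
        .unwrap_or_else(|| Path::new(""))
        .join(arg)
}

fn main() {
    let src = Path::new("test/generated_data/include_str/include_str.rs");

    // Where rustc looks for the generated file.
    let looked_up = resolve_include(src, "generated_data.txt");

    // Where Bazel actually placed it (taken from the sandbox listing).
    let actual = Path::new(
        "bazel-out/k8-fastbuild/bin/test/generated_data/include_str/generated_data.txt",
    );

    println!("rustc looks in: {}", looked_up.display());
    println!("bazel wrote to: {}", actual.display());

    // The two locations differ, which is why the macro fails.
    assert_ne!(looked_up.as_path(), actual);
}
```

The source file and source_data.txt share a directory, so that lookup succeeds; the generated file lives under a different root, so it does not.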

Temporary Workaround

In my project where I actually want this feature I have an awful-dreadful-ugly hack to overcome this problem: I add a symlink from my source tree pointing to where I expect the generated file should be under the Bazel workspace's output base; then I add both the generated file and its symlink to the data attribute.

A Very Rough Idea for a Solution

I have a little bit of experience writing Bazel rules that also needed to deal with this problem. My solution there was for each target to add various ctx.actions.symlink() actions to place both generated files and source files into a new directory under their logical paths before calling the tool.
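To make the "logical paths" idea concrete, here is a rough sketch of the path mapping only, written in Rust rather than Starlark. It assumes generated files sit under `bazel-out/<config>/bin/` as in the sandbox listing above; `logical_path` is a hypothetical name, and an actual rule would still need to create the symlinks with ctx.actions.symlink().

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: maps an exec-root path to its "logical" package path
/// by dropping a leading `bazel-out/<config>/bin/` prefix, if present.
/// Assumes the output layout seen in the sandbox listing.
fn logical_path(exec_path: &Path) -> PathBuf {
    match exec_path.strip_prefix("bazel-out") {
        // Skip `<config>` and `bin`; the rest is the package-relative path.
        Ok(rest) => rest.components().skip(2).collect(),
        // Source files are already at their logical path.
        Err(_) => exec_path.to_path_buf(),
    }
}

fn main() {
    let inputs = [
        "test/generated_data/include_str/include_str.rs",
        "test/generated_data/include_str/source_data.txt",
        "bazel-out/k8-fastbuild/bin/test/generated_data/include_str/generated_data.txt",
    ];
    // Print the staging plan: each input and where it would be linked.
    for input in inputs {
        let staged = logical_path(Path::new(input));
        println!("{} -> <staging_dir>/{}", input, staged.display());
    }
}
```

With this mapping all three files end up in the same staged directory, so a relative include_str!() lookup would find the generated file next to the source file.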

Collaborator

mfarrugi commented Oct 21, 2020 via email

Author

dwtj commented Oct 21, 2020

Looks like the same issue as #79 and #222.

Oh shucks. I think you're right. I completely missed these. Sorry for the possible duplicate. I'll read #79 in a bit.

Does this work if you add the generated data file to sources?

Nope. I just tried using a genrule to generate a .rs file

genrule(
    name = "generated_rust",
    outs = ["generated_rust.rs"],
    cmd_bash = 'echo "pub fn main() {}" > "$(@)"',
)

and then including generated_rust.rs in the srcs attr instead. But this gives the same compiler error,

error: couldn't read test/generated_data/include_str/generated_rust.rs: No such file or directory (os error 2)
 --> test/generated_data/include_str/include_str.rs:9:13
  |
9 |             include_str!("generated_rust.rs").trim()
  |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

error: aborting due to previous error

Again, looking at the sandbox contents, this generated file is over in

bazel-out/k8-fastbuild/bin/test/generated_data/include_str/generated_rust.rs

To quote the Rust documentation for include_str!(),

The file is located relative to the current file (similarly to how modules are found).

Contributor

dae commented Oct 21, 2020

I'm thinking that make variable expansion in rustc_env might be a cleaner way to solve this, and will try to get a PoC PR up soon. In the meantime, you can use make variable expansion to get one level up from a build script's OUT_DIR, which your script can write files to, e.g.:

genrule(
    name = "protobuf_gen",
    srcs = [
        "//proto:backend.proto",
    ],
    outs = [
        "backend_proto.rs",
    ],
    cmd = """\
PROTOC=$(location @com_google_protobuf//:protoc) \
OUT_DIR_PARENT=$(RULEDIR) \
SRCFILE=$(location //proto:backend.proto) \
$(location :generator_bin)""",
    tools = [
        ":generator_bin",
        "@com_google_protobuf//:protoc",
    ],
)

Then in your code:

include!(concat!(env!("OUT_DIR"), "/../backend_proto.rs"));

This requires the crate to have a build script in order for OUT_DIR to be set.

dae added a commit to ankitects/rules_rust that referenced this issue Oct 22, 2020
This makes it possible to pass in the path to generated files and
external tools.

This potentially closes bazelbuild#459, closes bazelbuild#454, and closes bazelbuild#79.

The docs seem to indicate there's precedent for this in rules_cc:
https://docs.bazel.build/versions/master/be/make-variables.html#predefined_label_variables
Author

dwtj commented Oct 22, 2020

@dae Thanks for your workaround approach. This is sufficient for my needs at the moment.

I've added a full example of this workaround approach to a branch of my fork and reproduced it here:

load("//rust:rust.bzl", "rust_test")
load("//cargo:cargo_build_script.bzl", "cargo_build_script")

genrule(
    name = "generated_data",
    outs = ["generated_data.txt"],
    cmd_bash = 'echo "world" > "$(@)"',
)

cargo_build_script(
    name = "do_nothing_cargo_build_script",
    srcs = ["do_nothing_cargo_build_script.rs"],
)

rust_test(
    name = "include_str",
    srcs = [
        "include_str.rs",
    ],
    data = [
        "source_data.txt",
        "generated_data.txt",
    ],
    deps = [
        ":do_nothing_cargo_build_script"
    ],
)

#[cfg(test)]
mod test {

    #[test]
    pub fn include_str_test() {
        let hello_world = format!(
            "{}, {}!",
            include_str!("source_data.txt").trim(),
            include_str!(concat!(env!("OUT_DIR"), "/../generated_data.txt")).trim()
        );
        println!("{}", hello_world);
        assert_eq!("Hello, world!", hello_world);
    }
}

The key to this hack is that this crate's OUT_DIR just happens to be a sibling of the generated file, i.e. both sit in the same output directory.
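A small sketch of why the `/../` trick lands on the generated file, assuming OUT_DIR is a real directory (not a symlink) located in the same output directory as generated_data.txt. The OUT_DIR value used here is hypothetical; the actual directory name chosen by cargo_build_script may differ. `sibling_of_out_dir` is an illustrative helper, not an API of rules_rust.

```rust
use std::path::{Path, PathBuf};

/// Illustrative helper: resolves `<out_dir>/../<file>` lexically, which is
/// what `concat!(env!("OUT_DIR"), "/../generated_data.txt")` relies on.
fn sibling_of_out_dir(out_dir: &Path, file: &str) -> Option<PathBuf> {
    out_dir.parent().map(|parent| parent.join(file))
}

fn main() {
    // Hypothetical OUT_DIR: only its parent directory matters here.
    let out_dir = Path::new(
        "bazel-out/k8-fastbuild/bin/test/generated_data/include_str/build_script.out_dir",
    );
    // Location of the genrule output, from the sandbox listing.
    let generated = Path::new(
        "bazel-out/k8-fastbuild/bin/test/generated_data/include_str/generated_data.txt",
    );

    let resolved = sibling_of_out_dir(out_dir, "generated_data.txt")
        .expect("OUT_DIR should have a parent directory");
    println!("resolved: {}", resolved.display());

    // Going one level up from OUT_DIR reaches the generated file.
    assert_eq!(resolved.as_path(), generated);
}
```

The trick depends on the build script and the genrule living in the same Bazel package; if they were in different packages, the relative hop from OUT_DIR would need to change accordingly.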


@dae One nitpick about your suggestion. You said

you can use make variable expansion to get one level up from a build script's OUT_DIR, which your script can write files to

However, I don't really see how either make variable substitution or using the cargo build script to write the files is really necessary here. My example, for instance, doesn't use either.

Contributor

jfirebaugh commented May 20, 2021

Is the workaround from @dwtj still the best way to include Bazel-generated data files, or was #468 intended to provide a better alternative? If so, is there an example of the preferred technique somewhere?
