Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cached rules with directory outputs fail to build when build without the bytes is enabled #18579

Closed
gabrielrussoc opened this issue Jun 5, 2023 · 5 comments
Labels
team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug

Comments

@gabrielrussoc
Copy link

gabrielrussoc commented Jun 5, 2023

Description of the bug:

Rules outputting directories (which is not sane, but unfortunately happens) that hit the remote cache fail to build when build with the bytes is enabled.

I hit this when trying remote execution and it surprised me that tagging the genrule with no-remote-exec immediately fixes the issue, only to break a few builds later when the local outputs are gone and it starts to hit the remote cache. Note that this issue happens without remote execution at all.

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Create a single BUILD file with:

genrule(
    name = "repro",
    outs = ["dir"],
    cmd = "mkdir $@; echo something > $@/a.txt",
)

build it:

$ bazel build --disk_cache=/tmp/bazel_disk_cache --remote_download_minimal //:repro

clear the local outputs:

$ bazel clean

build again (should hit the remote cache):

$ bazel build --disk_cache=/tmp/bazel_disk_cache --remote_download_minimal //:repro
...
ERROR: /Users/gabriel.russo/workspace/bazel_repro/BUILD:1:8: declared output 'dir' was not created by genrule. This is probably because the genrule actually didn't create this output, or because the output was a directory and the genrule was run remotely (note that only the contents of declared file outputs are copied from genrules run remotely)

Which operating system are you running Bazel on?

Linux and macOS

What is the output of bazel info release?

release 6.2.1

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@tjgq
Copy link
Contributor

tjgq commented Jun 5, 2023

Can you use a custom Starlark rule with declare_directory instead? genrule was never intended to produce anything other than regular files; it appears to work but it's subtly broken. It's unlikely that this will get fixed; if anything, I'd rather make it an error in the "with the bytes" case as well.

@Pavank1992 Pavank1992 added the team-Remote-Exec Issues and PRs for the Execution (Remote) team label Jun 6, 2023
@gabrielrussoc
Copy link
Author

gabrielrussoc commented Jun 6, 2023

@tjgq I actually hit the problem with a custom rule that declares a attr.output and writes to it (the genrule was just to simplify the repro):

rule.bzl:

def _impl(ctx):
    ctx.actions.run_shell(
        outputs = [ctx.outputs.out],
        arguments = [ctx.outputs.out.path],
        command = """
            mkdir $1;
            echo "something" > $1/a.txt
        """
    )

custom_rule = rule(
    implementation = _impl,
    attrs = {
        "out": attr.output(),
    }
)

BUILD:

load(":rule.bzl", "custom_rule")

custom_rule(
    name = "custom",
    out = "directory",
)

Same thing happens:

$ bazel build --disk_cache=/tmp/bazel_disk_cache --remote_download_toplevel //:custom
$ bazel clean
$ bazel build --disk_cache=/tmp/bazel_disk_cache --remote_download_toplevel //:custom
...
ERROR: /Users/gabriel.russo/workspace/bazel_repro/BUILD:9:12: output 'directory' was not created
ERROR: /Users/gabriel.russo/workspace/bazel_repro/BUILD:9:12: Action directory failed: not all outputs were created or valid
Target //:custom failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.101s, Critical Path: 0.00s
INFO: 2 processes: 1 disk cache hit, 1 internal.

I'm looking into using declare_directory instead

I'd rather make it an error in the "with the bytes" case as well.

I see. If we can't fix it, surely it should fail on both cases to avoid the current confusing behaviour. In fact this broke the build for everyone since the first commit worked fine but after it merged it started failing for everyone

@tjgq
Copy link
Contributor

tjgq commented Jun 6, 2023

attr.output and attr.output_list have the same fundamental issue as genrule.outs - they're always considered files, not directories (see also the related #14647).

If we can't fix it, surely it should fail on both cases to avoid the current confusing behaviour.

I agree. I'll look into making this an error (behind an --incompatible flag).

copybara-service bot pushed a commit that referenced this issue Jun 7, 2023
…ule actions.

The motivation behind this change is to eventually promote the warning to an error in the output case; this avoids surprising behavior such as described in #18579. I'm less confident that the input case (source directories) can be similarly promoted to an error, but I'm still extending the warning to all actions for symmetry.

PiperOrigin-RevId: 538470433
Change-Id: Ic6c8a4cb353dda5f6397d286fe2e6faee28f896c
amishra-u pushed a commit to amishra-u/bazel that referenced this issue Jun 7, 2023
…ule actions.

The motivation behind this change is to eventually promote the warning to an error in the output case; this avoids surprising behavior such as described in bazelbuild#18579. I'm less confident that the input case (source directories) can be similarly promoted to an error, but I'm still extending the warning to all actions for symmetry.

PiperOrigin-RevId: 538470433
Change-Id: Ic6c8a4cb353dda5f6397d286fe2e6faee28f896c
@tjgq
Copy link
Contributor

tjgq commented Jun 12, 2023

To close the loop here: I've added an --incompatible_disallow_unsound_directory_outputs flag to forbid an action from creating a directory where a file is expected. I will try to have it flipped in Bazel 7. This is tracked at #18646.

@tjgq
Copy link
Contributor

tjgq commented Aug 25, 2023

I'm going to close this since the actual fix here would be something like #19030.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug
Projects
None yet
Development

No branches or pull requests

4 participants