Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FR: prune target list based on dirty files #8521

Closed
EricBurnett opened this issue May 30, 2019 · 6 comments
Closed

FR: prune target list based on dirty files #8521

EricBurnett opened this issue May 30, 2019 · 6 comments
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Bazel General Bazel product/strategy issues type: feature request

Comments

@EricBurnett
Copy link

Description of the problem / feature request:

Bazel should be able to take a list of "dirty" files as input, and prune the target list based on it.

Feature requests: what underlying problem are you trying to solve with this feature?

In CI contexts bazel often ends up doing a lot of unnecessary/redundant work. For example, in CI a common pattern is to run

bazel test //...

on a clean worker. This requires computing the full build graph, downloading all remote repositories, getting cache hits (and currently downloading all outputs) for every action in the graph, etc. As a result, it can be unnecessarily slow for mostly-cached builds, since most of this work is unrelated to the handful of files that have actually been changed or the targets that transitively depend on them. Especially in monorepo contexts, where a PR may only affect one corner of the codebase.

What a lot of users end up doing is hacking together something of their own using bazel query - something logically equivalent to

bazel test `bazel query rdeps(//..., attr('srcs', <dirty files>))`

...except hopefully with proper handling of non-src dependencies (e.g. 'data', etc), BUILD/bzl files, WORKSPACE files, removed files, etc.

Unsurprisingly, this is hard to get right, and awkward to have to insert into a build flow. So it's a power feature only for users sufficiently motivated to improve build performance. But bazel knows how to do this internally - every incremental build is doing effectively this computation based on the files in the workspace that were touched or added or removed.

The request, then, is to build a proper feature around this - support something like:

git diff --name-only base...pr > dirty_files.txt
bazel test //... --prune_file=dirty_files.txt

so that CI systems can easily and scalably do pruned test runs, and without jeopardizing correctness by implementing this themselves.

(Recommended priority: low. It can be worked around via query; I only have a few examples of users doing this themselves so far. Filing this as a place to collect +1's, but suggest waiting for evidence of interested users before taking it on.)

@aiuto aiuto added P3 We're not considering working on this, but happy to review a PR. (No assignee) type: feature request labels May 31, 2019
@Globegitter
Copy link

Globegitter commented May 31, 2019

We quite quickly had to switch away from the pure remote cache approach to something what you are suggesting here and our monorepo is far from the biggest one. And we have also run into the challenges that you mention, like handling changes of external repositories and BUILD files correctly. I think it would help a lot of people. Some related discussions: #7962 as well as https://groups.google.com/forum/m/#!msg/bazel-discuss/I9udqWIcEdI/iczVgWLOBQAJ

Edit: Also for reference, we based our ci script on https://github.com/bazelbuild/bazel/blob/master/scripts/ci/ci.sh but it has been heavily modified since.

@Globegitter
Copy link

Oh yeah in our ci we use the query result twice, once to know what to build/test and once to know what to deploy/run. So we take the result from the initial query and then filter for rules with the deployable tag. So it would be nice if the --prune-file argument can be used for query or if the list of targets can somehow else be retrieved after testing. Maybe with aquery and the --skyframe_state flag?

@jin jin added the team-Bazel General Bazel product/strategy issues label Jul 22, 2019
@sgowroji sgowroji added the stale Issues or PRs that are stale (no activity for 30 days) label Feb 15, 2023
@sgowroji
Copy link
Member

Hi there! We're doing a clean up of old issues and will be closing this one. Please reopen if you’d like to discuss anything further. We’ll respond as soon as we have the bandwidth/resources to do so.

@bcsgh
Copy link

bcsgh commented Feb 15, 2023

Please either re-open this or include a link to the feature added since that implements this functionality.

That is: what flag do I use to override what files bazel sees as being changed?

@sgowroji sgowroji added not stale Issues or PRs that are inactive but not considered stale and removed stale Issues or PRs that are stale (no activity for 30 days) labels Feb 16, 2023
@sgowroji sgowroji reopened this Feb 16, 2023
@meisterT
Copy link
Member

I am closing this as not planned - I think https://github.com/Tinder/bazel-diff is a viable alternative that doesn't require changing Bazel.

@meisterT meisterT closed this as not planned Won't fix, can't repro, duplicate, stale Apr 19, 2023
@bcsgh
Copy link

bcsgh commented Apr 20, 2023

Bazel-diff only seems to be a solution if:

  • The code in question uses Git.
  • The points in time between which changes are of interest are (or can simply be made into) commits.
  • Building a list of targets and allowing Bazel to deduce the set of actions gets the desired effect.
  • A dependency on Kotlin (or a pre-built binary) is allowable

Granted at least three of these are likely the most common cases, but none of them are necessarily universal.


I'd kinda rather see "not planed" FR's be tagged as such, demoted to last priority and left open. IMHO Closed should be for done and won't do at all and not things where a sufficiently compelling uses case could revive them.

@sgowroji sgowroji removed the not stale Issues or PRs that are inactive but not considered stale label Jul 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Bazel General Bazel product/strategy issues type: feature request
Projects
None yet
Development

No branches or pull requests

7 participants