Skip to content

tidy: add support for --extra-checks=auto: feature #143398

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

lolbinarycat
Copy link
Contributor

@lolbinarycat lolbinarycat commented Jul 3, 2025

in preparation for #142924

also heavily refactored the parsing of the --extra-checks argument to warn about improper usage.

cc @GuillaumeGomez

r? @Kobzol

now does proper parsing of git's output and falls back to
assuming all files are modified if `git` doesn't work.

accepts a closure so extensions can be checked.
currently this just uses a very simple
extension-based heirustic.
@rustbot rustbot added A-tidy Area: The tidy tool S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) labels Jul 3, 2025
@rustbot
Copy link
Collaborator

rustbot commented Jul 3, 2025

This PR modifies src/bootstrap/src/core/config.

If appropriate, please update CONFIG_CHANGE_HISTORY in src/bootstrap/src/utils/change_tracker.rs.

There are changes to the tidy tool.

cc @jieyouxu

@Kobzol
Copy link
Member

Kobzol commented Jul 4, 2025

Thanks for working on this! To reiterate (writing this to remind myself), the use-case for this is that you want to be running some extra checks locally (which you can now configure through bootstrap.toml), but you don't want to run for them to execute if no relevant files were modified.

I'm not fully sold on the auto: prefix idea. The syntax for extra checks starts to be quite complicated, tbh.. 😆 I don't know if anyone ever has a use-case for "only check formatting of Python files when they change, but always lint Python files", which can be now said with auto:py:fmt, py:lint.

What do you think about simplifying the automatic detection syntax specification by just adding an extra option called e.g. "auto" or "auto-detect"? So that you would say "--extra-checks=auto-detect,py:lint,cpp:fmt, and auto-detect` would just enable the automatic detection for all extra checks (which is likely what people want for local usage anyway).

Btw, we already have a precedent in tidy for only formatting files that were modified locally (and this cannot really be configured). In theory, if we were able to make the fmt/lint Python/C++/Shell heuristic that detects which relevant files were changed 100% bulletproof, we could just always apply the extra checks only if any of these files were modified, without adding "auto".

That being said, I'm not sure that we can make the heuristic so solid... The problem that could occur is that if some relevant changed files are not detected locally, then tidy can be green locally, but red on CI (because CI checks everything, as we can't afford to just detect changed files there). That's an annoying situation. With your PR, people would at least have to opt into this behavior, it wouldn't be the default, unlike Rust file formatting.

}
if crate::files_modified(base_commit, |p| p == RUSTDOC_JSON_TYPES) {
// `rustdoc-json-types` was not modified so nothing more to check here.
println!("`rustdoc-json-types` was not modified.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't there be return; here? Also, shouldn't the condition be opposite, i.e. if !crate::files_modified?

eprintln!("error: failed to run `git diff` in rustdoc_json check");
return;
}
if crate::files_modified(base_commit, |p| p == RUSTDOC_JSON_TYPES) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the == be something like starts_with or contains?

.expect("bad format from `git diff --name-status`");
if status == "M" { Some(name) } else { None }
});
for modified_file in modified_files {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified_files.any(pred)

/// Returns true if any modified file matches the predicate, if we are in CI, or if unable to list modified files.
pub fn files_modified(ci_info: &CiInfo, pred: impl Fn(&str) -> bool) -> bool {
let Some(base_commit) = &ci_info.base_commit else {
eprintln!("No base commit, assuming all files are modified");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we panic here if we're on CI? This wuold be a serious issue if the commit was missing for some reason, we shouldn't just skip it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, let's just return true from files_modified if we're on CI. We should always check everything on CI.

@lolbinarycat
Copy link
Contributor Author

What do you think about simplifying the automatic detection syntax specification by just adding an extra option called e.g. "auto" or "auto-detect"?

that was the original idea, but we can't do that because shellcheck is not run in CI, and I don't wanna have auto just exclude shellcheck because reasons. Also if someone has only some of the linters it's nice to give them the option i suppose.

The parsing refactor is complicated, but it was particularly needed to provide actual input validation, and the auto: thing was the simplest thing that dealt with the shellcheck inconsistency without feeling like a weird hack that is completely unintuitive.

@Kobzol
Copy link
Member

Kobzol commented Jul 4, 2025

that was the original idea, but we can't do that because shellcheck is not run in CI, and I don't wanna have auto just exclude shellcheck because reasons. Also if someone has only some of the linters it's nice to give them the option i suppose.

I'm not sure if I understand, this should have nothing to do with CI or shellcheck. auto would never get used on CI, only locally, and if enabled, it would essentially behave exactly as if you automatically applied the auto: prefix to all passed extra checks. Nothing less, nothing more.

I'm just proposing to turn --extra-checks=auto:py:fmt,auto:py:lint,cpp into --extra-checks=auto,py:fmt,py:lint,cpp.

@lolbinarycat
Copy link
Contributor Author

I'm not sure if I understand, this should have nothing to do with CI or shellcheck.

the reason why it matters if someone sets a blanket auto and then modifies a single shell script, tidy will suddenly start spewing hundreds of shellcheck warnings. this seems undesirable.

saying "auto just ignores shell" feels like a footgun and is also very inelegant.

@Kobzol
Copy link
Member

Kobzol commented Jul 4, 2025

Auto would only modify the detection logic. If you don't specify shell in the extra checks, shellcheck won't be run. Saying just --extra-checks:auto wouldn't do anything.

let Some(mut first) = parts.next() else {
return Err(ExtraCheckParseError::Empty);
};
if first == "auto" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we have an all keyword too? Could be convenient for CI to ensure we didn't miss any.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While all would be a good idea, we don't currently run shellcheck on CI, so it wouldn't be used there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A reason why we don't? Shells are currently a horror show?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup 😆

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to ignore my OCD and to NOT take a look.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the reason is because previous attempts at adding shellcheck introduced a bunch of subtle bugs, there's a t-infra zulip thread where i asked about this.

@lolbinarycat
Copy link
Contributor Author

Auto would only modify the detection logic. If you don't specify shell in the extra checks, shellcheck won't be run. Saying just --extra-checks:auto wouldn't do anything.

I don't think that's actually simpler to implement tbh, and I think the argument against it exists in your own post: it adds new trivial cases that could be mistaken as meaning something else. it is also seems less intuitive and harder to explain in documentation, how it looks like other items but behaves differently, how it means nothing on it's own, etc...

it could maybe be simpler to implement if i completely rewrote my implementation from scratch, but i think in doing so i would need to give up some error reporting, so it would be a significant amount of work to redo things to get something slightly less robust.

@Kobzol
Copy link
Member

Kobzol commented Jul 5, 2025

I didn't necessarily mean simpler to implement, but having a simpler mental model for the functionality. Now it's in a sense too powerful, as I said before, likely no one will ever need auto:py:lint,py:fmt. Of course using --extra-checks=auto is indeed a hack, it should be a separate flag like --only-check-changed or something.

But I don't want to block this further. Please wait until #143452 is merged (perhaps in #143473), then rebase and r=me.

@lolbinarycat
Copy link
Contributor Author

Unfortunatly #134006 does something kinda odd with the extra check args that is incompatible with that model, so I think I need to rewrite that to use --bless instead?

I don't think there's any way to avoid having that be a breaking change, though.

@lolbinarycat
Copy link
Contributor Author

I think I'm gonna submit a seperate PR changing how that works, as that should really be done first to avoid messy issues.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jul 6, 2025
…, r=Kobzol

tidy: use --bless for tidy spellcheck instead of spellcheck:fix

previous behavior was inconsistent with existing extra checks.

unsure if this needs a change tracker entry or a warning for people who try to use the old behavior.

unsure if we should call this `spellcheck:lint` for consistency.

making this consistent is a prerequisite for rust-lang#143398

cc `@nnethercote`

r? `@Kobzol`
rust-timer added a commit that referenced this pull request Jul 6, 2025
Rollup merge of #143493 - lolbinarycat:tidy-spellcheck-bless, r=Kobzol

tidy: use --bless for tidy spellcheck instead of spellcheck:fix

previous behavior was inconsistent with existing extra checks.

unsure if this needs a change tracker entry or a warning for people who try to use the old behavior.

unsure if we should call this `spellcheck:lint` for consistency.

making this consistent is a prerequisite for #143398

cc `@nnethercote`

r? `@Kobzol`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tidy Area: The tidy tool S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants