Skip to content

Check if rerun-if-changed points to a directory. #8973

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 14, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion src/cargo/core/compiler/fingerprint.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1700,7 +1700,7 @@ where
let path_mtime = match mtime_cache.entry(path.to_path_buf()) {
Entry::Occupied(o) => *o.get(),
Entry::Vacant(v) => {
let mtime = match paths::mtime(path) {
let mtime = match paths::mtime_recursive(path) {
Ok(mtime) => mtime,
Err(..) => return Some(StaleItem::MissingFile(path.to_path_buf())),
};
Expand Down
81 changes: 81 additions & 0 deletions src/cargo/util/paths.rs
Original file line number Diff line number Diff line change
Expand Up @@ -180,6 +180,87 @@ pub fn mtime(path: &Path) -> CargoResult<FileTime> {
Ok(FileTime::from_last_modification_time(&meta))
}

/// Returns the maximum mtime of the given path, recursing into
/// subdirectories, and following symlinks.
pub fn mtime_recursive(path: &Path) -> CargoResult<FileTime> {
let meta = fs::metadata(path).chain_err(|| format!("failed to stat `{}`", path.display()))?;
if !meta.is_dir() {
return Ok(FileTime::from_last_modification_time(&meta));
}
let max_meta = walkdir::WalkDir::new(path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually a little surprised we already had this dependency...

.follow_links(true)
.into_iter()
.filter_map(|e| match e {
Ok(e) => Some(e),
Err(e) => {
// Ignore errors while walking. If Cargo can't access it, the
// build script probably can't access it, either.
log::debug!("failed to determine mtime while walking directory: {}", e);
None
}
})
.filter_map(|e| {
if e.path_is_symlink() {
// Use the mtime of both the symlink and its target, to
// handle the case where the symlink is modified to a
// different target.
let sym_meta = match std::fs::symlink_metadata(e.path()) {
Ok(m) => m,
Err(err) => {
// I'm not sure when this is really possible (maybe a
// race with unlinking?). Regardless, if Cargo can't
// read it, the build script probably can't either.
log::debug!(
"failed to determine mtime while fetching symlink metdata of {}: {}",
e.path().display(),
err
);
return None;
}
};
let sym_mtime = FileTime::from_last_modification_time(&sym_meta);
// Walkdir follows symlinks.
match e.metadata() {
Ok(target_meta) => {
let target_mtime = FileTime::from_last_modification_time(&target_meta);
Some(sym_mtime.max(target_mtime))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit odd to me, if walkdir follows symlinks then how come we need to manually check the mtime of the symlink target? If the symlink points to a directory doesn't that mean we're doing the same thing as before where we're looking at the mtime of a directory which isn't too useful?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have:

somedir/
    foo.txt
    bar.txt
    thing -> foo.txt

If you run a build, it will use the max mtime of those 3 files. If you then modify thing to point to bar.txt, that will update the mtime of thing, and will trigger a rebuild. If this didn't look at both the symlink and the target, it would only read the mtime of foo.txt and bar.txt, and Cargo would fail to trigger a rebuild.

If the thing symlink pointed to a directory, yes it will read the mtime of the directory. But then, walkdir will traverse into that directory and scan all the contents. Also, reading the mtime of the directory itself is important for deleting files. Otherwise, Cargo wouldn't know that a file was deleted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. I didn't actually realized until just before this that the mtime of directories was important. I was mostly wondering here why it dealt with both the target and the source since for file-based symlinks I think the symlink mtime itself should be appropriate, but even then I can see how it would matter if the file is located outside of the directory.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, reading the mtime of the directory itself is important for deleting files. Otherwise, Cargo wouldn't know that a file was deleted.

This seems like an error-prone way to find out a file was deleted. I think instead Cargo should cache the list of files in the directory (recursively) and consider the directory changed if any file that is in that cached list doesn't exist (or a new file exists that isn't on the list), in addition to the file timestamping.

}
Err(err) => {
// Can't access the symlink target. If Cargo can't
// access it, the build script probably can't access
// it either.
log::debug!(
"failed to determine mtime of symlink target for {}: {}",
e.path().display(),
err
);
Some(sym_mtime)
}
}
} else {
let meta = match e.metadata() {
Ok(m) => m,
Err(err) => {
// I'm not sure when this is really possible (maybe a
// race with unlinking?). Regardless, if Cargo can't
// read it, the build script probably can't either.
log::debug!(
"failed to determine mtime while fetching metadata of {}: {}",
e.path().display(),
err
);
return None;
}
};
Some(FileTime::from_last_modification_time(&meta))
}
})
.max()
// or_else handles the case where there are no files in the directory.
.unwrap_or_else(|| FileTime::from_last_modification_time(&meta));
Ok(max_meta)
}

/// Record the current time on the filesystem (using the filesystem's clock)
/// using a file at the given directory. Returns the current time.
pub fn set_invocation_time(path: &Path) -> CargoResult<FileTime> {
Expand Down
17 changes: 7 additions & 10 deletions src/doc/src/reference/build-scripts.md
Original file line number Diff line number Diff line change
Expand Up @@ -258,19 +258,16 @@ filesystem last-modified "mtime" timestamp to determine if the file has
changed. It compares against an internal cached timestamp of when the build
script last ran.

If the path points to a directory, it does *not* automatically traverse the
directory for changes. Only the mtime change of the directory itself is
considered (which corresponds to some types of changes within the directory,
depending on platform). To request a re-run on any changes within an entire
directory, print a line for the directory and separate lines for everything
inside it, recursively.
If the path points to a directory, it will scan the entire directory for
any modifications.

If the build script inherently does not need to re-run under any circumstance,
then emitting `cargo:rerun-if-changed=build.rs` is a simple way to prevent it
from being re-run. Cargo automatically handles whether or not the script
itself needs to be recompiled, and of course the script will be re-run after
it has been recompiled. Otherwise, specifying `build.rs` is redundant and
unnecessary.
from being re-run (otherwise, the default if no `rerun-if` instructions are
emitted is to scan the entire package directory for changes). Cargo
automatically handles whether or not the script itself needs to be recompiled,
and of course the script will be re-run after it has been recompiled.
Otherwise, specifying `build.rs` is redundant and unnecessary.

<a id="rerun-if-env-changed"></a>
#### `cargo:rerun-if-env-changed=NAME`
Expand Down
91 changes: 89 additions & 2 deletions tests/testsuite/build_script.rs
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,8 @@ use std::thread;
use cargo::util::paths::remove_dir_all;
use cargo_test_support::paths::CargoPathExt;
use cargo_test_support::registry::Package;
use cargo_test_support::{basic_manifest, cross_compile, project};
use cargo_test_support::{rustc_host, sleep_ms, slow_cpu_multiplier};
use cargo_test_support::{basic_manifest, cross_compile, is_coarse_mtime, project};
use cargo_test_support::{rustc_host, sleep_ms, slow_cpu_multiplier, symlink_supported};

#[cargo_test]
fn custom_build_script_failed() {
Expand Down Expand Up @@ -4071,3 +4071,90 @@ fn dev_dep_with_links() {
.build();
p.cargo("check --tests").run()
}

#[cargo_test]
fn rerun_if_directory() {
if !symlink_supported() {
return;
}

// rerun-if-changed of a directory should rerun if any file in the directory changes.
let p = project()
.file("Cargo.toml", &basic_manifest("foo", "0.1.0"))
.file("src/lib.rs", "")
.file(
"build.rs",
r#"
fn main() {
println!("cargo:rerun-if-changed=somedir");
}
"#,
)
.build();

let dirty = || {
p.cargo("check")
.with_stderr(
"[COMPILING] foo [..]\n\
[FINISHED] [..]",
)
.run();
};

let fresh = || {
p.cargo("check").with_stderr("[FINISHED] [..]").run();
};

// Start with a missing directory.
dirty();
// Because the directory doesn't exist, it will trigger a rebuild every time.
// https://github.com/rust-lang/cargo/issues/6003
dirty();

if is_coarse_mtime() {
sleep_ms(1000);
}

// Empty directory.
fs::create_dir(p.root().join("somedir")).unwrap();
dirty();
fresh();

if is_coarse_mtime() {
sleep_ms(1000);
}

// Add a file.
p.change_file("somedir/foo", "");
p.change_file("somedir/bar", "");
dirty();
fresh();

if is_coarse_mtime() {
sleep_ms(1000);
}

// Add a symlink.
p.symlink("foo", "somedir/link");
dirty();
fresh();

if is_coarse_mtime() {
sleep_ms(1000);
}

// Move the symlink.
fs::remove_file(p.root().join("somedir/link")).unwrap();
p.symlink("bar", "somedir/link");
dirty();
fresh();

if is_coarse_mtime() {
sleep_ms(1000);
}

// Remove a file.
fs::remove_file(p.root().join("somedir/foo")).unwrap();
dirty();
fresh();
}