Track -Cprofile-use and -Cprofile-sample-use value by file hash, not file path #100413

Open
wants to merge 1 commit into
base: master
from

Conversation

Kobzol
Contributor

@Kobzol Kobzol commented Aug 11, 2022

Previously, the PGO profile passed via -Cprofile-use and -Cprofile-sample-use was tracked by its file path only. This meant that if the code was compiled twice in a row with the same path, the crate would not be recompiled, even if the profile contents had changed in the meantime.

I'm not too excited about the API of the md-5 crate, but it was already used for hashing source files, so I decided to keep the same dependency. I'm not really sure what the for_crate_hash argument is for; should I take it into account here?

Fixes: #100397

r? @michaelwoerister

@Kobzol Kobzol force-pushed the profile-use-track-file-hash branch from f8e7566 to 0ed3178 Compare August 11, 2022 15:25
@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Aug 11, 2022
@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 11, 2022
@bors
Contributor

bors commented Aug 15, 2022

☔ The latest upstream changes (presumably #100595) made this pull request unmergeable. Please resolve the merge conflicts.

@Kobzol Kobzol force-pushed the profile-use-track-file-hash branch from 0ed3178 to db05f78 Compare August 16, 2022 07:52

@Kobzol Kobzol force-pushed the profile-use-track-file-hash branch from db05f78 to df62250 Compare August 16, 2022 08:00
Member

@michaelwoerister michaelwoerister left a comment


Whoops, I just discovered that my comments from a few days ago are still "pending". So, here goes:

let mut hasher = Md5::default();

let mut file = File::open(path)?;
let mut buffer = [0; 4096];
Member

I don't know if it actually makes a difference but sccache seems to use a 128KB buffer for "best performance": https://cs.github.com/mozilla/sccache/blob/2af14599a6c8c591ff5c40bf96e62c47efebec63/src/util.rs?q=128#L56-L57

Contributor Author

A larger block size probably means worse L1 cache reuse, but 128 KiB should be fine. I'll change it to this size.
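For readers following the buffer-size discussion, here is a minimal sketch of hashing a file in 128 KiB blocks. It is not the code from this PR: `DefaultHasher` from the standard library is only a stand-in for the MD5 hasher in the diff, and the `u64` return type is an assumption made to keep the example dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs::File;
use std::hash::Hasher;
use std::io::{self, Read};
use std::path::Path;

/// Block size suggested in the review (128 KiB).
const BLOCK_SIZE: usize = 128 * 1024;

/// Hashes the contents of `path` block by block.
/// `DefaultHasher` is a stand-in hasher, not the one used in the PR.
fn hash_file(path: &Path) -> io::Result<u64> {
    let mut hasher = DefaultHasher::new();
    let mut file = File::open(path)?;
    // Allocate the buffer on the heap; 128 KiB is large for a stack frame.
    let mut buffer = vec![0u8; BLOCK_SIZE];
    loop {
        let read = file.read(&mut buffer)?;
        if read == 0 {
            break; // end of file
        }
        // Feed only the bytes that were actually read in this iteration.
        hasher.write(&buffer[..read]);
    }
    Ok(hasher.finish())
}
```

Calling `hash_file(Path::new("profile.profdata"))` twice on an unchanged file yields the same value, and the value changes when the file contents change.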

}

fn hash_file(path: &Path) -> std::io::Result<Output<Md5>> {
let mut hasher = Md5::default();
Member

How about just using StableHasher?

Contributor Author

Hmm, AFAIK StableHasher is optimized for frequent and small updates, while here we want to hash large chunks of bytes.

I originally wanted to use BLAKE3, but decided on MD5 to avoid adding a new dependency. I created a small benchmark that hashed a 4 GiB profile file filled with random data (primed in the file cache, so hopefully without much I/O overhead) using a 128 KiB block size:

MD5: 9200 - 10000ms
BLAKE3: 1099 - 1100ms
StableHasher: 1687 - 1744ms

BLAKE3 looks like a good choice, especially since it was designed for use-cases like hashing files. But the stable hasher also isn't half bad.
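To put the timings above into perspective, they can be converted into throughput. The helper below is a small sketch that assumes each reported time covers hashing the whole 4 GiB file (4096 MiB); the function name is hypothetical.

```rust
/// Size of the benchmarked file in MiB (4 GiB), as stated in the comment above.
const FILE_MIB: f64 = 4.0 * 1024.0;

/// Converts a wall-clock time in milliseconds into MiB/s for the 4 GiB file.
fn throughput_mib_per_s(millis: f64) -> f64 {
    FILE_MIB / (millis / 1000.0)
}
```

Under that assumption, MD5 comes out at roughly 410 to 445 MiB/s, StableHasher at roughly 2350 to 2430 MiB/s, and BLAKE3 at roughly 3720 MiB/s.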

Contributor Author

(Btw I also tried swapping the hashing algorithm for source files for blake3, but it didn't really help. Well, most source files don't have 4 GiB :) ).

Member

XXH3 would be the ideal choice ;)

Do we already have a dependency on BLAKE3? I agree that it looks like a great choice!

Contributor Author

We don't, I would have to add it to rustc_session. It uses CC0 1.0/Apache 2.0 license.

Contributor Author

Btw, when we do PGO for LLVM in CI, all the files have 16 GiB, but after they are combined into a single file, it's just under 30 MiB. For rustc it's 80 MiB. So maybe large optimizations here are unnecessary and we can just use StableHasher to avoid a new dependency.

Member

all the files have 16 GiB
😅

I'd just go with StableHasher for now, I think. We can always switch later.

_error_format: ErrorOutputType,
_for_crate_hash: bool,
) {
// Q1: Should we also hash the filepath itself?
Member

Three options:

  1. Don't hash the file path (it probably does not influence anything)
  2. Hash the file path but declare all HashedFilePath options [TRACKED_NO_CRATE_HASH] so we don't break reproducible builds.
  3. Hash the file path but remap it with --remap-path-prefix so we don't break reproducible builds (although that's probably annoying for build system maintainers).

@michaelwoerister
Member

Overall, I'm not sure about the approach. Maybe HashedFilePath could cache the fingerprint upon construction?

Another way to do it would be to turn retrieving the paths into eval_always queries. E.g.

query pgo_use_path(_: ()) -> ContentHashedFilePath {
    eval_always
    desc { "path to .profdata file used by LLVM" }
}

#[derive(HashStable)]
struct ContentHashedFilePath {
    path: PathBuf,
    /// A fingerprint of the file contents that `path` points to.
    hash: Fingerprint,
}

The HashedFilePath struct used in Session could then mark accesses to the path as unsafe, so that it's unlikely that someone accidentally uses it instead of the query.

The advantage of that approach is that the query system will take care of caching the result of the expensive operation, and that changes to the contents of the .profdata file would not invalidate the whole session, just the codegen part. (Not sure how likely that is in practice though)

The downside is that it's quite a bit more complex.

@Kobzol
Contributor Author

Kobzol commented Aug 19, 2022

The advantage of that approach is that the query system will take care of caching the result of the expensive operation, and that changes to the contents of the .profdata file would not invalidate the whole session, just the codegen part. (Not sure how likely that is in practice though)

This could actually be quite useful, because I think that for local experimentation it's quite common to repeatedly recompile the code with different profiles. Currently, if the profile changes, all dependencies are recompiled, which can be quite time-consuming. Would this invalidation also work for dependencies? In other words, if I implement this query and the profile changes, will it only redo codegen for dependencies, without recompiling the rest of them?

@michaelwoerister
Member

In other words, if I implement this query and the profile changes, will it only do codegen for dependencies, without recompiling the rest of the dependencies?

IIRC, Cargo does not compile dependencies incrementally because it makes the assumption that those change very infrequently. Under that assumption, the query approach would not stop dependencies from being recompiled completely. I'm not sure if the CARGO_INCREMENTAL env var has any influence on this.

You could check by running Cargo with --verbose and see if -Cincremental shows up in the rustc commandline for dependencies.

@Kobzol
Contributor Author

Kobzol commented Aug 19, 2022

You could check by running Cargo with --verbose and see if -Cincremental shows up in the rustc commandline for dependencies.

Indeed -C incremental is not there, so this benefit wouldn't work. Then I wonder if we should just use the simpler approach and use something like Option<Hash> to cache the hash inside the hashed file struct?

@michaelwoerister
Member

Then I wonder if we should just use the simpler approach and use something like Option<Hash> to cache the hash inside the hashed file struct?

Yes, let's do that. HashedFilePath::new() can do the hashing, right?

Maybe I'd rename HashedFilePath to ContentHashedFilePath, btw, just to make it clearer what's going on.

@Kobzol Kobzol force-pushed the profile-use-track-file-hash branch from df62250 to 827bc7d Compare August 19, 2022 10:40
@Kobzol
Contributor Author

Kobzol commented Aug 19, 2022

Yes, let's do that. HashedFilePath::new() can do the hashing, right?

Indeed, that's a good idea. Previously, if the file changed during compilation, it could be hashed several times, each time producing a different hash, which is bad.

I renamed the struct, switched to StableHasher and changed the code to compute the hash in ContentHashedFilePath::new().
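As an illustration of the change described above, here is a rough sketch of a struct that hashes the file once in its constructor. This is not the PR's code: the field names are hypothetical, `DefaultHasher` stands in for StableHasher, and the `Hash` impl follows the first of the three options from the review (the path itself is not hashed).

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::io;
use std::path::{Path, PathBuf};

/// A path together with a hash of the file contents, computed once.
#[derive(Debug, Clone)]
struct ContentHashedFilePath {
    path: PathBuf,
    content_hash: u64, // hypothetical field; stand-in for a real fingerprint
}

impl ContentHashedFilePath {
    /// Reads the file once and stores its hash, so later changes to the
    /// file on disk cannot produce a different hash for the same value.
    fn new(path: PathBuf) -> io::Result<Self> {
        let bytes = fs::read(&path)?;
        let mut hasher = DefaultHasher::new();
        hasher.write(&bytes);
        Ok(Self { path, content_hash: hasher.finish() })
    }

    /// The path that was hashed at construction time.
    fn path(&self) -> &Path {
        &self.path
    }
}

/// Only the cached content hash is fed to the hasher; the path is skipped.
impl Hash for ContentHashedFilePath {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.content_hash.hash(state);
    }
}
```

Because the hash is fixed in `new()`, every later use within the same session sees the same value even if the profile is overwritten on disk in the meantime.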

@michaelwoerister
Member

I just realized that the current implementation will cause the file to be hashed in non-incremental mode too :/ That doesn't seem optimal, right?

@Kobzol
Contributor Author

Kobzol commented Aug 19, 2022

Since the LLVM/rustc profiles from our CI, which are quite large in their raw form, result in a 30 - 80 MiB file after merge, and only the merged file then goes to PGO (the raw files always have to be merged first AFAIK), I'd say that it's not such a big deal and in most cases the hashing will be very fast.

But even then, I think that we should hash the profile even with non-incremental mode. Consider this:

$ CARGO_INCREMENTAL=0 RUSTFLAGS="-Cprofile-use=profile.prof" cargo build --release
# Change profile.prof
$ CARGO_INCREMENTAL=0 RUSTFLAGS="-Cprofile-use=profile.prof" cargo build --release
# In current rustc, the crate is not recompiled!

I think that we should invalidate the profile even in non-incremental mode, or not?

@michaelwoerister
Member

Decisions about whether to recompile something are made by Cargo. The hash we are talking about does not influence anything in non-incremental mode, if I understand correctly. Cargo would need to recognize the .profdata file as an input to rustc. I'm not sure if it tries to parse out information from RUSTFLAGS.

@michaelwoerister
Member

To elaborate, there's a two-level process involved here:

  1. Cargo looks at the filesystem timestamps of compilation inputs to decide whether to invoke rustc or not.
  2. rustc might use the incr. comp. cache to do less work, if it is invoked incrementally.
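The first step of this two-level process can be modelled with a tiny freshness check. This is a deliberately simplified sketch, not Cargo's actual fingerprinting logic; the function name and signature are hypothetical.

```rust
use std::time::SystemTime;

/// Returns true if the output is missing or any tracked input is newer than it.
/// Simplified model of a timestamp-based freshness check.
fn needs_rebuild(output_mtime: Option<SystemTime>, input_mtimes: &[SystemTime]) -> bool {
    match output_mtime {
        // No previous output: the compiler must be invoked.
        None => true,
        // Rebuild only if some tracked input was modified after the output.
        Some(out) => input_mtimes.iter().any(|input| *input > out),
    }
}
```

The key point for this thread is that only tracked inputs take part in the check: if the profile file is not in `input_mtimes`, changing it never causes rustc to be invoked, regardless of what rustc would hash internally.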

@Kobzol
Contributor Author

Kobzol commented Aug 19, 2022

Aha. Well, but this fact makes this whole PR obsolete :) Since Cargo won't even invoke rustc, the code will not get recompiled even if the profile changes. I tried it now with this PR and indeed Cargo doesn't even invoke rustc, so the code doesn't get recompiled when the profile file changes.

@michaelwoerister
Member

I don't think the PR is obsolete -- we still need to deal with the cases where rustc does get invoked with -Cincremental so we don't re-use out-dated object files.
But yes, Cargo also needs to be made aware that the .profdata is an input to compilation. The Cargo folks would know more about how to do that cleanly.

@Kobzol
Contributor Author

Kobzol commented Aug 19, 2022

I discussed this on the Cargo zulip stream (https://rust-lang.zulipchat.com/#narrow/stream/246057-t-cargo/topic/Tracking.20PGO.20profile.20files.20in.20cargo) and it seems that there's a way forward. Since this change is relatively self-contained, should we merge it now? Or do you want me to make the Cargo related changes in this PR too?

@michaelwoerister
Member

I think the cargo related changes should go into a separate PR.

Regarding this PR: I'm still not really happy about hashing the file in non-incremental mode. I'll think about it some more over the weekend.

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 7, 2022
…chaelwoerister

Track PGO profiles in depinfo

This PR makes sure that PGO profiles (`-Cprofile-use` and `-Cprofile-sample-use`) are tracked in depinfo, so that when they change, the compilation session will be invalidated.

This approach was discussed on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/246057-t-cargo/topic/Tracking.20PGO.20profile.20files.20in.20cargo).

I tried it locally and it seems that the code is recompiled just with this change, and rust-lang#100413 is not even needed. But it's possible that not everything required is recompiled, so we will probably want to land both changes.

Another approach to implement this could be to store the PGO profiles in `sess.parse_sess.file_depinfo` when the session is being created, but then the paths would have to be converted to a string and then to a symbol, which seemed unnecessarily complicated.

CC `@michaelwoerister`

r? `@Eh2406`
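For context on the depinfo approach in the referenced commit: dep-info files are Makefile-style rules that list the inputs of a compilation. The sketch below shows how such a rule could be rendered with a profile file included among the dependencies. The function name is hypothetical and the escaping is simplified to spaces only, so treat it as an illustration rather than rustc's implementation.

```rust
/// Renders one Makefile-style dependency rule: `target: dep1 dep2 ...`.
/// Spaces in paths are escaped with a backslash (simplified escaping).
fn render_dep_rule(target: &str, deps: &[&str]) -> String {
    let escape = |p: &str| p.replace(' ', "\\ ");
    let mut rule = format!("{}:", escape(target));
    for dep in deps {
        rule.push(' ');
        rule.push_str(&escape(dep));
    }
    rule
}
```

Once the profile path appears in such a rule, a build tool that reads the file can treat the profile like any other input and rebuild when it changes.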
@bors
Contributor

bors commented Sep 8, 2022

☔ The latest upstream changes (presumably #101577) made this pull request unmergeable. Please resolve the merge conflicts.

@michaelwoerister
Member

@Kobzol, if you are still interested in this: I think the best way forward is to handle this in the query system. That way we won't do any unnecessary work in non-incremental mode, and in incremental mode invalidation will be limited to just the LLVM part.

I imagine it to work like this:

  • Add queries for the profile-use and profile-sample-use paths. The query provider reads the path from the session (might be good to somehow poison the field in the session at that point, so it's not accessible anymore -- but this is a more general problem).
  • The result type of these queries holds the path but its HashStable implementation will hash the contents of the file (we have to be careful to not hash the file multiple times here).
  • We add a field for the PGO paths to the ModuleCodegen struct, so backends have to access the queries, which will add the appropriate dependency edges.

The rest would be handled by the existing incr. comp. infrastructure.
What do you think?
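One ingredient of the outline above, hashing the file lazily and at most once, can be sketched outside of rustc with `std::sync::OnceLock` (stable since Rust 1.70). The type and method names here are hypothetical, `DefaultHasher` is a stand-in for the compiler's hasher, and none of this touches the real query system.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::Hasher;
use std::io;
use std::path::PathBuf;
use std::sync::OnceLock;

/// Holds a path and computes the content hash only when first requested.
struct LazyHashedPath {
    path: PathBuf,
    hash: OnceLock<u64>,
}

impl LazyHashedPath {
    /// Performs no I/O, so a caller that never asks for the hash pays nothing.
    fn new(path: PathBuf) -> Self {
        Self { path, hash: OnceLock::new() }
    }

    /// Hashes the file on the first successful call and returns the cached
    /// value afterwards, even if the file changes on disk in between.
    fn content_hash(&self) -> io::Result<u64> {
        if let Some(hash) = self.hash.get() {
            return Ok(*hash);
        }
        let bytes = fs::read(&self.path)?;
        let mut hasher = DefaultHasher::new();
        hasher.write(&bytes);
        // If another thread stored a value first, keep that one.
        Ok(*self.hash.get_or_init(|| hasher.finish()))
    }
}
```

A non-incremental build would simply never call `content_hash()`, which addresses the concern about hashing the profile when the result is not needed.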

@Kobzol
Contributor Author

Kobzol commented Oct 13, 2022

This issue is not so pressing for me now because the profile paths are tracked by Cargo, so it will rebuild the crates correctly if the profiles change.

That being said, it would be nice to also fix this on the rustc side. I don't have a lot of experience with the query system, but it sounds reasonable. I'll try it once I have more time.

@apiraino
Contributor

Based on this comment I'll tentatively switch review status. Feel free to request a review with @rustbot ready, thanks!

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 14, 2022
@Dylan-DPC
Member

@Kobzol any updates on this?

@michaelwoerister
Member

I'll be looking at an issue with #[debugger_visualizer] that might be somewhat similar in nature. Maybe fixing that might introduce some infrastructure that could also be used here.

I still think that the issue this PR addresses should be fixed eventually and I don't mind leaving it open and assigned to me as a reminder. Unless of course @Kobzol wants to get rid of the open PR 🙂 Then we can migrate this to a bug report.

@Dylan-DPC
Member

Hmm. In that case I'll mark it as blocked until that work is done.

@Dylan-DPC Dylan-DPC added S-blocked Status: Blocked on something else such as an RFC or other implementation work. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 15, 2023
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this pull request Apr 20, 2024
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this pull request Apr 27, 2024
@michaelwoerister
Member

I'm unassigning myself (see rust-lang/team#1565). I think the general outline in #100413 (comment) would still be a valid approach to solve the problem.

@michaelwoerister michaelwoerister removed their assignment Oct 7, 2024
@jieyouxu
Member

r? incremental (in case someone more familiar with incremental compilation has any advice)

Labels
S-blocked Status: Blocked on something else such as an RFC or other implementation work. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PGO profile invalidation in -Cprofile-use
10 participants