Skip to content

Conversation

@weiznich
Copy link

During a dependency review we noticed that the libfuzzer crate includes various development scripts. These development scripts shouldn't be there as they might, at some point become problematic. As of now they prevent any downstream user from enabling the [bans.build.interpreted] option of cargo deny.

I opted for using an explicit include list instead of an exclude list to prevent these files from being included in the published packages to make sure that everything that's included is an conscious choice.

During a dependency review we noticed that the libfuzzer crate includes various development scripts. These development scripts shouldn't be there as they might, at some point become problematic. As of now they prevent any downstream user from enabling the `[bans.build.interpreted]` option of cargo deny.

I opted for using an explicit include list instead of an exclude list to prevent these files from being included in the published packages to make sure that everything that's included is an conscious choice.
@Manishearth
Copy link
Member

@weiznich
Copy link
Author

weiznich commented Dec 16, 2025

Would it help you to add an CI job that verifies that everything is included by executing cargo package --all-features?

To be transparent for the reason why I'm so keen to rather use include than exclude: I'm doing that dependency review thing now for a while and for crates that use exclude I notice that they regress relatively often by including another unwanted or even unexpected file. You prevent that by using include.

@Manishearth
Copy link
Member

Would it help you to add an CI job that verifies that everything is included by executing cargo package --all-features?

Not in this case, because I'm not sure about the cross platform build. I might just defer to a different reviewer for this.

To be transparent for the reason why I'm so keen to rather use include than exclude: I'm doing that dependency review thing now for a while and for crates that use exclude I notice that they regress relatively often by including another unwanted or even unexpected file. You prevent that by using include.

That's reasonable, but then it would be much nicer for the reviewers to list exactly what gets excluded in the PR body.

@weiznich
Copy link
Author

Not in this case, because I'm not sure about the cross platform build. I might just defer to a different reviewer for this.

I added some CI jobs to verify this for all platforms supported by github actions. I made sure that these jobs pass in my fork. I would also like to add that I maintain several sys crates with bundling support myself (pq-src, mysqlclient-src), I have created others (like gdal-src) and contributes to even more of them (e.g. openssl-src), so I'm generally aware of the problems there. That doesn't mean that I'm right here, that's just for context.

If you feel that the target selection is not sufficient I'm happy to add more targets. I think anything from openssl-src shouldn't be too hard to add.

On a conceptual level: The build script adds all cpp files in libfuzzer to the cc call. So these need to be there.

libfuzzer/build.rs

Lines 26 to 29 in 217dc97

.expect("listable source directory")
.map(|de| de.expect("file in directory").path())
.filter(|p| p.extension().map(|ext| ext == "cpp") == Some(true))
.collect::<Vec<_>>();

I also included the h files and the def files as they seemed to be required as well. That shows up as build error during local testing. With that out of the way there is still the possibility that libfuzzer does something unexpected on some random niche platform, I personally wouldn't expect that.

That's reasonable, but then it would be much nicer for the reviewers to list exactly what gets excluded in the PR body.

I'm sorry to not have done that in the first place. I've opened around 50 similar PR's today, so I went with the bare minimum to keep it somewhat manageable on my side. My expectation there was that most people know what's in their repos

The scripts reported by cargo deny are:

  • ci/script.sh
  • libfuzzer/build.sh
  • libfuzzer/scripts/unbalanced_allocs.py
  • update-libfuzzer.sh

For reference that's the output of a diff between what's in the package before and after the change. (Got the files via cargo package --list)

diff /tmp/old /tmp/new                                         
2,3d1
< .github/workflows/rust.yml
< .gitignore
12,13d9
< ci/script.sh
< libfuzzer/CMakeLists.txt
63d58
< libfuzzer/README.txt
65d59
< libfuzzer/build.sh
69,71d62
< libfuzzer/scripts/unbalanced_allocs.py
< libfuzzer/standalone/StandaloneFuzzTargetMain.c
< libfuzzer/tests/CMakeLists.txt
74d64
< rust-toolchain
76d65
< update-libfuzzer.sh

As far as I can tell this only includes unused files, althoug libfuzzer/README.txt might be relevant to include?

@Manishearth
Copy link
Member

I also included the h files and the def files as they seemed to be required as well. That shows up as build error during local testing. With that out of the way there is still the possibility that libfuzzer does something unexpected on some random niche platform, I personally wouldn't expect that.

Yeah, as far as I can tell the build isn't invoking any makefiles (there aren't any), and I can't see the cc crate calling .sh files unless they're some autoconf thing (which these aren't).

So that seems fine!

I'm sorry to not have done that in the first place. I've opened around 50 similar PR's today, so I went with the bare minimum to keep it somewhat manageable on my side. My expectation there was that most people know what's in their repos

Reasonable. On the receiving end it's a bunch of work if you maintain a lot of repositories, especially ones you're not always looking at, and this is stuff that IME is rather easy to accidentally break and not notice until people come complaining to you after a release.

As far as I can tell this only includes unused files, althoug libfuzzer/README.txt might be relevant to include?

Yes, please.

I think for this crate in particular I would prefer an explicit exclude, since it vendors another package that we do not maintain. If more things get included in the future I would actually like that to require a cargo deny user to come ask us to fix that, because it might end up in a different call than we made here.

(A way to automate that would be to run cargo deny on the cargo package output)

@weiznich
Copy link
Author

On the receiving end it's a bunch of work if you maintain a lot of repositories, especially ones you're not always looking at, and this is stuff that IME is rather easy to accidentally break and not notice until people come complaining to you after a release.

No worries, I always expected that some of the cases might need more discussion than others. So it's really fine to clarify that and find the best solution for an particular case. It's not like there is only one possible way to resolve this.

I think for this crate in particular I would prefer an explicit exclude

I pushed another update that switches to exclude with all (beside the README.txt) files listed in the diff above. I nevertheless would let the CI jobs there so that you can be a bit more sure everything works in the future?


As final semi-related note: Given that you are part of the relevant rust project teams: It might be good to have an official documented position about whether to include tests or not from from the published package. More general speaking it might also desirable to switch to in include based approach (with a reasonable default) in a future rust edition to make it explicit what's in the published crate. Obviously the default won't work for all crates, but it might be better than the current status. You just find too much things in published packages that shouldn't be there. (If necessary I can certainly provide examples for that)

@Manishearth
Copy link
Member

Manishearth commented Dec 16, 2025

Note: I don't actually think I'm a part of any of the relevant Rust project teams here.

It might be good to have an official documented position about whether to include tests or not from from the published package

Maybe, but honestly there are too many disparate pressures in opposite directions for there to be a good consensus on that. And were we to have that discussion I strongly suspect it would fall on the side of "include as much as you can".

Similarly for the include-based default, I think the exclude-based default is deliberate because by and large the cost of accidentally forgetting to include the right file is worse.

The cost of not including the right file can be broken builds (yes, even though cargo publish checks the build), since suddenly your CI'd configuration is not what is actually published. Seen this happen a bunch of times and it's always frustrating.

The cost of not excluding the right file can be

  • Licensing issues for the package (should be manually resolved anyway!)
  • Packages that are overly large (not usually a blocker, but fixable)
  • Issues with tools like cargo-deny. People using those tools can ask their dependencies to exclude problematic files

I've been in enough situations where I've had to ask upstream to exclude files for those reasons, but I think it does make sense that it's an explicit ask rather than an implicit thing managed by tooling because it has almsot never been a situation where I 100% know for sure what the maintainer's preference will be.

The core problem is that there isn't and won't ever really be a consensus on what constitutes a "problematic file". Large files are often problematic but sometimes necessary for tests. Same goes for test files under a different license. Additional dev scripts are usually unnecessary but some packagers like having them.

There's a lot of people packaging crates.io packages into other places (distros, or vendored sources), where they actually do want various things like tests and dev scripts. It's always a tradeoff, and I'm not really convinced there's a one size fits all answer there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants