Proposal: make fuzzing a first class citizen
There have been many successful attempts to use AFL with Rust programs; see, e.g., @frewsxcv's afl.rs. We can go one step further and make guided fuzzing a common way to test Rust code.
Instrumentation
Guided fuzzing requires code instrumentation so that the fuzzing engines get feedback from the code execution and can guide a) mutations and b) corpus expansion. Since the Rust compiler uses LLVM as its backend, such instrumentation is already available:
https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-pcs-with-guards (control flow feedback)
https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-data-flow (data flow feedback)
We may need to expose some LLVM flags through the Rust frontend, but that should be all.
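To make the feedback loop concrete, here is a toy model of control flow feedback. It is only a sketch: the `Coverage` type, the hand-written `trace` calls, and the `parse` function are hypothetical stand-ins for what the compiler would insert automatically, and no real sanitizer callback is used.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for compiler-inserted coverage instrumentation:
/// it records which guard ids were reached during one execution.
#[derive(Default)]
pub struct Coverage {
    hit: HashSet<u32>,
}

impl Coverage {
    /// Models the per-edge callback that instrumentation would call.
    pub fn trace(&mut self, guard: u32) {
        self.hit.insert(guard);
    }
}

/// Toy code under test, instrumented by hand for illustration.
pub fn parse(data: &[u8], cov: &mut Coverage) {
    cov.trace(1);
    if data.first() == Some(&b'F') {
        cov.trace(2);
        if data.get(1) == Some(&b'U') {
            cov.trace(3);
        }
    }
}

/// Runs one input and reports whether it reached a guard that no earlier
/// input reached. An engine would keep such inputs in its corpus.
pub fn adds_coverage(seen: &mut HashSet<u32>, data: &[u8]) -> bool {
    let mut cov = Coverage::default();
    parse(data, &mut cov);
    let before = seen.len();
    seen.extend(cov.hit);
    seen.len() > before
}
```

An engine built on this idea would mutate the inputs for which `adds_coverage` returns true, which is how it gradually gets past the nested checks in `parse`.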
Fuzzing engines
This LLVM instrumentation is already supported by at least AFL, libFuzzer and honggfuzz. We expect more engines to appear in the near future, and it's important to keep them plug-compatible. That way, using a new engine on a vast body of code will be trivial.
By "fuzzing engine" we should understand a wider class of tools, including e.g. concolic execution tools.
Interface
In C/C++ we use the following interface for things that need fuzzing (we call these things fuzz targets):
#include <stddef.h>  // size_t
#include <stdint.h>  // uint8_t

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  DoSomethingInterestingWithMyAPI(Data, Size);
  return 0;  // Non-zero return values are reserved for future use.
}
At least as a start I propose to have something similar in (safe) Rust.
The interface should not depend on any particular fuzzing engine -- engines should be interchangeable.
The interface should also allow for both in-process and out-of-process engines.
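As a rough illustration of what "something similar in (safe) Rust" could look like, here is a minimal sketch. The names `fuzz_target` and `parse_header` are hypothetical and not part of any existing crate. The only `unsafe` code is a thin adapter that turns the C-style pointer and length into a slice.

```rust
use std::slice;

/// Hypothetical code under test: expects "RS" followed by a big-endian u16.
pub fn parse_header(data: &[u8]) -> Option<u16> {
    if data.len() < 4 || !data.starts_with(b"RS") {
        return None;
    }
    Some(u16::from_be_bytes([data[2], data[3]]))
}

/// The fuzz target itself: safe Rust, taking only a byte slice.
/// Nothing here refers to a particular fuzzing engine.
pub fn fuzz_target(data: &[u8]) {
    let _ = parse_header(data);
}

/// Adapter with the same signature as the C/C++ entry point.
/// In a real build this function would also need to be exported under its
/// unmangled name (`#[no_mangle]`, spelled `#[unsafe(no_mangle)]` in the
/// 2024 edition) so that an engine can link against it.
#[allow(non_snake_case)]
pub unsafe extern "C" fn LLVMFuzzerTestOneInput(data: *const u8, size: usize) -> i32 {
    let input: &[u8] = if data.is_null() || size == 0 {
        &[]
    } else {
        // Assumption: the engine passes a pointer valid for `size` bytes.
        unsafe { slice::from_raw_parts(data, size) }
    };
    fuzz_target(input);
    0 // Non-zero return values are reserved for future use.
}
```

Keeping the target as a plain `fn(&[u8])` means an out-of-process engine could call the same function from a small `main` that reads a file or stdin, without touching the adapter.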
In-process fuzzing
With some rare exceptions, fuzzing needs to be in-process, i.e. the code under test and the fuzzing engine should co-exist in the same process. This typically speeds up fuzzing by 1-2 orders of magnitude and often makes deployment simpler. libFuzzer is in-process, AFL has an in-process (aka persistent) mode, and so does honggfuzz.
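A toy driver shows what "in-process" means in practice: the target is called in a loop inside the driver's own process, with no process spawned per input. This is only a sketch; `run_in_process` and `toy_target` are hypothetical names, and catching the panic is a convenience of this example (real engines usually treat a crash as fatal and report the offending input).

```rust
use std::panic;

/// Hypothetical code under test: panics on inputs that start with '!'.
pub fn toy_target(data: &[u8]) {
    if data.starts_with(b"!") {
        panic!("toy_target rejected the input");
    }
}

/// Calls `target` on every input within the current process and returns
/// the inputs that made it panic.
pub fn run_in_process<F>(target: F, inputs: &[Vec<u8>]) -> Vec<Vec<u8>>
where
    F: Fn(&[u8]),
{
    let mut crashes = Vec::new();
    for input in inputs {
        // AssertUnwindSafe: this sketch does not reuse any state that a
        // panic could leave half-updated.
        let result = panic::catch_unwind(panic::AssertUnwindSafe(|| {
            target(input.as_slice())
        }));
        if result.is_err() {
            crashes.push(input.clone());
        }
    }
    crashes
}
```

Calling `run_in_process(toy_target, &inputs)` with a few byte vectors returns only those that triggered the panic. The speedup mentioned above comes from avoiding a process start for each iteration of this loop.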
Structured fuzzing
A frequent question about fuzzing is how to fuzz inputs that consist of several chunks of data, or even tree-like data structures. One possible answer is fuzzing protobufs. This can be discussed later and separately from this proposal.
Fuzzing for equivalence
One interesting special case is fuzzing for equivalence (e.g. to compare Rust and C implementations of the same thing). See my recent write-up. This might be especially interesting for projects that re-implement commonly used C libraries, such as https://github.com/briansmith/ring.
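An equivalence fuzz target can be sketched as follows. Both checksum functions are hypothetical and written in Rust to keep the example self-contained. In the scenario described above, one of them would be a call into the C implementation through FFI.

```rust
/// Hypothetical reference implementation: byte sum modulo 2^32.
pub fn checksum_reference(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &b| acc.wrapping_add(u32::from(b)))
}

/// Hypothetical re-implementation that processes four bytes at a time.
pub fn checksum_chunked(data: &[u8]) -> u32 {
    let mut chunks = data.chunks_exact(4);
    let mut sum = 0u32;
    for c in &mut chunks {
        let part = u32::from(c[0]) + u32::from(c[1]) + u32::from(c[2]) + u32::from(c[3]);
        sum = sum.wrapping_add(part);
    }
    // Handle the trailing bytes that did not fill a whole chunk.
    for &b in chunks.remainder() {
        sum = sum.wrapping_add(u32::from(b));
    }
    sum
}

/// Equivalence fuzz target: any input on which the two implementations
/// disagree makes the assertion fail, which the engine reports as a crash.
pub fn fuzz_equivalence(data: &[u8]) {
    assert_eq!(
        checksum_reference(data),
        checksum_chunked(data),
        "implementations disagree"
    );
}
```

The target needs no oracle beyond the second implementation, which is why this style suits re-implementations of existing libraries.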
Continuous fuzzing
Fuzzing is often useful as a one-off effort, but it really shines if used continuously.
One of the services that provide infrastructure for continuous fuzzing is https://github.com/google/oss-fuzz and we'd like to see important Rust projects there.
See also
- It should be possible to combine fuzzing with sanitizers ([WIP] Sanitizers support. #31605) to catch bugs in unsafe Rust and to catch leaks in safe Rust.
- A similar discussion in Go: proposal: cmd/go: make fuzzing a first class citizen, like tests or benchmarks golang/go#19109