[libfuzzer] update Efficient Fuzzer Guide and small fixes to documentation.

R=aizatsky@chromium.org, inferno@chromium.org, ochang@chromium.org
BUG=595751

Review URL: https://codereview.chromium.org/1855373008

Cr-Commit-Position: refs/heads/master@{#385857}
marcosholgado · Apr 7, 2016 · 27ea9c2
1 parent 00daa0b
Showing 4 changed files with 47 additions and 15 deletions.
diff --git a/testing/libfuzzer/README.md b/testing/libfuzzer/README.md
@@ -14,7 +14,7 @@ libFuzzer is an in-process coverage-driven evolutionary fuzzer. It helps
 engineers to uncover potential security & stability problems earlier.
 
 *** note
-**Requirements:** libFuzzer in chrome is supported with GN on Linux only. 
+**Requirements:** libFuzzer in Chrome is supported with GN on Linux only. 
 ***
 
 ## Integration Status

diff --git a/testing/libfuzzer/clusterfuzz.md b/testing/libfuzzer/clusterfuzz.md
@@ -10,7 +10,7 @@ executes libFuzzer tests on scale.
 
 ## Status Links
 
-* [Buildbot] - status of all libFuzzer builds
+* [Buildbot] - status of all libFuzzer builds.
 * [ClusterFuzz Fuzzer Status] - fuzzing metrics, links to crashes and coverage 
 reports.
 * [Corpus GCS Bucket] - current corpus for each fuzzer. Can be used to upload
@@ -20,7 +20,7 @@ bootstrapped corpus.
 
 The integration between libFuzzer and ClusterFuzz consists of:
 
-* Build rules definition in [fuzzer_test.gni]
+* Build rules definition in [fuzzer_test.gni].
 * [Buildbot] that automatically discovers fuzzers using `gn refs` facility, 
 builds fuzzers with multiple sanitizers and uploads binaries to a special
 GCS bucket. Build bot recipe is defined in [chromium_libfuzzer.py].
@@ -30,7 +30,7 @@ corpus is minimized to reduce number of duplicates and/or reduce effect of
 parasitic coverage. 
 * [ClusterFuzz Fuzzer Status] displays fuzzer runtime 
 metrics as well as provides links to crashes and coverage reports. The information
-is collected once a day.
+is collected every 30 minutes.
 
 
 [Buildbot]: https://goto.google.com/libfuzzer-clusterfuzz-buildbot

diff --git a/testing/libfuzzer/efficient_fuzzer.md b/testing/libfuzzer/efficient_fuzzer.md
@@ -13,9 +13,9 @@ Corpus is usually maintained between multiple fuzzer runs.
 
 There are several metrics you should look at to determine your fuzzer effectiveness:
 
-* fuzzer speed (exec/s)
-* corpus size
-* coverage
+* [fuzzer speed](#Fuzzer-Speed) (exec/s)
+* [corpus size](#Corpus-Size)
+* [coverage](#Coverage)
 
 You can collect these metrics manually or take them from [ClusterFuzz status]
 pages.
@@ -32,6 +32,7 @@ Because libFuzzer performs randomized search, it is critical to have it as fast
 as possible. You should try to get to at least 1,000 exec/s. Profile the fuzzer
 using any standard tool to see where it spends its time.
 
+
 ### Initialization/Cleanup
 
 Try to keep your fuzzing function as simple as possible. Prefer to use static
@@ -41,11 +42,39 @@ every single run.
 Fuzzers don't have to shutdown gracefully (we either kill them or they crash
 because sanitizer has found a problem). You can skip freeing static resource.
 
-Of course all resources allocated withing `LLVMFuzzerTestOneInput` function
+Of course all resources allocated within `LLVMFuzzerTestOneInput` function
 should be deallocated since this function is called millions of times during
 one fuzzing session.
 
 
+### Memory Usage
+
+Avoid allocation of dynamic memory wherever possible. Instrumentation works
+faster for stack-based and static objects than for heap allocated ones.
+
+It is always a good idea to play with different versions of a fuzzer to find the
+fastest implementation.
+
+
+### Maximum Testcase Length
+
+Experiment with different values of `-max_len` parameter. This parameter often
+significantly affects execution speed, but not always.
+
+1) Define which `-max_len` value is reasonable for your target. For example, it
+may be useless to fuzz an image decoder with too small value of testcase length.
+
+2) Increase the value defined on previous step. Check its influence on execution
+speed of fuzzer. If speed doesn't drop significantly for long inputs, it is fine
+to have some bigger value for `-max_len`.
+
+In general, bigger `-max_len` value gives better coverage. Coverage is main
+priority for fuzzing. However, low execution speed may result in waste of
+resources used for fuzzing. If large inputs make fuzzer too slow you have to
+adjust value of `-max_len` and find a trade-off between coverage and execution
+speed.
+
+
 ## Corpus Size
 
 After running for a while the fuzzer would reach a plateau and won't discover
@@ -58,7 +87,7 @@ magic numbers etc. The easiest way to diagnose this problem is to generate a
 [coverage report](#Coverage). To fix the issue you can:
 
 * change the code (e.g. disable crc checks while fuzzing)
-* prepare [corpus seed](#Corpus-Seed).
+* prepare [corpus seed](#Corpus-Seed)
 * prepare [fuzzer dictionary](#Fuzzer-Dictionary)
 
 ## Coverage
@@ -76,7 +105,7 @@ option variable if your are using another sanitizer (e.g. `MSAN_OPTIONS`).
 `sancov_path` can be omitted by adding llvm bin directory to `PATH` environment
 variable.
 
-## Corpus Seed
+### Corpus Seed
 
 You can pass a corpus directory to a fuzzer that you run manually:
 
@@ -87,12 +116,15 @@ You can pass a corpus directory to a fuzzer that you run manually:
 The directory can initially be empty. The fuzzer would store all the interesting
 items it finds in the directory. You can help the fuzzer by "seeding" the corpus:
 simply copy interesting inputs for your function to the corpus directory before
-running. This works especially well for file-parsing functionality: just
-use some valid files from your test suite.
+running. This works especially well for strictly defined file formats or data
+transmission protocols.
+* For file-parsing functionality just use some valid files from your test suite.
+* For protocol processing targets put raw streams from test suite into separate
+files.
 
 After discovering new and interesting items, [upload corpus to ClusterFuzz].
 
-## Fuzzer Dictionary
+### Fuzzer Dictionary
 
 It is very useful to provide fuzzer a set of common words/values that you expect
 to find in the input. This greatly improves efficiency of finding new units and
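
Note on the "Maximum Testcase Length" section added above: the two-step procedure (pick a reasonable `-max_len`, then increase it while watching execution speed) can be sketched as a small script. This is only an illustration, not part of the commit. It assumes libFuzzer status lines carry an `exec/s: N` field on stderr and that `-max_total_time` is used to bound each trial run. The helper names, the candidate lengths, and the use of the guide's 1,000 exec/s target as the cut-off are all hypothetical choices.

```python
import re
import subprocess
import sys

# libFuzzer status lines are assumed to contain a field such as "exec/s: 1234".
EXEC_RATE = re.compile(r"exec/s: (\d+)")


def last_exec_rate(log_text):
    """Return the last exec/s value reported in a libFuzzer log, or None."""
    rates = EXEC_RATE.findall(log_text)
    return int(rates[-1]) if rates else None


def pick_max_len(rates_by_len, min_rate=1000):
    """Return the largest max_len whose measured speed is still >= min_rate.

    Falls back to the fastest candidate when none reaches min_rate.
    """
    fast_enough = [n for n, rate in rates_by_len.items() if rate >= min_rate]
    if fast_enough:
        return max(fast_enough)
    return max(rates_by_len, key=rates_by_len.get)


def measure(fuzzer, corpus_dir, max_len, seconds=60):
    """Run the fuzzer briefly with one -max_len value and report exec/s."""
    cmd = [fuzzer, corpus_dir, f"-max_len={max_len}", f"-max_total_time={seconds}"]
    # libFuzzer writes its status lines to stderr.
    result = subprocess.run(cmd, stderr=subprocess.PIPE, text=True)
    return last_exec_rate(result.stderr)


# Usage (hypothetical): python pick_max_len.py ./out/libfuzzer/url_parse_fuzzer corpus_dir
if __name__ == "__main__" and len(sys.argv) == 3:
    fuzzer, corpus_dir = sys.argv[1], sys.argv[2]
    candidates = [64, 256, 1024, 4096]  # hypothetical values to try
    rates = {n: measure(fuzzer, corpus_dir, n) or 0 for n in candidates}
    print(rates, "->", pick_max_len(rates))
```

The cut-off is deliberately crude: the guide asks for a trade-off between coverage and speed, so in practice the measured rates should be read alongside a coverage report rather than used on their own.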

diff --git a/testing/libfuzzer/getting_started.md b/testing/libfuzzer/getting_started.md
@@ -1,7 +1,7 @@
 # Getting Started with libFuzzer in Chrome
 
 *** note
-**Prerequisites:** libFuzzer in chrome is supported with GN on Linux only. 
+**Prerequisites:** libFuzzer in Chrome is supported with GN on Linux only. 
 ***
 
 This document will walk you through:
@@ -70,7 +70,7 @@ Build with ninja as usual and run:
 
 ```bash
 ninja -C out/libfuzzer url_parse_fuzzer
-./out/libfuzzer url_parse_fuzzer
+./out/libfuzzer/url_parse_fuzzer
 ```
 
 Your fuzzer should produce output like this: