From a6f8629d0c954a259cf43f593d4aca125e299897 Mon Sep 17 00:00:00 2001
From: aizatsky <aizatsky@chromium.org>
Date: Thu, 17 Mar 2016 17:22:24 -0700
Subject: [PATCH] [libfuzzer] First part of libfuzzer-chrome documentation.

I expect to add 2 more documents in the follow up:

- clusterfuzz-libfuzzer integration documentation (build bots, corpus, status links, reports)
- reference (fuzzer_test reference, fuzzer options, dictionaries, etc.)

BUG=539572

Review URL: https://codereview.chromium.org/1809843002

Cr-Commit-Position: refs/heads/master@{#381847}
---
 testing/libfuzzer/README.md           |  32 +++++++
 testing/libfuzzer/clusterfuzz.md      |   7 ++
 testing/libfuzzer/efficient_fuzzer.md |  94 ++++++++++++++++++++
 testing/libfuzzer/getting_started.md  | 119 ++++++++++++++++++++++++++
 4 files changed, 252 insertions(+)
 create mode 100644 testing/libfuzzer/README.md
 create mode 100644 testing/libfuzzer/clusterfuzz.md
 create mode 100644 testing/libfuzzer/efficient_fuzzer.md
 create mode 100644 testing/libfuzzer/getting_started.md

diff --git a/testing/libfuzzer/README.md b/testing/libfuzzer/README.md
new file mode 100644
index 00000000000000..08638f3065bbe8
--- /dev/null
+++ b/testing/libfuzzer/README.md
@@ -0,0 +1,32 @@
+# Libfuzzer in Chrome
+
+[g.co/libfuzzer-chrome]
+
+This directory contains integration between [LibFuzzer] and Chrome.
+Libfuzzer is an in-process coverage-driven evolutionary fuzzer. It helps
+engineers to uncover potential security & stability problems earlier.
+
+*** note
+**Requirements:** Libfuzzer in Chrome is supported with GN on Linux only.
+***
+
+## Integration Status
+
+Fuzzer tests are well integrated with the Chrome build system and the
+distributed ClusterFuzz fuzzing system. Tracking bug: [crbug.com/539572].
+
+## Documentation
+
+* [Getting Started Guide] walks you through all the steps necessary to create
+your fuzzer and submit it to ClusterFuzz.
+* [Efficient Fuzzer Guide] explains how to measure fuzzer effectiveness and
+ways to improve it.
+* [ClusterFuzz Integration] describes integration between ClusterFuzz and 
+libfuzzer.
+
+
+[LibFuzzer]: http://llvm.org/docs/LibFuzzer.html
+[crbug.com/539572]: https://bugs.chromium.org/p/chromium/issues/detail?id=539572
+[Getting Started Guide]: ./getting_started.md
+[Efficient Fuzzer Guide]: ./efficient_fuzzer.md
+[ClusterFuzz Integration]: ./clusterfuzz.md
diff --git a/testing/libfuzzer/clusterfuzz.md b/testing/libfuzzer/clusterfuzz.md
new file mode 100644
index 00000000000000..13a05eabe9a01f
--- /dev/null
+++ b/testing/libfuzzer/clusterfuzz.md
@@ -0,0 +1,7 @@
+# Libfuzzer and ClusterFuzz Integration
+
+
+
+## Fuzzer Status
+
+fuzzer status goes here.
diff --git a/testing/libfuzzer/efficient_fuzzer.md b/testing/libfuzzer/efficient_fuzzer.md
new file mode 100644
index 00000000000000..d72b752c1e751c
--- /dev/null
+++ b/testing/libfuzzer/efficient_fuzzer.md
@@ -0,0 +1,94 @@
+# Efficient Fuzzer
+
+This document describes how to measure your fuzzer's efficiency and how
+to improve it.
+
+## Overview
+
+Being a coverage-driven fuzzer, Libfuzzer considers an input *interesting*
+if it results in new coverage. The set of all interesting inputs is called
+the *corpus*.
+Items in the corpus are constantly mutated in search of new interesting inputs.
+The corpus is usually maintained between multiple fuzzer runs.
+
+There are several metrics you should look at to determine your fuzzer's effectiveness:
+
+* fuzzer speed (exec/s)
+* corpus size
+* coverage
+
+You can collect these metrics manually or take them from [ClusterFuzz status]
+pages.
+
+## Fuzzer Speed
+
+Fuzzer speed is printed while the fuzzer runs:
+
+```
+#19346  NEW    cov: 2815 bits: 1082 indir: 43 units: 150 exec/s: 19346 L: 62
+```
+
+Because Libfuzzer performs a randomized search, it is critical that it runs as
+fast as possible. You should try to get to at least 1,000 exec/s. Profile the
+fuzzer using any standard tool to see where it spends its time.
+
+### Initialization/Cleanup
+
+Try to keep your fuzzing function as simple as possible. Prefer static
+initialization and shared resources over bringing the environment up and down
+on every single run.
+
+Fuzzers don't have to shut down gracefully (we either kill them or they crash
+because a sanitizer has found a problem). You can skip freeing static resources.
+
+Of course, all resources allocated within the `LLVMFuzzerTestOneInput` function
+should be deallocated since this function is called millions of times during
+one fuzzing session.
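+
+The advice above can be sketched as follows. This is a minimal illustration,
+not code from the Chrome tree: the `Environment` struct and its lookup table
+are hypothetical stand-ins for whatever expensive state your fuzzer needs.
+
+```cpp
+#include <cstddef>
+#include <cstdint>
+#include <string>
+#include <vector>
+
+// Hypothetical shared state that is expensive to build (assumption for
+// illustration only).
+struct Environment {
+  Environment() : table(256) {
+    for (size_t i = 0; i < table.size(); ++i)
+      table[i] = static_cast<uint8_t>(i ^ 0x5a);
+  }
+  std::vector<uint8_t> table;
+};
+
+extern "C" int LLVMFuzzerTestOneInput(const unsigned char* data, size_t size) {
+  // Built once on the first call and intentionally never freed.
+  static Environment* env = new Environment();
+
+  // Per-run allocation: owned by a local object, released on return.
+  std::string scratch(reinterpret_cast<const char*>(data), size);
+  for (char& c : scratch)
+    c = static_cast<char>(env->table[static_cast<unsigned char>(c)]);
+  return 0;
+}
+```
+
+The function-local static keeps setup cost out of the hot path, while the
+per-run buffer is cleaned up automatically so memory does not grow over
+millions of iterations.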
+
+
+## Corpus Size
+
+After running for a while the fuzzer will reach a plateau and stop discovering
+new interesting inputs. The corpus for reasonably complex functionality
+should contain hundreds (if not thousands) of items.
+
+A corpus that is too small indicates some code barrier that
+libfuzzer is having trouble penetrating. Common cases include checksums,
+magic numbers, etc. The easiest way to diagnose this problem is to generate a
+[coverage report](#Coverage). To fix the issue you can:
+
+* change the code (e.g. disable crc checks while fuzzing)
+* prepare a fuzzer dictionary
+* prepare [corpus seed](#Corpus-Seed).
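+
+The first option (changing the code) might look like the sketch below. The
+parser, its packet layout and `SimpleChecksum` are hypothetical. The macro
+name `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` follows a convention suggested
+in the libFuzzer documentation; whether the Chrome fuzzer build defines it is
+an assumption you should verify.
+
+```cpp
+#include <cstddef>
+#include <cstdint>
+
+// Hypothetical checksum; a real parser would call its own CRC routine.
+uint32_t SimpleChecksum(const uint8_t* data, size_t size) {
+  uint32_t sum = 0;
+  for (size_t i = 0; i < size; ++i)
+    sum = sum * 31 + data[i];
+  return sum;
+}
+
+// Hypothetical packet layout: payload followed by a 4-byte little-endian
+// checksum of the payload.
+bool ParsePacket(const uint8_t* data, size_t size) {
+  if (size < 4)
+    return false;
+  const size_t payload_size = size - 4;
+#if !defined(FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION)
+  // Skipped in fuzzing builds so mutated inputs can reach the code below.
+  const uint8_t* tail = data + payload_size;
+  const uint32_t stored = static_cast<uint32_t>(tail[0]) |
+                          (static_cast<uint32_t>(tail[1]) << 8) |
+                          (static_cast<uint32_t>(tail[2]) << 16) |
+                          (static_cast<uint32_t>(tail[3]) << 24);
+  if (stored != SimpleChecksum(data, payload_size))
+    return false;
+#endif
+  // ... payload parsing would go here ...
+  return true;
+}
+```
+
+Without such a guard, almost every mutated input is rejected at the checksum
+comparison and the fuzzer never exercises the payload parsing code.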
+
+## Coverage
+
+You can easily generate a source-level coverage report for a given corpus:
+
+```
+ASAN_OPTIONS=coverage=1:html_cov_report=1:sancov_path=./third_party/llvm-build/Release+Asserts/bin/sancov \
+  ./out/libfuzzer/my_fuzzer -runs=0 ~/tmp/my_fuzzer_corpus
+```
+
+This will produce an .html file with color-coded source code. It can be used to
+determine where your fuzzer is "stuck".
+
+## Corpus Seed
+
+You can pass a corpus directory to a fuzzer that you run manually:
+
+```
+./out/libfuzzer/my_fuzzer ~/tmp/my_fuzzer_corpus
+```
+
+The directory can initially be empty. The fuzzer stores all the interesting
+items it finds in the directory. You can help the fuzzer by "seeding" the corpus:
+simply copy interesting inputs for your function to the corpus directory before
+running. This works especially well for file-parsing functionality: just
+use some valid files from your test suite.
+
+After discovering new and interesting items, [upload corpus to ClusterFuzz].
+
+
+[ClusterFuzz status]: ./clusterfuzz.md#Fuzzer-Status
+[upload corpus to ClusterFuzz]: ./clusterfuzz.md#Upload-Corpus
diff --git a/testing/libfuzzer/getting_started.md b/testing/libfuzzer/getting_started.md
new file mode 100644
index 00000000000000..95bdf75cf6b66e
--- /dev/null
+++ b/testing/libfuzzer/getting_started.md
@@ -0,0 +1,119 @@
+# Getting Started with Libfuzzer in Chrome
+
+*** note
+**Prerequisites:** Libfuzzer in Chrome is supported with GN on Linux only.
+***
+
+This document will walk you through:
+
+* setting up your build environment.
+* creating your first fuzzer.
+* running the fuzzer and verifying its vitals.
+
+## Check Out ToT Clang
+
+Libfuzzer relies heavily on compile-time instrumentation. Because it is still
+under heavy development, you need to use ToT (top-of-tree) Clang with libfuzzer:
+
+```bash
+# In chrome/src
+LLVM_FORCE_HEAD_REVISION=1 ./tools/clang/scripts/update.py --force-local-build --without-android
+```
+
+To revert this, run the same script without setting `LLVM_FORCE_HEAD_REVISION`.
+
+## Configure Build
+
+Use the `use_libfuzzer` GN argument together with a sanitizer to generate build files:
+
+```bash
+# With address sanitizer
+gn gen out/libfuzzer '--args=use_libfuzzer=true is_asan=true enable_nacl=false' --check
+```
+
+Supported sanitizer configurations are:
+
+| GN Argument | Description |
+|--------------|----|
+| is_asan=true | enables [Address Sanitizer] to catch problems like buffer overruns. |
+| is_msan=true | enables [Memory Sanitizer] to catch problems like uninitialized reads. |
+
+
+## Write Fuzzer Function
+
+Create a new .cc file and define an `LLVMFuzzerTestOneInput` function:
+
+```cpp
+extern "C" int LLVMFuzzerTestOneInput(const unsigned char *data, size_t size) {
+  // put your fuzzing code here and use data+size as input.
+  return 0;
+}
+```
+
+[url_parse_fuzzer.cc] is a simple example of a real-world fuzzer.
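+
+As a self-contained illustration of how `data` and `size` are typically handed
+to the code under test, here is a minimal sketch. `ParseKeyValue` is a
+hypothetical function invented for this example; substitute the API you
+actually want to fuzz.
+
+```cpp
+#include <cstddef>
+#include <string>
+
+// Hypothetical function under test: splits "key=value" at the first '='.
+bool ParseKeyValue(const std::string& input,
+                   std::string* key,
+                   std::string* value) {
+  const std::string::size_type pos = input.find('=');
+  if (pos == std::string::npos)
+    return false;
+  *key = input.substr(0, pos);
+  *value = input.substr(pos + 1);
+  return true;
+}
+
+extern "C" int LLVMFuzzerTestOneInput(const unsigned char* data, size_t size) {
+  // The input is not NUL-terminated, so always pass the size explicitly.
+  const std::string input(reinterpret_cast<const char*>(data), size);
+  std::string key, value;
+  // The result is ignored: the fuzzer only looks for crashes and
+  // sanitizer reports.
+  ParseKeyValue(input, &key, &value);
+  return 0;
+}
+```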
+
+## Define GN Target
+
+Define a `fuzzer_test` GN target:
+
+```
+import("//testing/libfuzzer/fuzzer_test.gni")
+fuzzer_test("my_fuzzer") {
+  sources = [ "my_fuzzer.cc" ]
+  deps = [ ... ]
+}
+```
+
+## Build and Run Fuzzer Locally
+
+Build with ninja as usual and run:
+
+```bash
+ninja -C out/libfuzzer url_parse_fuzzer
+./out/libfuzzer/url_parse_fuzzer
+```
+
+Your fuzzer should produce output like this:
+
+```
+INFO: Seed: 1787335005
+INFO: -max_len is not provided, using 64
+INFO: PreferSmall: 1
+#0      READ   units: 1 exec/s: 0
+#1      INITED cov: 2361 bits: 95 indir: 29 units: 1 exec/s: 0
+#2      NEW    cov: 2710 bits: 359 indir: 36 units: 2 exec/s: 0 L: 64 MS: 0 
+#3      NEW    cov: 2715 bits: 371 indir: 37 units: 3 exec/s: 0 L: 64 MS: 1 ShuffleBytes-
+#5      NEW    cov: 2728 bits: 375 indir: 38 units: 4 exec/s: 0 L: 63 MS: 3 ShuffleBytes-ShuffleBytes-EraseByte-
+#6      NEW    cov: 2729 bits: 384 indir: 38 units: 5 exec/s: 0 L: 10 MS: 4 ShuffleBytes-ShuffleBytes-EraseByte-CrossOver-
+#7      NEW    cov: 2733 bits: 424 indir: 39 units: 6 exec/s: 0 L: 63 MS: 1 ShuffleBytes-
+#8      NEW    cov: 2733 bits: 426 indir: 39 units: 7 exec/s: 0 L: 63 MS: 2 ShuffleBytes-ChangeByte-
+#11     NEW    cov: 2733 bits: 447 indir: 39 units: 8 exec/s: 0 L: 33 MS: 5 ShuffleBytes-ChangeByte-ChangeASCIIInt-ChangeBit-CrossOver-
+#12     NEW    cov: 2733 bits: 451 indir: 39 units: 9 exec/s: 0 L: 62 MS: 1 CrossOver-
+#16     NEW    cov: 2733 bits: 454 indir: 39 units: 10 exec/s: 0 L: 61 MS: 5 CrossOver-ChangeBit-ChangeBit-EraseByte-ChangeBit-
+#18     NEW    cov: 2733 bits: 458 indir: 39 units: 11 exec/s: 0 L: 24 MS: 2 CrossOver-CrossOver-
+```
+
+The `... NEW ...` line appears when libfuzzer finds new and interesting input.
+An efficient fuzzer should be able to find lots of them rather quickly.
+
+The `... pulse ...` line will appear periodically to show the current status.
+
+
+## Submitting Fuzzer to ClusterFuzz
+
+ClusterFuzz builds and executes all `fuzzer_test` targets in the source tree.
+All you need to do is submit your fuzzer to the Chrome source tree.
+
+## Next Steps
+
+* After your fuzzer is submitted, you should check its [ClusterFuzz status] in
+a day or two.
+* Check the [Efficient Fuzzer Guide] to better understand your fuzzer's
+performance and for optimization hints.
+
+
+[Address Sanitizer]: http://clang.llvm.org/docs/AddressSanitizer.html
+[Memory Sanitizer]: http://clang.llvm.org/docs/MemorySanitizer.html
+[url_parse_fuzzer.cc]: https://code.google.com/p/chromium/codesearch#chromium/src/testing/libfuzzer/fuzzers/url_parse_fuzzer.cc
+[ClusterFuzz status]: ./clusterfuzz.md#Fuzzer-Status
+[Efficient Fuzzer Guide]: ./efficient_fuzzer.md