Add 'real code' section to tutorial, plus command line args docs (#525)

tedinski · web-flow · commit 3aed2b5b2c33 · 2021-09-28T15:36:36.000-05:00
* add 'real code' section to tutorial, plus command line arguments documentation

* fix typos
diff --git a/rmc-docs/src/SUMMARY.md b/rmc-docs/src/SUMMARY.md
@@ -3,15 +3,17 @@
 - [Getting started with RMC](./getting-started.md)
   - [Installation](./install-guide.md)
   - [Comparison with other tools](./tool-comparison.md)
-  - [RMC on a single file]()
+  - [RMC on a single file](./rmc-single-file.md)
   - [RMC on a crate]()
   - [Debugging failures]()
   - [Debugging non-termination]()
+  - [Debugging coverage]()
 
 - [RMC tutorial](./rmc-tutorial.md)
   - [First steps with RMC](./tutorial-first-steps.md)
   - [Failures that RMC can spot](./tutorial-kinds-of-failure.md)
   - [Loops, unwinding, and bounds](./tutorial-loops-unwinding.md)
+  - [Where to start on real code](./tutorial-real-code.md)
 
 - [RMC developer documentation]()
 
diff --git a/rmc-docs/src/getting-started.md b/rmc-docs/src/getting-started.md
@@ -1,16 +1,19 @@
 # Getting started with RMC
 
 RMC is a Rust verification tool based on _model checking_.
-With RMC, you can ensure that broad classes of problems are absent from your Rust code by writing _proof harnesses_, which are broadly similar to tests (especially property tests.)
+With RMC, you can ensure that broad classes of problems are absent from your Rust code by writing _proof harnesses_, which are broadly similar to tests (especially property tests).
 RMC is especially useful for verifying unsafe code in Rust, where many of the language's usual guarantees can no longer be checked by the compiler.
 But RMC is also useful for finding panics in safe Rust, and it can check user-defined assertions.
 
 ## Project Status
 
 RMC is currently in the initial development phase, and has not yet made an official release.
 It is under active development, but it does not yet support all Rust language features.
+(The [RMC dashboard](./dashboard.md) can help you understand our current progress.)
 If you encounter issues when using RMC we encourage you to [report them to us](https://github.com/model-checking/rmc/issues/new/choose).
 
+RMC usually syncs with the main branch of Rust every week, and so is generally up-to-date with the latest Rust language features.
+
 ## Getting started
 
 1. [Begin with the RMC installation guide.](./install-guide.md) Currently, this means checking out and building RMC.
diff --git a/rmc-docs/src/rmc-single-file.md b/rmc-docs/src/rmc-single-file.md
@@ -0,0 +1,60 @@
+# RMC on a single file
+
+For small examples, or initial learning, it's very common to run RMC on just one source file.
+The command line format for invoking RMC directly has a few common formats:
+
+```
+rmc filename.rs
+# or
+rmc filename.rs [--rmc-flags]
+# or
+rmc filename.rs [--rmc-flags] --cbmc-args [--cbmc-flags]
+```
+
+For example,
+
+```
+rmc filenames.rs --visualize --cbmc-args --object-bits 11 --unwind 15
+```
+
+## Common RMC arguments
+
+**`--visualize`** will generate a report in the local directory accessible through `report/html/index.html`.
+This report will shows coverage information, as well as give traces for each failure RMC finds.
+
+**`--function <name>`** RMC defaults to assuming the starting function is called `main`.
+You can change it to a different function with this argument.
+Note that to "find" the function given, it needs to be given the `#[no_mangle]` annotation.
+
+**`--gen-c`** will generate a C file that roughly corresponds to the input Rust file.
+This can sometimes be helpful when trying to debug a problem with RMC.
+
+**`--keep-temps`** will preserve generated files that RMC generates.
+In particular, this will include a `.json` file which is the "CBMC symbol table".
+This can be helpful in trying to diagnose bugs in RMC, and may sometimes be requested in RMC bug reports.
+
+## Common CBMC arguments
+
+RMC invokes CBMC to do the underlying solving.
+(CBMC is the "C Bounded Model Checker" but is actually a framework that supports model checking multiple languages.)
+CBMC arguments are sometimes necessary to get good results.
+
+To give arguments to CBMC, you pass `--cbmc-args` to RMC.
+This "switches modes" from RMC arguments to CBMC arguments.
+Everything else given on the command line will be assumed to be a CBMC argument, and so all RMC arguments should be provided before this flag.
+
+**`--unwind <n>`** Give a global upper bound on all loops.
+This can force termination when CBMC tries to unwind loops indefinitely.
+
+**`--object-bits <n>`** CBMC, by default, assumes there are only going to be 256 objects allocated on the heap in a single trace.
+This corresponds to a default of `--object-bits 8`.
+Rust programs often will use more than this, and so need to raise this limit.
+However, very large traces with many allocations often prove intractable to solve.
+If you run into this issue, a good first start is to raise the limit to 2048, i.e. `--object-bits 11`.
+
+**`--unwindset label_1:bound_1,label_2:bound_2,...`** Give specific unwinding bounds on specific loops.
+The labels for each loop can be discovered by running with the following CBMC flag:
+
+**`--show-loops`** Print the labels of each loop in the program.
+Useful for `--unwindset`.
+
diff --git a/rmc-docs/src/tutorial-loops-unwinding.md b/rmc-docs/src/tutorial-loops-unwinding.md
@@ -122,7 +122,7 @@ Then we can specify the bound for specific loops by name, from the command line:
 rmc src/lib.rs --cbmc-args --unwindset _RNvCs6JP7pnlEvdt_3lib17initialize_prefix.0:12
 ```
 
-The general format of the `--unwindset` option is: `label_1:bound_1,label_2:bound_1,...`.
+The general format of the `--unwindset` option is: `label_1:bound_1,label_2:bound_2,...`.
 The label is revealed by the output of `--show-loops` as we saw above.
 
 ## Summary
diff --git a/rmc-docs/src/tutorial-real-code.md b/rmc-docs/src/tutorial-real-code.md
@@ -0,0 +1,98 @@
+# Where to start on real code
+
+It can be daunting to find the right place to start writing proofs for a real-world project.
+This section will try to help you get over that hurdle.
+
+In general, you're trying to do three things:
+
+1. Find a place it'd be valuable to have a proof.
+2. Find a place it won't be too difficult to prove something, just to start.
+3. Figure out what a feasible longer-term goal might be.
+
+**By far the best strategy is to follow your testing.**
+Places where proof will be valuable are often places where you've written a lot of tests, because they're valuable there for the same reasons.
+Likewise, code structure changes to make functions more unit-testable will also make functions more amenable to proof.
+Often, by examining existing unit tests (and especially property tests), you can easily find a relatively self-contained function that's a good place to start.
+
+## Where is proof valuable?
+
+1. Where complicated things happen with untrusted user input.
+These are often the critical "entry points" into the code.
+These are also places where you probably want to try fuzz testing.
+
+2. Where `unsafe` is used extensively.
+These are often places where you'll already have concentrated a lot of tests.
+
+3. Where you have a complicated implementation that accomplishes a much simpler abstract problem.
+Ideal places for property testing, if you haven't tried that already.
+But the usual style of property tests you often write here (generate large random lists of operations, then compare between concrete and abstract model) won't be practical to directly port to model checking.
+
+4. Where normal testing "smells" intractable.
+
+## Where is it easier to start?
+
+1. Find crates or files with smaller lists of dependencies.
+Dependencies can sometimes blow up the tractability of proofs.
+This can usually be handled, but requires a lot more investment to make it happen, and so isn't a good place to start.
+
+2. Don't forget to consider starting with your dependencies.
+Sometimes the best place to start won't be your code, but in code you depend on.
+If it's used by more projects that just yours, it will be valuable to more people, too!
+
+3. Find well-tested code.
+Code structure changes to make code more unit-testable will make it more provable, too.
+
+Here are some things to avoid, when starting out:
+
+1. Lots of loops, or at least nested loops.
+As we saw in the last section, right now we often need to put upper bounds on these to make more limited claims.
+
+2. "Inductive data structures."
+These are data of unbounded size.
+(For example, linked lists and trees.)
+Much like needing to put bounds on loops, these can be hard to model since you needs to put bounds on their size, too.
+
+3. I/O code.
+RMC doesn't model I/O, so if you're depending on behavior like reading/writing to a file, you won't be able to prove anything.
+This is one obvious area where testability helps provability: often we separate I/O and "pure" computation into different functions, so we can unit test the latter.
+
+4. Deeper call graphs.
+Functions that call a lot of other functions can require more investment to make tractable.
+They may not be a good starting point.
+
+5. Significant global state.
+Rust tends to discourage this, but it still exists in some forms.
+
+
+## Your first proof
+
+A first proof will likely start in the following form:
+
+1. Nondeterministically initialize variables that will correspond to function inputs, with as few constraints as possible
+2. Call the function in question with these inputs.
+3. Don't (yet) assert any post-conditions.
+
+Running RMC on this simple starting point will help figure out:
+
+1. What unexpected constraints might be needed on your inputs to avoid "expected" failures.
+2. Whether you're over-constrained. Check the coverage report using `--visualize`. Ideally you'd see 100% coverage, and if not, it's usually because now you've over-constrained the inputs.
+3. Whether RMC will support all the Rust features involved.
+4. Whether you've started with a tractable problem.
+(If the problem is initially intractable, try `--unwind 1` and see if you can follow the techniques in the previous section to put a bound on the problem.)
+
+Once you've got something working, the next step is to prove more interesting properties than what RMC covers by default.
+You accomplish this by adding new assertions.
+These are not necessarily assertions just in your proof harness: consider also adding new assertions to the code being run.
+These count too!
+Even if a proof harness has no post-conditions being asserted directly, the assertions encountered along the way can be meaningful proof results by themselves.
+
+
+## Summary
+
+In this section:
+
+1. We got some advice on how to choose a higher-value starting point for a first proof.
+2. We got some advice on how to choose an easier starting point for a first proof.
+3. We got some advice on how to structure our first proof, at least initially.
+
+> TODO: This section is incomplete. We should probably add an example session of finding a proof to write for, perhaps, Firecracker.