|
| 1 | +Jane Street `ocamlformat` instructions |
| 2 | +====================================== |
| 3 | + |
| 4 | +This file describes the shape of the `ocamlformat` repo, with a |
| 5 | +specific eye toward making it easy to update `ocamlformat` to deal |
| 6 | +with Jane Street extensions. |
| 7 | + |
| 8 | +Overall structure |
| 9 | +----------------- |
| 10 | + |
| 11 | +The main implementation of `ocamlformat` lives in the `lib` directory. |
| 12 | +This is where the `ocamlformat` executable is driven. |
| 13 | + |
| 14 | +The most important file for our purposes here is `lib/Fmt_ast.ml`, which |
| 15 | +defines the functions that actually print the formatted AST. Other modules |
| 16 | +of interest in `lib/`: |
| 17 | + |
| 18 | +* `Fmt.ml` has combinators useful for printing. You're likely to use this |
| 19 | + module but not super likely to change it. |
| 20 | + |
| 21 | +* `Ast.ml` has various helpers, mainly for (a) constructing contexts, and (b) |
| 22 | + analyzing whether terms need parentheses. It's common to need to change the |
| 23 | + latter for new syntax. |
| 24 | + |
| 25 | +* `Cmts.ml` deals with comments (not the `cmt` file format!). Comments are not |
| 26 | + places in the extended AST, but rather maintained in a table that is checked |
| 27 | + at various points using the helpers in this module. |
| 28 | + |
| 29 | +* `Sugar.ml` has various bits of support code for sugaring/desugaring syntax |
| 30 | + (e.g., "multi-argument" functions). |
| 31 | + |
| 32 | +`ocamlformat` also includes *four* copies of the OCaml parser, all in |
| 33 | +`vendor`: |
| 34 | + |
| 35 | +* `parser-upstream` is just a copy of the upstream parser. It does not get |
| 36 | + built in this repo. Instead, it lives here only so that it can be diffed |
| 37 | + with `parser-standard`. This diff is automatically produced in |
| 38 | + `vendor/diff-parsers-upstream-std.patch`. The diff is used to monitor the |
| 39 | + drift between the standard parser and the upstream parser; the patch file |
| 40 | + has been mangled (it is produced by `tools/diff-ocaml` which uses `sed` to |
| 41 | + alter the output of `diff`) and thus cannot be applied by any known tool. |
| 42 | + |
| 43 | + You will not have to interact with `parser-upstream`. |
| 44 | + |
| 45 | +* `parser-standard` is meant to be very similar to the parser used in the |
| 46 | + OCaml compiler. Its role in `ocamlformat` is (only) to provide a safety-check: |
| 47 | + we want to make sure that the formatting does not change the AST as parsed by |
| 48 | + the compiler. So after formatting, the parsed AST is checked against the |
| 49 | + original parsed AST to make sure they are the same (modulo some normalization, |
| 50 | + as written in `lib/Normalize_std_ast.ml`). Key point: this parser and its |
| 51 | + representation are *not* pretty-printed and never interact with the |
| 52 | + pretty-printer. The *only* reason this is here is to mimic the behavior of the |
| 53 | + OCaml compiler. Accordingly, it should be as close to the OCaml compiler's |
| 54 | + parser as possible. |
| 55 | + |
| 56 | +* `parser-extended` uses an extended parsetree, capable of storing extra |
| 57 | + information useful in preserving the structure of the user's code. It is this |
| 58 | + extended parsetree that gets pretty-printed. The parser here forms part of the |
| 59 | + core implementation of `ocamlformat`. |
| 60 | + |
| 61 | +* `parser-recovery` is used for partial parsing, so that a bogus input source |
| 62 | + file does not get mangled. It was an experiment that has been discontinued by |
| 63 | + upstream and is not used with Jane Street. It uses the same parsetree as |
| 64 | + `parser-extended`. A patchfile tracking the changes between `parser-extended` |
| 65 | + and `parser-recovery` is generated. Just accept any changes that accrue there. |
| 66 | + |
| 67 | +The directory `vendor/ocaml-common` contains files that are shared between |
| 68 | +`parser-standard` and `parser-extended`, like `location.ml`. The `test` |
| 69 | +directory contains tests (see the Testing section below). |
| 70 | + |
| 71 | +Design considerations |
| 72 | +--------------------- |
| 73 | + |
| 74 | +Because the value of `parser-standard` is entirely in its ability to mimic the |
| 75 | +compiler's parser, we want to keep this parser as close as possible to the |
| 76 | +compiler's. We will want to copy over any changes made to the compiler's parser |
| 77 | +into this version of `ocamlformat`'s parser. |
| 78 | + |
| 79 | +On the other hand, the `parser-extended` can go off in its own direction: its |
| 80 | +parsetree should be designed to make pretty-printing easy. In addition, we want |
| 81 | +to make sure that incorporating upstream changes is easy. We thus feel free to |
| 82 | +edit the parsetree in `parser-extended`, but we do so by adding new |
| 83 | +constructors, not modifying existing ones. In addition, new code should be |
| 84 | +marked off with comments like `(* Jane Street extension *)` and `(* End Jane |
| 85 | +Street extension *)`. Because of the ability to extend the parsetree here, we do |
| 86 | +*not* use jane-syntax in `parser-extended`. |
| 87 | + |
| 88 | +`ocamlformat` routinely checks, at various places in `Fmt_ast`, that the thing |
| 89 | +it's about to print is an exact subterm of a "context" we're within (some larger |
| 90 | +term that we're in the middle of printing). Make sure you've designed your |
| 91 | +additions to `parser-extended` so that, while printing, you'll always recur on |
| 92 | +exact structural subterms. If you, for example, print an attribute specially |
| 93 | +and then try to recur on a term without that attribute, you're in for a bad |
| 94 | +time. |
| 95 | + |
| 96 | +Before building |
| 97 | +--------------- |
| 98 | + |
| 99 | +You will need to install several libraries. This command may work: |
| 100 | + |
| 101 | +``` |
| 102 | +opam install menhir.20210419 fix ocp-indent bechamel-js alcotest campl-streams fpath either dune-build-info uuseg ocaml-version |
| 103 | +``` |
| 104 | + |
| 105 | +Building |
| 106 | +-------- |
| 107 | + |
| 108 | +To build, run `dune build`. |
| 109 | + |
| 110 | +How to update `ocamlformat` |
| 111 | +--------------------------- |
| 112 | + |
| 113 | +The base branch to work from is called `jane`. Create a branch off of `jane`. |
| 114 | + |
| 115 | +1. Take the patch you wish to support (i.e. some PR in `flambda-backend`). |
| 116 | + Apply any changes to the `ocaml/parsing` directory to the files in |
| 117 | + `vendor/parser-standard`. Remember: this "standard" parser should be as |
| 118 | + close as possible to the compiler's. |
| 119 | + |
| 120 | + Note that some files used by both parsers are stored in |
| 121 | + `vendor/ocaml-common` and may need to be updated. Further, when |
| 122 | + incorporating new support files from the compiler, consider whether than can |
| 123 | + be shared in that directory rather than copied into each of the parser |
| 124 | + directories. This is typically the case if the support module doesn't depend |
| 125 | + on the parsetree. |
| 126 | + |
| 127 | +2. Get `ocamlformat` compiled and passing the tests. If the patch to |
| 128 | + `flambda-backend` was backward compatible, then this should be |
| 129 | + straightforward. (If your changes affect files in `vendor/ocaml-common`, this |
| 130 | + might not be so easy. That's OK. Just move on to the next step.) |
| 131 | + |
| 132 | +3. Edit the parsetree in `vendor/parser-extended/parsetree.mli` to support your |
| 133 | + new syntax. Copy over any changes to the parser and lexer from the |
| 134 | + `flambda-backend` patch, updating the parser's semantic actions as necessary. |
| 135 | + |
| 136 | +4. Edit the pretty-printer in `lib/Fmt_ast.ml` to format your new syntax nicely. |
| 137 | + This may require changes to other `lib/` modules, such as `Ast.ml` and |
| 138 | + `Sugar.ml`. |
| 139 | + |
| 140 | +5. Make the minimal changes to `parser-recovery` in order to get `ocamlformat` |
| 141 | + to compile. We do not use this feature within Jane Street (and it will be |
| 142 | + removed when merging with upstream), and so we're just keeping it on life |
| 143 | + support. Expend no extra effort here! |
| 144 | + |
| 145 | +6. Add tests. Get them to pass. See the "Testing" section below. |
| 146 | + |
| 147 | +Testing |
| 148 | +------- |
| 149 | + |
| 150 | +To just run your built ocamlformat against a file manually, run |
| 151 | +`dune exec ocamlformat -- --enable-outside-detected-project path/to/your/file.ml`. |
| 152 | + |
| 153 | +If you want to see the parsed source ocamlformat produces internally, use |
| 154 | +`dune exec -- tools/printast/printast.exe path/to/your/file.ml`. |
| 155 | + |
| 156 | +Run the tests with `dune test`. There are two kinds of tests: |
| 157 | + |
| 158 | +1) Correctly formatted files, which ocamlformat is run on to check that there |
| 159 | + are no changes. We have historically mainly added these, but not for any |
| 160 | + particularly good reason. |
| 161 | +2) Incorrectly formatted files, for which the output of ocamlformat is checked |
| 162 | + against a reference. |
| 163 | + |
| 164 | +To add a test, you add one, two or three files depending on what kind of test it |
| 165 | +is: |
| 166 | + |
| 167 | +- (Always) Add `tests/passing/tests/foo.ml` (where foo is the name of your new |
| 168 | + test). This is the file ocamlformat will be run on. |
| 169 | +- (Optional) If your file is incorrectly formatted, write the correctly |
| 170 | + formatted version in `tests/passing/tests/foo.ml.ref`. |
| 171 | +- (Optional) If it is expected `ocamlformat` will print information to stderr |
| 172 | + when running your test (uncommon) write that output to |
| 173 | + `tests/passing/tests/foo.ml.err`. |
| 174 | + |
| 175 | +Now, run `dune test`. It will discover your new file and suggest edits to |
| 176 | +the generated `tests/passing/dune.inc` file to run your new tests. Run |
| 177 | +`dune promote` to update `dune.inc`. This will *not* accept your new tests -- it |
| 178 | +just allows you to run your new tests. |
| 179 | + |
| 180 | +Then, run `dune test` again to actually run your tests. You will see any changes |
| 181 | +necessary to make the tests pass. You can run `dune promote` to accept those |
| 182 | +changes. |
| 183 | + |
| 184 | +If you get some cryptic error output with a few lines of the `dune.inc` file, it |
| 185 | +is likely that ocamlformat has crashed (e.g. with a parser error) while looking |
| 186 | +at your test. The cryptic error output will mention the name of the test. Run |
| 187 | +ocamlformat manually on your test file to see the actual error. |
| 188 | + |
| 189 | +Validity checking |
| 190 | +----------------- |
| 191 | + |
| 192 | +The ocamlformat repo has (at least) two validity checks for repo health: |
| 193 | + |
| 194 | +* The ocamlformat sources themselves must be formatted. You can run this check |
| 195 | +with `make fmt` (which will also auto-format `dune-project`). To reformat files |
| 196 | +that are incorrect, run `dune build @fmt --auto-promote`. Running `make test` |
| 197 | +runs `make fmt` and `dune runtest` together. The CI will check both the |
| 198 | +formatting check and the ocamlformat tests (but will not update `dune-project`). |
| 199 | + |
| 200 | +* All commits must be signed off. This is easy. When you're done with your |
| 201 | +sequence of commits and it's all ready to merge, just run |
| 202 | +`git rebase <starting commit> --signoff`, where `<starting commit>` is the |
| 203 | +commit before any of your edits. You can often say something like `origin/jane` |
| 204 | +or `HEAD~4` or similar. |
0 commit comments