-
Notifications
You must be signed in to change notification settings - Fork 99
/
good-practices.mld
346 lines (273 loc) · 14.2 KB
/
good-practices.mld
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
{%html: <div style="display: flex; justify-content:space-between"><div>%}{{!"ast-traversal"}< Traversing the AST}{%html: </div><div>%}{{!"examples"}Examples >}{%html: </div></div>%}
{0 Good Practices}
{1:resp_loc Respecting Locations}
Correctly dealing with location is essential to correctly generate OCaml code.
They are necessary for error reporting by the compiler, but more generally for
Merlin's features to work, such as displaying occurrences and jumping to
definition. When called, the driver is called with the [-check] and
[-check-locations] flags, [ppxlib] makes it is a requirement that locations follow
some rules in order to accept the rewriting, as it will check that some
invariants are respected.
{2 The Invariants}
The invariants are as follows:
- AST nodes are requested to be well-nested WRT locations
- the locations of "sibling" AST nodes should not overlap
This is required for Merlin to behave properly.
Indeed, for almost any query directed at Merlin, it will need to inspect the
context around the user's cursor to give an answer that makes sense. And the
only input it has to do that is the cursor’s position in the buffer.
The handling of most queries starts by traversing the AST, using the
locations of nodes to select the right branch. (1) is necessary to avoid
discarding subtrees too early, (2) is used to avoid Merlin making arbitrary
choices (if you ask for the type under the cursor, and there seems to be two
things under the cursor, Merlin will need to pick one).
{2 Guidelines for Writing Well-Behaved PPXs}
It's obviously not always (indeed rarely) possible to mint new locations
when manipulating the AST.
The intended way to deal with locations is this:
- AST nodes that exist in the source should keep their original location
- new nodes should be given a "ghost" location (i.e.,
[{ some_loc with loc_ghost = true }]) to indicate that the node doesn't
exist in the sources.
In particular, {{!Ppxlib.Location.none}[Location.none]} is never meant to be
used by PPX authors, where some location is always available (for instance,
derivers and extenders at least know the locations of their relevant node).
Both the new check and Merlin will happily traverse the ghost nodes as if they
didn't exist. Note: this comes into play when deciding which nodes are
"siblings," for instance, if your AST is:
{v
A (B1(C, D),
B2(X, Y))
v}
but [B2] has a ghost location, then [B1], [X] and [Y] are considered
siblings.
Additionally, there is an attribute [\[@merlin.hide\]] that you can add on
nodes to tell Merlin (and the check) to ignore this node and all of its
children. Some helpers for this are provided in {{!Ppxlib.Merlin_helpers}[Merlin_helpers]}.
{1:handling_errors Handling Errors}
In order to give a nice user experience when using a PPX, it is necessary that
the resulting parsetree is as complete as possible. Most IDE tools, such as
Merlin, rely on the AST for their features, such as displaying type, jumping to
definition, or showing the list of errors.
In order to achieve this, errors that happen during rewriting should be handled
in a way that do not prevent a meaningful AST to be passed to Merlin.
There are mainly two ways to report errors when writing a PPX.
- By embedding special extensions nodes, called "error nodes", inside the
generated code.
- By raising a sepcific exception, letting the ppxlib driver {{!page-driver.exception_handling}handle the error}.
Let us emphasize that, while exceptions can be practical to quickly fail with an
error, the embedding mechanism has many advantages. For instance, embedding
allows to report multiple errors, and to output the part of the code that could
be generated successfully.
{2 Embedding the Errors in the AST}
It is better to always return a valid AST, as complete as possible, but with
"error extension nodes" at every place where successful code generation was
impossible. Error extension nodes are special extension nodes
[[%ocaml.error "error_message"]] that can be embedded into a valid AST and are interpreted later
as errors, e.g., by the compiler or Merlin. As all extension nodes, they can be
put {{:https://ocaml.org/manual/extensionnodes.html}at many places in the AST}
to replace structure items, expressions, or patterns, for example.
So whenever you're in doubt whether to throw an exception or if to embed the error as
an error extension node when writing a PPX rewriter,
embed the error is the way to go! And whenever you're in doubt about where
exactly to embed the error inside the AST, a good ground rule is: as deep in
the AST as possible.
For instance, suppose a rewriter is supposed to define a new record type, but
there is an error in one field’s type generation. In order to have
the most complete AST as output, the rewriter can still define the type and all
of its fields, putting an extension node in place of the type of the faulty
field:
{[
type long_record = {
field_1: int;
field_2: [%ocaml.error "field_2 could not be implemented due to foo"];
}
]}
[ppxlib] provides a function in its API to create error extension nodes:
{{!Ppxlib.Location.error_extensionf}[error_extensionf]}. This function creates
an extension node, which then must be transformed in the right kind of node
using functions such as
{{!Ppxlib.Ast_builder.Default.pexp_extension}[pexp_extension]}.
{2 A Documented Example}
Let us give an example. We will define a deriver on types records, which
constructs a default value from a given type. For instance, the derivation on
the type [type t = { x:int; y: float; z: string}] would yield [let default_t =
{x= 0; y= 0.; z= ""}]. This deriver has two limitations:
{ol
{- It does not work on other types than records,}
{- It only works for records containing fields of type [string], [int], or [float].}
}
The rewriter should warn the user about these limitations with a good error
reporting. Let’s first look at the second point. Here is the function mapping
the fields from the type definition to a default expression.
{[
let create_record ~loc fields =
let declaration_to_instantiation (ld : label_declaration) =
let loc = ld.pld_loc in
let { pld_type; pld_name; _ } = ld in
let e =
match pld_type with
| { ptyp_desc = Ptyp_constr ({ txt = Lident "string"; _ }, []); _ } ->
pexp_constant ~loc (Pconst_string ("", loc, None))
| { ptyp_desc = Ptyp_constr ({ txt = Lident "int"; _ }, []); _ } ->
pexp_constant ~loc (Pconst_integer ("0", None))
| { ptyp_desc = Ptyp_constr ({ txt = Lident "float"; _ }, []); _ } ->
pexp_constant ~loc (Pconst_float ("0.", None))
| _ ->
pexp_extension ~loc
@@ Location.error_extensionf ~loc
"Default value can only be derived for int, float, and string."
in
({ txt = Lident pld_name.txt; loc }, e)
in
let l = List.map fields ~f:declaration_to_instantiation in
pexp_record ~loc l None
]}
When the record definition contains several fields with types other than [int],
[float], or [string], several error nodes are added in the AST. Moreover, the
location of the error nodes corresponds to the field record's definition.
This allows tools such as Merlin to report all errors at once, at the right
location, resulting in a better workflow than having to recompile every time an
error is corrected to see the next one.
The first limitation is that the deriver cannot work on non-record types.
However, we decided here to derive a default value, even in the case of
non-record types, so that it does not appear as undefined in the remaining of
the file. This impossible value consists of an error extension node.
{[
let generate_impl ~ctxt (_rec_flag, type_declarations) =
let loc = Expansion_context.Deriver.derived_item_loc ctxt in
List.map type_declarations ~f:(fun (td : type_declaration) ->
let e, name =
match td with
| { ptype_kind = Ptype_record fields; ptype_name; ptype_loc; _ } ->
(create_record ~loc:ptype_loc fields, ptype_name)
| { ptype_name; ptype_loc; _ } ->
( pexp_extension ~loc
@@ Location.error_extensionf ~loc:ptype_loc
"Cannot derive accessors for non record type %s"
ptype_name.txt,
ptype_name )
in
[
pstr_value ~loc Nonrecursive
[
{
pvb_pat = ppat_var ~loc { txt = "default_" ^ name.txt; loc };
pvb_expr = e;
pvb_attributes = [];
pvb_loc = loc;
};
];
])
|> List.concat
]}
{1:quoting Quoting}
Quoting is part of producing
{{:https://en.wikipedia.org/wiki/Hygienic_macro}hygienic} code. But before
talking about the solution, let's introduce the problem.
Say you are writing an extension rewriter, which takes an expression as payload, and would replace all identifiers [id] in the expression with a similar expression, but with a printing debug:
{[
let x = 0 in
let y = 2 in
[%debug x + 1, y + 2 ]
]}
would generate the following code:
{[
let x = 0 in
let y = 2 in
let debug = Printf.printf "%s = %d; " in
(debug "x" x ; x) + 1,
(debug "y" y ; y) + 2
]}
When executed, the code would print [x = 0; y = 2; ]. So far, so good. However, suppose now that instead of [x], the variable is named [debug]. The following seemingly equivalent code:
{[
let debug = 0 in
let y = 2 in
[%debug debug + 1, y + 2 ]
]}
would generate:
{[
let debug = 0 in
let y = 2 in
let debug = Printf.printf "%s = %d; " in
(debug "debug" debug ; debug) + 1,
(debug "y" y ; y) + 2
]}
which does not even type-check! The problem is that the payload is expected to
be evaluated in some environment where [debug] has some value and type, but the
rewriting modifies this environment and shadows the [debug] name.
"Quoting" is a mechanism to prevent this problem from happenning. In [ppxlib], it
is done through the {{!Ppxlib.Expansion_helpers.Quoter}[Expansion_helpers.Quoter]} module in several steps:
- First, create a quoter using the {{!Ppxlib.Expansion_helpers.Quoter.create}[create]} function:
{[
# open Expansion_helper ;;
#s let quoter = Quoter.create () ;;
val quoter : Quoter.t = <abstr>
]}
- Then, use {{!Ppxlib.Expansion_helpers.Quoter.quote}[Expansion_helpers.Quoter.quote]} to quote all the expressions that are given from the user, might rely on a context, and that you want "intact."
{[
# let quoted_part = Quoter.quote quoter part_to_quote ;;
val quoted_payload : expression =
]}
- Finally, call {{!Ppxlib.Expansion_helpers.Quoter.sanitize}[Expansion_helpers.Quoter.sanitize]} on the whole expression (with quoted parts).
{[
# let result = Expansion_helpers.Quoter.sanitize ~quoter rewritten_expression ;;
val result : expression =
...
]}
If the [debug] rewriter had been written using this method, the quoting would
have ensured that the payload is evaluated in the same context as the
extension node!
Here is an example on how to write a [debug] rewriter (with the limitation that the payload should not contain variable binding, but the code was left simple to illustrate quoting):
{[
# let rewrite expr =
(* Create a quoter *)
let quoter = Quoter.create () in
(* An AST mapper to log and replace variables with quoted ones *)
let replace_var =
object
(* See the chapter on AST traverse *)
inherit Ast_traverse.map as super
(* in case of expression *)
method! expression expr =
match expr.pexp_desc with
(* in case of identifier (not "+") *)
| Pexp_ident { txt = Lident var_name; loc }
when not (String.equal "+" var_name) ->
(* quote the var *)
let quoted_var = Quoter.quote quoter expr in
let name = Ast_builder.Default.estring ~loc var_name in
(* and rewrite the expression *)
[%expr
debug [%e name] [%e quoted_var];
[%e quoted_var]]
(* otherwise, continue inside recursively *)
| _ -> super#expression expr
end
in
let quoted_rewrite = replace_var#expression expr in
let loc = expr.pexp_loc in
(* Sanitize the whole thing *)
Quoter.sanitize quoter
[%expr
let debug = Printf.printf "%s = %d; " in
[%e quoted_rewrite]] ;;
val rewrite : expression -> expression = <fun>
]}
With {!Ppxlib}'s current quoting mechanism, the code given in that example would look like:
{[
# Format.printf "%a\n" Pprintast.expression @@ rewrite [%expr debug + 1, y + 2] ;;
let rec __1 = y
and __0 = debug in
let debug = Printf.printf "%s = %d; " in
(((debug "debug" __0; __0) + 1), ((debug "y" __1; __1) + 2))
- : unit = ()
]}
{1 Testing Your PPX}
This section is not yet written. You can refer to {{:https://tarides.com/blog/2019-05-09-an-introduction-to-ocaml-ppx-ecosystem#testing-your-ppx}this blog post} (notice that that blog post was written before `dune` introduced its cram test feature), or contribute to the [ppxlib] documentation by opening a pull request in the {{:https://github.com/ocaml-ppx/ppxlib/}repository}.
{1 Migrate From Other Preprocessing Systems}
This section is not yet written. You can contribute to the [ppxlib] documentation by opening a pull request in the {{:https://github.com/ocaml-ppx/ppxlib/}repository}.
{1 Other good practices}
There are many good practices or other way to use [ppxlib] that are not mentioned in this manual. For instance, (in very short), you should always try to fully qualify variable names that are generated into the code via a PPX.
if you want to add a section to this "good practices" manual, you can contribute to the [ppxlib] documentation by opening a pull request in the {{:https://github.com/ocaml-ppx/ppxlib/}repository}.
{%html: <div style="display: flex; justify-content:space-between"><div>%}{{!"ast-traversal"}< Traversing the AST}{%html: </div><div>%}{{!"examples"}Examples >}{%html: </div></div>%}