Showing 3 changed files with 874 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,155 @@ | ||
# Introduction to Local Allocations | ||
|
||
|
||
Instead of allocating values normally on the GC heap, local | ||
allocations allow you to stack-allocate values using the new `local_` | ||
keyword: | ||
|
||
let local_ x = { foo; bar } in | ||
... | ||
|
||
or equivalently, by putting the keyword on the expression itself: | ||
|
||
let x = local_ { foo; bar } in | ||
... | ||
|
||
To enable this feature, you need to pass the `-extension local` flag | ||
to the compiler. Without this flag, `local_` is not recognized as a | ||
keyword, and no local allocations will be performed. | ||
|
||
These values live on a separate stack, and are popped off at the end | ||
of the _region_. Generally, the region ends when the surrounding | ||
function returns; see [the reference](local-reference.md) for more | ||
details. | ||
|
||
This helps performance in a couple of ways: first, the same few hot | ||
cachelines are constantly reused, so the cache footprint is lower than | ||
usual. More importantly, local allocations will never trigger a GC, | ||
and so they're safe to use in low-latency code that must currently be | ||
zero-alloc. | ||
|
||
However, for this to be safe, local allocations must genuinely be | ||
local. Since the memory they occupy is reused quickly, we must ensure | ||
that no dangling references to them escape. This is checked by the | ||
typechecker, and you'll see new error messages if local values leak: | ||
|
||
# let local_ thing = { foo; bar } in | ||
some_global := thing;; | ||
^^^^^ | ||
Error: This value escapes its region | ||
|
||
|
||
Most of the types of allocation that OCaml does can be locally | ||
allocated: tuples, records, variants, closures, boxed numbers, | ||
etc. Local allocations are also possible from C stubs, although this | ||
requires code changes to use the new `caml_alloc_local` instead of | ||
`caml_alloc`. A few types of allocation cannot be locally allocated, | ||
though, including first-class modules, classes and objects, and | ||
exceptions. The contents of mutable fields (inside `ref`s, `array`s | ||
and mutable record fields) also cannot be locally allocated. | ||
|
||
|
||
## Local parameters | ||
|
||
Generally, OCaml functions can do whatever they like with their | ||
arguments: use them, return them, capture them in closures or store | ||
them in globals, etc. This is a problem when trying to pass around | ||
locally-allocated values, since we need to guarantee they do not | ||
escape. | ||
|
||
The remedy is that we allow the `local_` keyword to also appear on function parameters: | ||
|
||
let f (local_ x) = ... | ||
|
||
A local parameter is a promise by a function not to let a particular | ||
argument escape its region. In the body of `f`, you'll get a type error | ||
if `x` escapes, but when calling `f` you can freely pass local values as | ||
the argument. This promise is visible in the type of `f`: | ||
|
||
val f : local_ 'a -> ... | ||
|
||
The function `f` may equally be called with locally-allocated or | ||
GC-heap values: the `local_` annotation places obligations only on the | ||
definition of `f`, not its uses. | ||
|
||
Even if you're not interested in performance benefits, local | ||
parameters are a useful new tool for structuring APIs. For instance, | ||
consider a function that accepts a callback, to which it passes some | ||
mutable value: | ||
|
||
let uses_callback ~f = | ||
let tbl = Foo.Table.create () in | ||
fill_table tbl; | ||
let result = f tbl in | ||
add_table_to_global_registry tbl; | ||
result | ||
|
||
Part of the contract of `uses_callback` is that it expects `f` not to | ||
capture its argument: unexpected results could ensue if `f` stored a | ||
reference to this table somewhere, and it was later used and modified | ||
after it was added to the global registry. Using `local_` | ||
annotations allows this constraint to be made explicit and checked at | ||
compile time, by giving `uses_callback` the signature: | ||
|
||
val uses_callback : f:(local_ int Foo.Table.t -> 'a) -> 'a | ||
|
||
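
As a minimal sketch of the hazard this signature rules out, the following plain OCaml (no `local_`, so it compiles without the extension) lets a callback leak the table. The `Hashtbl`-based table, `registry`, and `leaked` are hypothetical stand-ins for `Foo.Table` and the global registry in the text:

```ocaml
(* Stand-in for the global registry mentioned in the text (hypothetical). *)
let registry : (string, int) Hashtbl.t list ref = ref []

(* Same shape as [uses_callback] above, using Hashtbl in place of Foo.Table. *)
let uses_callback ~f =
  let tbl = Hashtbl.create 8 in
  Hashtbl.replace tbl "answer" 1;   (* plays the role of fill_table *)
  let result = f tbl in
  registry := tbl :: !registry;     (* plays the role of add_table_to_global_registry *)
  result

(* A badly behaved callback stores its argument here. *)
let leaked : (string, int) Hashtbl.t option ref = ref None

let () =
  uses_callback ~f:(fun tbl -> leaked := Some tbl);
  (* Mutating through the leaked reference also changes the registered table,
     because both names refer to the same hash table. *)
  match !leaked with
  | Some tbl -> Hashtbl.replace tbl "answer" 99
  | None -> ()
```

With the `local_` annotation on the callback's parameter, the store into `leaked` is the kind of escape the typechecker is described as rejecting, so the surprise above would surface at compile time instead.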
|
||
## Inference | ||
|
||
The examples above use the `local_` keyword to mark local | ||
allocations. In fact, this is not necessary, and the compiler will | ||
use local allocations by default where possible, as long as the | ||
`-extension local` flag is enabled. | ||
|
||
The only effect of the keyword on e.g. a let binding is to change the | ||
behavior for escaping values: if the bound value looks like it escapes | ||
and therefore cannot be locally allocated, then without the keyword | ||
the compiler will allocate this value on the GC heap as usual, while | ||
with the keyword it will instead report an error. | ||
|
||
Inference can even determine whether parameters are local, which is | ||
useful for helper functions. It's less useful for toplevel functions, | ||
though, as whether their parameters are local is generally forced by | ||
their signature in the mli file, where no inference is performed. | ||
|
||
Inference does not work across files: if you want e.g. to pass a local | ||
argument to a function in another module, you'll need to explicitly | ||
mark the local parameter in the other module's mli. | ||
|
||
|
||
|
||
|
||
## More control | ||
|
||
There are a number of other features that allow more precise control | ||
over which values are locally allocated, including: | ||
|
||
- **Local closures**: | ||
|
||
``` | ||
let local_ f a b c = ... | ||
``` | ||
defines a function `f` whose closure is itself locally allocated. | ||
- **Local-returning functions** | ||
``` | ||
let f a b c = local_ | ||
... | ||
``` | ||
defines a function `f` which returns local allocations into its | ||
caller's region. | ||
- **Global fields** | ||
``` | ||
type 'a t = { global_ g : 'a } | ||
``` | ||
defines a record type `t` whose `g` field is always known to be on | ||
the GC heap (and may therefore freely escape regions), even though | ||
the record itself may be locally allocated. | ||
For more details, read [the reference](./local-reference.md). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
# Some Pitfalls of Local Allocations | ||
|
||
This document outlines some common pitfalls that may come up when | ||
trying out local allocations in a new codebase, as well as some | ||
suggested workarounds. Over time, this list may grow (as experience | ||
discovers new things that go wrong) or shrink (as we deploy new | ||
compiler versions that ameliorate some issues). | ||
|
||
|
||
## Tail calls | ||
|
||
Many OCaml functions just happen to end in a tail call, even those | ||
that are not intentionally tail-recursive. To preserve the | ||
constant-space property of tail calls, the compiler applies special | ||
rules around local allocations in tail calls (see [the | ||
reference](./local-reference.md)). | ||
|
||
If this causes a problem for calls that just happen to be in tail | ||
position, the easiest workaround is to prevent them from being | ||
treated as tail calls by moving them, replacing: | ||
|
||
func arg1 arg2 | ||
|
||
with | ||
|
||
let res = func arg1 arg2 in res | ||
|
||
With this version, local values used in `func arg1 arg2` will be freed | ||
after `func` returns. | ||
|
||
## Partial applications with local parameters | ||
|
||
To use local allocations with higher-order functions, you need to add | ||
`local_` annotations to the function types of their parameters. For | ||
instance, an `iter` function may become: | ||
|
||
val iter : 'a list -> f:local_ ('a -> unit) -> unit | ||
|
||
thus allowing locally-allocated closures `f` to be used. | ||
|
||
However, this is unfortunately not an entirely backwards-compatible | ||
change. The problem is that partial applications of `iter` functions | ||
with the new type are themselves locally allocated, because they close | ||
over the possibly-local `f`. This means in particular that partial | ||
applications will no longer be accepted as module-level definitions: | ||
|
||
let print_each_foo = iter ~f:(print_foo) | ||
|
||
The fix in these cases is to expand the partial application to a full | ||
application by introducing extra arguments: | ||
|
||
let print_each_foo x = iter ~f:(print_foo) x | ||
|
||
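
The rewrite can be sketched in plain OCaml. This is an illustration only: the `local_` annotation on `f` is left out so the code compiles with a standard compiler, which means both definitions are accepted here, and `foo`, `print_foo`, `log`, and this `iter` are hypothetical stand-ins for the names in the text:

```ocaml
(* Hypothetical element type and printer; output goes to a buffer so it can be inspected. *)
type foo = { name : string }

let log = Buffer.create 16

let print_foo (x : foo) : unit =
  Buffer.add_string log x.name;
  Buffer.add_char log '\n'

(* Stand-in for the [iter] above, without the [local_] annotation on [f]. *)
let iter (xs : 'a list) ~(f : 'a -> unit) : unit = List.iter f xs

(* Partial application: the result is a closure that captures [f].
   Per the text, once [f] is marked [local_] this form is no longer
   accepted as a module-level definition. *)
let print_each_foo_partial = iter ~f:print_foo

(* Full application: the extra argument [xs] means no closure over [f]
   is stored at module level. *)
let print_each_foo xs = iter ~f:print_foo xs

let () =
  print_each_foo_partial [ { name = "a" } ];
  print_each_foo [ { name = "b" } ]
```

Both definitions behave the same when called; the difference lies only in whether a closure over `f` is built at module initialisation time.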
## Typing of (@@) and (|>) | ||
|
||
The typechecking of (@@) and (|>) changed slightly with the local | ||
allocations typechecker, in order to allow them to work with both | ||
local and nonlocal arguments. The major difference is that: | ||
|
||
f x @@ y | ||
y |> f x | ||
f x y | ||
|
||
are now all typechecked in exactly the same way. Previously, the | ||
first two were typechecked differently, as an application of an | ||
operator to the expressions `f x` and `y`, rather than a single | ||
application with two arguments. | ||
|
||
This affects which expressions are in "argument position", which can | ||
have a subtle effect on when optional arguments are given their | ||
default values. If this affects you (which is extremely rare), you | ||
will see type errors involving optional parameters, and you can | ||
restore the old behaviour by removing the use of `(@@)` or `(|>)` and | ||
parenthesizing their subexpressions. That is, the old typing behaviour | ||
of `f x @@ y` is available as: | ||
|
||
(f x) y |