Owned refs
This is a proposal to introduce a distinction between `ref` and `owned ref` in order to control aliasing and make all of Nim play nice with deterministic destruction.
The proposal is essentially identical to what has been explored in the Ownership You Can Count On paper.
Owned pointers cannot be duplicated; they can only be moved, so they are very much like C++'s `unique_ptr`. When an owned pointer disappears, the memory it refers to is deallocated. Unowned refs are reference counted. When the owned ref disappears it is checked that no dangling `ref` exists; the reference count must be zero. The reference counting can be enabled with a new runtime switch `--checkRefs:on|off`.
Nim's `new` returns an owned ref. You can pass an owned ref to either an owned ref or to an unowned ref. `owned ref` models the spanning tree of your graph structures and is a useful tool that also helps the readability of Nim code. The creation of cycles is mostly prevented at compile-time.
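The two ways of passing an owned ref can be sketched as follows. This is only an illustration under assumptions: `peek` and `consume` are hypothetical names, and the explicit `move` is written out to make the ownership transfer visible (the proposal itself would presumably move implicitly). On a compiler where `owned` is ignored, the code should still compile, but no dangling-ref check takes place.

```nim
type
  Node = ref object
    data: int

proc peek(n: Node): int =
  ## Receives an unowned ref; ownership stays with the caller.
  n.data

proc consume(n: owned Node): int =
  ## Receives an owned ref; under the proposal the node would be
  ## freed once `n` goes out of scope.
  n.data

var x = Node(data: 3)        # construction yields an owned ref
let viewed = peek(x)         # owned ref passed as unowned ref: x keeps ownership
let taken = consume(move x)  # owned ref passed as owned ref: x is moved from
```

After the `move`, `x` must not be used to reach the old node any more.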
Some examples:
type
  Node = ref object
    data: int

var x = Node(data: 3) # inferred to be an ``owned ref``
let dangling: Node = x # unowned ref
assert dangling.data == 3
x = Node(data: 4) # destroys x! But x has dangling refs --> abort.
We need to fix this by setting `dangling` to `nil`:
type
  Node = ref object
    data: int

var x = Node(data: 3) # inferred to be an ``owned ref``
var dangling: Node = x # unowned ref; ``var`` so that it can be reset below
assert dangling.data == 3
dangling = nil
# reassignment causes the memory of what ``x`` points to to be freed:
x = Node(data: 4)
# accessing 'dangling' here is invalid as it is nil.
# at scope exit the memory of what ``x`` points to is freed
The explicit assignment of `dangling = nil` is only required if unowned refs outlive the `owned ref` they point to. How often this comes up in practice remains to be seen.
Detecting dangling refs at runtime is worse than detecting them at compile-time, but it also allows a different development pace: we start with a very expressive, hopefully not overly annoying solution, and then we can check a large subset of problems statically with a runtime fallback, much like every programming language in existence deals with array index checking.
This is what a doubly linked list looks like under this new model:
type
  Node*[T] = ref object
    prev*: Node[T]
    next*: owned Node[T]
    value*: T
  List*[T] = object
    tail*: Node[T]
    head*: owned Node[T]

proc append[T](list: var List[T]; elem: owned Node[T]) =
  elem.next = nil
  elem.prev = list.tail
  if list.tail != nil:
    assert(list.tail.next == nil)
    list.tail.next = elem
  list.tail = elem
  if list.head == nil: list.head = elem
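A short usage sketch for the list above, assuming the types and `append` as given. `toSeq` is a hypothetical helper that is not part of the proposal; it walks the owning `next` chain through an unowned cursor. The types are repeated (without export markers) only to keep the sketch self-contained.

```nim
type
  Node[T] = ref object
    prev: Node[T]        # unowned back pointer
    next: owned Node[T]  # each node owns its successor
    value: T
  List[T] = object
    tail: Node[T]        # unowned shortcut to the last node
    head: owned Node[T]  # the list owns its first node

proc append[T](list: var List[T]; elem: owned Node[T]) =
  # Same logic as in the listing above.
  elem.next = nil
  elem.prev = list.tail
  if list.tail != nil:
    assert(list.tail.next == nil)
    list.tail.next = elem
  list.tail = elem
  if list.head == nil: list.head = elem

proc toSeq[T](list: List[T]): seq[T] =
  ## Hypothetical helper: collects the values from head to tail.
  var it: Node[T] = list.head  # unowned cursor, owns nothing
  while it != nil:
    result.add it.value
    it = it.next

var list: List[int]
for v in [1, 2, 3]:
  list.append(Node[int](value: v))
let values = list.toSeq
```

The owned fields (`head`, `next`) form the spanning tree of the structure, while `tail`, `prev`, and the cursor `it` are unowned views into it.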
EDIT: Removed wrong `proc delete`.
Nim has closures which are basically `(functionPointer, environmentRef)` pairs. So `owned` also needs to apply to closures. This is how callbacks can be done:
type
  Label* = ref object of Widget
  Button* = ref object of Widget
    onclick*: seq[owned proc()] # when the button is deleted so are
                                # its onclick handlers.

proc clicked*(b: Button) =
  for x in b.onclick: x()

proc onclick*(b: Button; handler: owned proc()) =
  b.onclick.add handler

proc main =
  var label = newLabel() # inferred to be 'owned'
  var b = newButton() # inferred to be 'owned'
  b.onclick proc() =
    label.text = "button was clicked!"
  createUI(label, b)
`main` is transformed into something like:
proc main =
  type
    Env = ref object
      label: owned Label
      b: owned Button
  var env: owned Env
  env.label = newLabel()
  env.b = newButton()
  b.onclick proc(envParam: Env) =
    envParam.label.text = "button was clicked!"
  createUI(env.label, env.b)
This seems to work out without any problem if `envParam` is an unowned ref.
Pros and Cons
This model has significant advantages:
- We can effectively use a shared memory heap, safely. Multi threading your code is much easier.
- Deallocation is deterministic and works with custom destructors.
- We can reason about aliasing: two owned refs cannot point to the same location, and that's enforced at compile-time. We can even map `owned ref` to C's `restrict`'ed pointers.
- The runtime costs are much lower than C++'s `shared_ptr` or Swift's reference counting.
- The required runtime mechanisms map to weird, limited targets like WebAssembly or GPUs.
- Porting Nim code to take advantage of this alternative runtime amounts to adding the `owned` keyword to strategic places. The compiler's error messages will guide you.
- Since it doesn't use tracing, the runtime is independent of the involved heap sizes. Heaps of terabytes or kilobytes in size make no difference.
- Doubly linked lists, trees and most other graph structures are easily modeled and don't need a borrow checker or other parametrized type system extensions.
- Valgrind and the Clang memory sanitizers work out of the box with Nim as it doesn't use conservative stack tracing anymore.
And of course, disadvantages:
- Dangling unowned refs cause a program abort and are not detected statically.
- You need to port your code and add `owned` annotations.
- `nil` as a possible value for `ref` stays with us as it is required to disarm dangling pointers.
Immutability
This RFC is not about immutability, but once we have a clear notion of ownership in Nim, it can be added rather easily. We can add an opt-in rule like "only the owner should be allowed to mutate the object".
Possible migration period
Your code can either use a switch like `--newruntime`, in which case it needs to use `owned` annotations, or else you keep using Nim like before. The standard library needs to be patched to work in both modes. `owned` is ignored if `--newruntime` is not active. We can also offer an `--owned` switch that enables the owned checks but still uses the old runtime.