Description
As an example, consider Clone for Rc<T>. We can elide the refcount bump if the region checker can prove that the new Rc<T> won't outlive the old one. This is somewhat like passing &Rc<T> instead of Rc<T>.
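To make the comparison with &Rc<T> concrete, here is a minimal sketch in current Rust (not the 2014 dialect used elsewhere in this post). The function names are hypothetical and exist only for illustration. It shows the refcount traffic that passing by value incurs today and that passing by reference avoids, which is the cost the proposed elision would remove automatically.

```rust
use std::rc::Rc;

// Takes ownership of a handle, so the caller has to clone and bump the count.
fn strong_count_owned(x: Rc<u32>) -> usize {
    Rc::strong_count(&x)
}

// Borrows the handle: no clone is needed, so the count is never touched.
fn strong_count_borrowed(x: &Rc<u32>) -> usize {
    Rc::strong_count(x)
}

// Returns the strong count observed inside each callee.
fn compare_counts() -> (usize, usize) {
    let x = Rc::new(0u32);
    // The temporary clone is dropped as soon as the call returns.
    let owned = strong_count_owned(x.clone());
    let borrowed = strong_count_borrowed(&x);
    (owned, borrowed)
}
```

The owned call observes a count of 2 while the borrowed call observes 1, even though in both cases the callee's handle provably dies before x does.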
LLVM can probably catch some of these cases, but we can surely do more if Rc<T> opts in to different semantics. I can imagine writing code like
    impl<T> Clone for Rc<T> {
        #[inline]
        fn clone<'a>(&'a self) -> Rc<T> {
            let new = Rc { _ptr: self._ptr, ... };
            // Only pay for the refcount bump when `new` might outlive `self`.
            if can_outlive!(new, 'a) {
                self.inc_strong();
            }
            new
        }
    }
Here can_outlive!() is a deeply magical builtin, on the order of GCC's __builtin_types_compatible_p or __builtin_constant_p. Then suppose I do something like
    fn f(x: Rc<uint>) { ... }

    fn g() {
        let x = Rc::new(0u);
        f(x.clone());
    }
and the clone call gets inlined. The region checker will notice that the argument to f can't outlive x, and will arrange for that can_outlive!(new, 'a) to act like a constant false. (When it can't prove this it becomes true, naturally.)
We'd also need to change Drop for Rc<T>, of course. You can't always statically pair up the clone and drop calls, so I imagine stealing a pointer tag bit to indicate to drop whether it needs to dec_strong. This makes it a dubious win for Rc<T>, but it could help a lot with Arc<T> where you're avoiding atomic memory operations.
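The tag-bit idea can be modelled in current Rust roughly as follows. This is only a sketch under several assumptions: Handle, ELIDED, counted, elided and demo_tag_bit are hypothetical names, the count lives in a borrowed Cell rather than in a heap allocation owned by the handles, and the decision to elide is made by hand where the proposal would have the region checker make it.

```rust
use std::cell::Cell;
use std::marker::PhantomData;

// Low pointer bit, free because Cell<usize> is aligned to more than one byte.
const ELIDED: usize = 0b1;

// Hypothetical stand-in for Rc<T>: a tagged pointer to a shared strong count.
struct Handle<'a> {
    bits: usize,
    _count: PhantomData<&'a Cell<usize>>,
}

impl<'a> Handle<'a> {
    // An ordinary handle: bumps the count, tag bit clear.
    fn counted(count: &'a Cell<usize>) -> Self {
        count.set(count.get() + 1);
        Handle { bits: count as *const Cell<usize> as usize, _count: PhantomData }
    }

    // A clone whose refcount bump was elided: same pointer, tag bit set.
    fn elided(&self) -> Handle<'a> {
        Handle { bits: self.bits | ELIDED, _count: PhantomData }
    }

    fn count(&self) -> &'a Cell<usize> {
        let ptr = (self.bits & !ELIDED) as *const Cell<usize>;
        // SAFETY: the untagged bits came from a `&'a Cell<usize>`, and the
        // lifetime parameter keeps that borrow alive as long as this handle.
        unsafe { &*ptr }
    }
}

impl Drop for Handle<'_> {
    fn drop(&mut self) {
        // Only handles that bumped the count on creation decrement it here.
        if self.bits & ELIDED == 0 {
            let count = self.count();
            count.set(count.get() - 1);
        }
    }
}

// Returns the count while both handles live, after the elided one is
// dropped, and after the counted one is dropped.
fn demo_tag_bit() -> (usize, usize, usize) {
    let strong = Cell::new(0);
    let a = Handle::counted(&strong);
    let b = a.elided();
    let both_alive = strong.get();
    drop(b);
    let after_elided_drop = strong.get();
    drop(a);
    (both_alive, after_elided_drop, strong.get())
}
```

In this model drop never has to be paired with a particular clone at compile time, since each handle carries its own answer. The cost is a mask on every dereference and a branch in every drop, which is why the trade looks better when the avoided operation is atomic.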
I expect this to be particularly useful in generic code. For example, the html5ever tree builder has a type parameter for "handle to node", and clones these handles all over the place. When instantiating the tree builder for a refcounted DOM, many of those clones could be elided.
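The generic-code situation looks roughly like this in current Rust. Builder, push, with_current and demo_builder are hypothetical names invented for illustration and are not html5ever's actual API. The sketch only shows the shape of the problem: code written against H: Clone has to clone, even when the clone is short-lived.

```rust
use std::rc::Rc;

// Hypothetical stand-in for a tree builder generic over its node handle type.
struct Builder<H> {
    open: Vec<H>,
}

impl<H: Clone> Builder<H> {
    fn new() -> Self {
        Builder { open: Vec::new() }
    }

    // This clone is stored, so it can outlive `node` and must be counted.
    fn push(&mut self, node: &H) {
        self.open.push(node.clone());
    }

    // This clone dies before the method returns while the stored handle
    // lives on, so it is the kind of clone that could be elided.
    fn with_current<R>(&self, f: impl FnOnce(H) -> R) -> Option<R> {
        self.open.last().map(|h| f(h.clone()))
    }
}

// Returns the strong count seen inside the callback and the count afterwards.
fn demo_builder() -> (usize, usize) {
    let node = Rc::new("html");
    let mut builder = Builder::new();
    builder.push(&node);
    let seen = builder
        .with_current(|h| Rc::strong_count(&h))
        .unwrap();
    (seen, Rc::strong_count(&node))
}
```

With H = Rc<&str> the callback sees a count of 3 that falls back to 2 right afterwards. That temporary bump is pure overhead, and the generic code has no way to avoid it without knowing that H is refcounted.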