Description
TL;DR:
- "Is reachable" and "is live" are the wrong questions to ask during tracing. What ObjectReference::is_reachable() really does is query whether an object has been reached during tracing. We need better questions and better function names.
- In StickyImmix, object.is_reachable() returns true if object is mature and we haven't reached it. This is wrong and should be fixed.
Problem
As I was writing the finalization and weak references chapter of the porting guide, I realized that the word "reachable" may not be meaningful during tracing.
We say an object is "reachable" if there is a path from roots to that object. We can even define "strongly/softly/weakly/phantomly reachable" if there are soft/weak/phantom references. But all those definitions of reachability are about the state of an object in an object graph. If an object is strongly reachable, it is always strongly reachable before, during or after a GC. The reachability only changes if the mutator mutates the object graph.
But during tracing, what we really care about is whether an object has been reached. MMTk-core processes weak references and finalizers in two ways:
- Using the old API, we run the strong Closure, then SoftRefClosure, then WeakRefClosure, then FinalRefClosure, then PhantomRefClosure.
- Using the new API, we run the strong Closure, then we run Scanning::process_weak_ref multiple times, each processing a different weak reference strength.
Either way, after a previous closure is finished, we ask "Has this object already been reached in previous closures?", and we ask this by calling ObjectReference::is_reachable(). Note that this question is different from "Is this object softly/weakly/phantomly reachable?".
- During tracing, the state of object.is_reachable() changes as we gradually trace through one reference after another, but
- whether an object is softly/weakly/phantomly reachable only depends on the shape of the object graph, and MMTk only knows this after each level of transitive closure is complete.
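The difference between the two questions can be sketched with a toy object graph. This is only an illustration: Graph, Tracer, and the free function is_reachable below are hypothetical names invented for this sketch, not MMTk types, and objects are plain indices.

```rust
use std::collections::HashSet;

/// A toy object graph: `edges[i]` lists the strong references held by object `i`.
/// Hypothetical type for illustration; not part of MMTk.
pub struct Graph {
    pub edges: Vec<Vec<usize>>,
}

/// Graph property: is there a path of strong references from `roots` to `target`?
/// The answer depends only on the shape of the graph, not on GC progress.
pub fn is_reachable(graph: &Graph, roots: &[usize], target: usize) -> bool {
    let mut seen = HashSet::new();
    let mut stack: Vec<usize> = roots.to_vec();
    while let Some(obj) = stack.pop() {
        if seen.insert(obj) {
            stack.extend(graph.edges[obj].iter().copied());
        }
    }
    seen.contains(&target)
}

/// Tracing state: the set of objects the collector has visited so far.
pub struct Tracer {
    reached: HashSet<usize>,
    worklist: Vec<usize>,
}

impl Tracer {
    pub fn new(roots: &[usize]) -> Self {
        Tracer { reached: HashSet::new(), worklist: roots.to_vec() }
    }

    /// Process one object from the worklist; returns false once the worklist is empty.
    pub fn step(&mut self, graph: &Graph) -> bool {
        match self.worklist.pop() {
            Some(obj) => {
                if self.reached.insert(obj) {
                    self.worklist.extend(graph.edges[obj].iter().copied());
                }
                true
            }
            None => false,
        }
    }

    /// Tracing property: has this object been visited yet in this GC?
    pub fn is_reached(&self, obj: usize) -> bool {
        self.reached.contains(&obj)
    }
}
```

In this model, is_reachable gives the same answer no matter when it is called, while Tracer::is_reached flips from false to true for a given object as step is called repeatedly.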
The current method ObjectReference::is_reachable() actually returns whether an object is reached. It answers the question we want to ask (i.e. whether the object is reached), but the method name is_reachable() is wrong.
Similarly, ObjectReference::is_live() is also wrong. The life and death of objects is not determined until the end of all tracing stages.
New names
ObjectReference::is_reachable() should be renamed ObjectReference::is_reached().
- Scope: It can be called during tracing, i.e. from the TPinningClosure to the VMRefClosure work buckets.
- Semantics: It returns true if an object has been traced (i.e. reached by trace_object) in this GC since tracing started.
ObjectReference::is_live() should be renamed ObjectReference::is_retained().
- Scope: Same as is_reached.
- Semantics: It returns true if either is_reached() returns true, or the GC will retain this object for other reasons, for example:
  - (Struck out: "The object is in the immortal space." No. It is not safe to keep a weak reference to an un-reached object even if it is in the immortal space. See below.)
  - The object is in the mature space during a nursery GC.
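The proposed semantics of the two predicates can be written down as a small model. This is a sketch of the definitions above under stated assumptions, not the MMTk implementation: Space, GcKind, and ObjectState are hypothetical names, and the traced_this_gc flag stands in for whatever per-object metadata a real policy would consult.

```rust
/// Hypothetical model of where an object lives; not MMTk's space types.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Space {
    Nursery,
    Mature,
    Immortal,
}

/// Hypothetical GC kind for a generational plan.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum GcKind {
    Nursery,
    Full,
}

pub struct ObjectState {
    pub space: Space,
    /// Assumed to be set when trace_object visits the object in the current GC.
    pub traced_this_gc: bool,
}

/// True only if the object has been traced in the current GC.
pub fn is_reached(obj: &ObjectState) -> bool {
    obj.traced_this_gc
}

/// True if the object has been reached, or if it is a mature object during a
/// nursery GC. Being in the immortal space is deliberately not a reason.
pub fn is_retained(obj: &ObjectState, gc: GcKind) -> bool {
    is_reached(obj) || (gc == GcKind::Nursery && obj.space == Space::Mature)
}
```

Under this model an untraced mature object is retained in a nursery GC but not in a full GC, and an untraced immortal object is never reported as retained.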
Which to call when handling weak references, is_reached or is_retained?
It should be is_retained(). Suppose there is an old object o1, and a young WeakReference named y1 which weakly references o1. During a nursery GC, we will not trace o1 because it is in the mature space. This means o1 will not be reached during this nursery GC. However, because nursery GCs never reclaim old objects (i.e. they assume old objects are all alive), the weak reference y1 must not be cleared. In fact, a path from root to o1 may exist, going through multiple mature objects before reaching o1. So o1 can be alive according to the object graph even though we don't trace it.
By contrast, is_reached() should return false if we didn't trace the old object.
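The o1/y1 scenario can be replayed in a toy model to see why the choice of predicate matters. Everything here is hypothetical and simplified: Obj, WeakRef, and process_weak_ref are illustration-only names (unrelated to Scanning::process_weak_ref), objects are indices into a slice, and the GC is assumed to be a nursery GC.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Age {
    Young,
    Old,
}

/// Hypothetical per-object state during a nursery GC.
pub struct Obj {
    pub age: Age,
    /// Whether the object was traced in this nursery GC.
    pub traced: bool,
}

/// Hypothetical weak reference; `None` means it has been cleared.
pub struct WeakRef {
    pub referent: Option<usize>,
}

/// Reached: the object was traced in this GC.
pub fn is_reached(heap: &[Obj], id: usize) -> bool {
    heap[id].traced
}

/// Retained (nursery GC only): reached, or old and therefore never reclaimed.
pub fn is_retained(heap: &[Obj], id: usize) -> bool {
    heap[id].traced || heap[id].age == Age::Old
}

/// Clear the weak reference unless `keep` says the referent survives.
pub fn process_weak_ref(heap: &[Obj], weak: &mut WeakRef, keep: fn(&[Obj], usize) -> bool) {
    if let Some(id) = weak.referent {
        if !keep(heap, id) {
            weak.referent = None;
        }
    }
}
```

With o1 modelled as an old, untraced object, passing is_reached as keep clears y1 even though o1 survives the nursery GC, while passing is_retained leaves y1 intact.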
However, the current implementation of ObjectReference::is_reachable() simply calls SFT::is_reachable(), which in turn calls SFT::is_live(). For StickyImmix, SFT::is_live() checks the mark bit, which is always set for mature objects (hence the name "sticky"). So object.is_reachable() will still return true if object is a mature object, which is technically wrong! We should make the semantics clear and advise against using is_reachable (or the renamed is_reached) for weak reference processing.
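A minimal model of the delegation chain described above shows the effect of a sticky mark bit. This is a sketch, not the StickyImmix code: StickyObject and nursery_gc are hypothetical, and a single bool stands in for the mark bit.

```rust
/// Hypothetical object with a single sticky mark bit.
pub struct StickyObject {
    pub mark_bit: bool,
}

impl StickyObject {
    /// Models SFT::is_live() for a sticky-mark-bit policy: just read the mark bit.
    pub fn is_live(&self) -> bool {
        self.mark_bit
    }

    /// Models is_reachable() delegating straight to is_live().
    pub fn is_reachable(&self) -> bool {
        self.is_live()
    }
}

/// A toy nursery GC: marks the young objects that tracing visits.
/// Mark bits of old objects are left set from earlier GCs (they are "sticky").
pub fn nursery_gc(heap: &mut [StickyObject], traced_young: &[usize]) {
    for &id in traced_young {
        heap[id].mark_bit = true;
    }
}
```

After a nursery GC in this model, an old object that was never traced and a young object that was just marked both report is_reachable() == true, so the mark bit alone cannot tell them apart.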
How do we implement is_reached for StickyImmix?
With sticky mark bits, StickyImmix cannot distinguish between (1) old objects retained in previous nursery GCs and (2) young objects just marked during the current GC. If we have to distinguish between them, we may need a different metadata bit. But that seems unnecessary.
If it is unnecessary to ask "whether an object is reached during the current GC", we may delete the is_reachable() method, and keep is_live() (and rename it to is_retained()).
Are objects in the immortal space always live (or retained)?
Not really. Although we never reclaim any space from the ImmortalSpace, we can't assume all objects in the ImmortalSpace are live.
Suppose the plan is SemiSpace and each GC moves every single object in the two copy spaces. If an object o1 in the ImmortalSpace has a reference to an object o2 in the from-space, then the field needs to be traced and forwarded so that it will continue to reference the same object in the to-space. This will only happen if o1 is reachable from root.
On the contrary, if o1 is not reachable from root, it will not be traced, and it will not be scanned, and its fields will not be forwarded. Then o1 will contain dangling pointers. If we keep a weak reference to o1, mutators may convert it to a strong reference and see dangling pointers. This will crash!
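The dangling-pointer hazard can be reproduced in a toy copying collector. This is a deliberately simplified sketch: Addr and Heap are hypothetical, each immortal object has exactly one reference field, and the immortal objects passed to collect are assumed to be the ones reached from roots.

```rust
use std::collections::HashMap;

/// Hypothetical address: a slot in the from-space or the to-space.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Addr {
    FromSpace(usize),
    ToSpace(usize),
}

pub struct Heap {
    /// One reference field per immortal object, indexed by immortal object id.
    pub immortal_fields: Vec<Option<Addr>>,
    /// Forwarding table: from-space slot -> to-space slot.
    forwarding: HashMap<usize, usize>,
    next_to_slot: usize,
}

impl Heap {
    pub fn new(immortal_fields: Vec<Option<Addr>>) -> Self {
        Heap { immortal_fields, forwarding: HashMap::new(), next_to_slot: 0 }
    }

    /// Copy a from-space object (at most once) and return its to-space address.
    fn forward(&mut self, from_slot: usize) -> Addr {
        let to = match self.forwarding.get(&from_slot) {
            Some(&to) => to,
            None => {
                let to = self.next_to_slot;
                self.next_to_slot += 1;
                self.forwarding.insert(from_slot, to);
                to
            }
        };
        Addr::ToSpace(to)
    }

    /// Scan only the immortal objects that were reached, forwarding their fields.
    /// The from-space is considered released once this returns.
    pub fn collect(&mut self, reached_immortals: &[usize]) {
        for &i in reached_immortals {
            if let Some(Addr::FromSpace(slot)) = self.immortal_fields[i] {
                let new_addr = self.forward(slot);
                self.immortal_fields[i] = Some(new_addr);
            }
        }
    }

    /// After a GC, a field still pointing into the from-space is dangling.
    pub fn has_dangling_field(&self, i: usize) -> bool {
        matches!(self.immortal_fields[i], Some(Addr::FromSpace(_)))
    }
}
```

If two immortal objects both point at the same from-space object and only one of them is reached, the reached one ends up with a to-space address while the other keeps a stale from-space address, which is exactly the state a weak reference to it would expose.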
So the current is_live() for ImmortalSpace that always returns true is not suitable for weak reference processing. The current implementation of is_reachable() for ImmortalSpace, on the other hand, returns true only if the object is reached. That's correct w.r.t. weak reference processing.
But then is_retained() will not be a perfect name because we never reclaim any objects in the immortal space. In this sense, all objects in the ImmortalSpace are "retained" (although not all objects in the ImmortalSpace are in good shape).