Description
TL;DR: This issue addresses some recent discussion about write barriers and omitting the `slot` and `target` parameters of the write barrier API functions.
We have had some previous discussion about generalizing the subsuming barrier API for tagged references and atomic RMW (including CAS). This issue does not discuss subsuming barriers in depth, except to acknowledge that the most general form of subsuming barrier sucks.
The most general write barrier API sucks!
The write barrier that is ultimately general, w.r.t. the field representation (pointer, compressed pointer, offsetted pointer, tagged pointer, handle, etc.), the operation (store, compare-and-swap, atomic exchange, etc.), whether the object is multi-copy (like Sapphire, where write barriers write to both the old copy and the new copy), whether non-reference fields need barriers (like Sapphire), and the kind of barrier (object-logging barrier, field-logging barrier, SATB barrier, XOR zone barrier, generational barrier, etc.), is a subsuming barrier that lets the VM binding implement the actual write operation. It has multiple `object` fields, an optional `slot` field, and optional target fields, and it can be very complicated.
fn object_reference_write(
    mutator: Mutator,
    // The object
    object: ObjectReference,
    // For GC like Sapphire, the "new copy" or "old copy" of the current object
    mirrored_object: Option<ObjectReference>,
    // Only used by field-logging barriers (LXR). Other barriers just let `operation` do the actual write.
    slot_addr: Option<Address>,
    // The old target in the slot, or None if the slot was holding NULL, None, nil, false, true, nothing, missing, undef, small integer, etc.
    old_target: Option<ObjectReference>,
    // The target object, or None if storing NULL, None, nil, false, true, nothing, missing, undef, small integer, etc.
    new_target: Option<ObjectReference>,
    // A routine provided by VM binding to do the actual write/swap/CAS.
    // Return the actual old target if different from the `old_target` argument.
    // This can happen in CAS.
    operation: impl FnOnce() -> Option<ObjectReference>,
);
(P.S. Ask @wks for an example of an SATB barrier for `AtomicReference.compareAndExchangeAcquire` in OpenJDK, or figure it out by yourself.)
An API like this should be able to handle plans like GenImmix (ObjectBarrier), LXR (FieldBarrier), CMS (SatbBarrier), Sapphire (multi-write barrier), G1 (XOR zone barrier), @wenyuzhao's hypothetical alternative generational barrier design (generational barrier), etc., and handle VMs like OpenJDK (needs atomic swap and CAS), CRuby (needs tagged references), V8 (needs tagged references and multiple flavors of NULL values), etc.
But an API like this will surely scare away 9 out of 10 PhD students or even professors in the field of language/VM implementation, not to mention developers who have "absolutely no idea how to write a programming language".
What's worse, if a VM wants to be fully general, it will need to apply such a subsuming barrier for every non-reference field write, too, just in case the current plan is Sapphire. But that'll slow down all programs, perhaps too slow even for debug builds.
What should we do?
Be practical. Provide a few flavors of pre-post barriers.
MMTk currently only has the ObjectBarrier in the master branch, and it has the field-logging barrier in the lxr branch. Considering common SATB barriers, advancing/retreating barriers for concurrent MS, I think a few kinds of barrier API functions will be sufficient to cover all barriers we currently have, and should be general enough for additional kinds of barriers.
Barrier forms
fn object_reference_write_pre_o(mutator: Mutator, object: ObjectReference);
fn object_reference_write_post_o(mutator: Mutator, object: ObjectReference);
fn object_reference_write_pre_ot(mutator: Mutator, object: ObjectReference, old_target: Option<ObjectReference>);
fn object_reference_write_post_ot(mutator: Mutator, object: ObjectReference, new_target: Option<ObjectReference>);
fn object_reference_write_pre_os(mutator: Mutator, object: ObjectReference, slot_addr: Address);
fn object_reference_write_post_os(mutator: Mutator, object: ObjectReference, slot_addr: Address);
fn object_reference_write_pre_ost(mutator: Mutator, object: ObjectReference, slot_addr: Address, old_target: Option<ObjectReference>);
fn object_reference_write_post_ost(mutator: Mutator, object: ObjectReference, slot_addr: Address, new_target: Option<ObjectReference>);
The suffixes `o`, `s` and `t` mean `object`, `slot_addr` (the field address) and `target`, respectively. The `pre` barriers only take the old target, while the `post` barriers only take the new target. A target can be `None` if it is a NULL, None, nil, nothing, missing, undef, true, false, small integer, symbol, etc.
- The `o` form can support ObjectBarrier. It only needs to log the object.
- The `os` form can support the field-logging barrier. It needs the address of the field in order to access side metadata. Note that it is not the `Slot` type, which is intended for updating an object graph edge using the `Slot::store` method, and a `Slot` may not necessarily be inside the MMTk heap (it can be on the stack or in `malloc` memory).
- The `ot` form can support barriers that need to access the target, such as the SATB barrier (enqueues the old target), the Dijkstra/Steele-style grey mutator barriers (inspects/changes the color of the new target), the XOR zone barrier (computes `old_target XOR new_target`), etc.
- The `ost` form is the most general form.
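To make the difference between the forms concrete, here is a minimal sketch of an object-logging barrier using the `o` form and an SATB barrier using the `ot` form. The `ObjectReference` and `Mutator` types below are simplified stand-ins invented for illustration, not the real mmtk-core definitions, and the sketch passes the mutator as `&mut Mutator` rather than by value.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for MMTk's object reference type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ObjectReference(pub usize);

// Hypothetical per-mutator state, reduced to the two buffers this sketch needs.
#[derive(Default)]
pub struct Mutator {
    /// Objects remembered by the object-logging barrier.
    pub logged_objects: HashSet<ObjectReference>,
    /// Old targets enqueued by the SATB barrier.
    pub satb_queue: Vec<ObjectReference>,
}

/// `o` form: an object-logging post barrier only needs the written object.
pub fn object_reference_write_post_o(mutator: &mut Mutator, object: ObjectReference) {
    // Logging the same object twice has no further effect.
    mutator.logged_objects.insert(object);
}

/// `ot` form: an SATB pre barrier needs the old target so it can enqueue it.
pub fn object_reference_write_pre_ot(
    mutator: &mut Mutator,
    _object: ObjectReference,
    old_target: Option<ObjectReference>,
) {
    // `None` means the slot held a non-reference value (NULL, small integer, ...).
    if let Some(old) = old_target {
        mutator.satb_queue.push(old);
    }
}
```

Neither function touches a slot address, which is why a VM that cannot cheaply compute one can still support both barriers.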
Simplify the API by merging the forms
We can merge those into just two functions:
fn object_reference_write_pre_ost(mutator: Mutator, object: ObjectReference, slot_addr: Option<Address>, old_target: Option<ObjectReference>);
fn object_reference_write_post_ost(mutator: Mutator, object: ObjectReference, slot_addr: Option<Address>, new_target: Option<ObjectReference>);
That is, we use `Option<T>` to make both the `slot_addr` and the `{old,new}_target` optional. We explicitly write into the documentation that if a VM is not able to provide either of those fields, or if the VM knows the barrier (such as ObjectBarrier) doesn't need the `slot_addr` or the `new_target`, it can just pass `None` to them.
This is actually very similar to what we currently have. We currently have a non-optional `slot: Slot` parameter, and that's probably the only thing that needs to be changed.
We can also create an `InteriorPointer` type to make it even clearer by making `slot_addr` an `Option<InteriorPointer>`. It emphasizes that if it is `Some(iptr)`, the `iptr` must be in the MMTk heap (i.e. not in the malloc heap, not on the stack, and not NULL).
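A rough sketch of how the merged post barrier could dispatch on the active barrier is shown below. Everything here is an assumption made for illustration: `InteriorPointer` is the hypothetical type proposed above, `BarrierKind`, `LogEntry` and the `Mutator` layout are invented, and falling back to logging the object when a field-logging barrier receives no slot address is only one possible policy.

```rust
// Hypothetical stand-in for MMTk's object reference type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ObjectReference(pub usize);

// Hypothetical type: an address known to lie inside an object in the MMTk heap.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct InteriorPointer(pub usize);

// Hypothetical selector for the barrier the current plan uses.
#[derive(Clone, Copy)]
pub enum BarrierKind {
    ObjectLogging,
    FieldLogging,
}

// What the barrier decided to remember.
#[derive(Debug, PartialEq, Eq)]
pub enum LogEntry {
    Object(ObjectReference),
    Field(InteriorPointer),
}

pub struct Mutator {
    pub barrier: BarrierKind,
    pub log: Vec<LogEntry>,
}

/// Merged post barrier: both the slot address and the new target are optional.
pub fn object_reference_write_post_ost(
    mutator: &mut Mutator,
    object: ObjectReference,
    slot_addr: Option<InteriorPointer>,
    _new_target: Option<ObjectReference>,
) {
    let entry = match (mutator.barrier, slot_addr) {
        // The object barrier ignores the slot, whether or not the VM passed one.
        (BarrierKind::ObjectLogging, _) => LogEntry::Object(object),
        // The field barrier logs the slot when the VM can provide it.
        (BarrierKind::FieldLogging, Some(slot)) => LogEntry::Field(slot),
        // Assumed policy: without a slot address, remember the whole object.
        (BarrierKind::FieldLogging, None) => LogEntry::Object(object),
    };
    mutator.log.push(entry);
}
```

With this shape the VM binding has a single call site per reference write, and passing `None` is always legal from the caller's point of view.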
Barrier form and Barrier semantics
When using a barrier semantics that needs less information, the VM can invoke a form that provides more information, and it still works.
- For example, when using the ObjectBarrier, the VM can actually call `object_reference_write_post_ost(m, obj, slot, target)`, and the ObjectBarrier simply ignores the `slot` and the `target`.
When using a barrier semantics that needs more information, but the VM is only able to provide less information, it may or may not work, depending on GC algorithms.
- For example, when using a field-logging barrier, if the VM can only identify the `object` that is changed, because the field is accessed by C extensions or the field is off-heap in malloc memory (this happens in CRuby), it will not be able to log the fields.
  - But if we are implementing LXR or other coalescing RC, we can fall back to object-remembering if we can't do field-remembering for a particular object (or field). It will end up remembering more fields, but it still works.
- For the SATB barrier, if the VM only tells MMTk that an `object` is modified, it can conservatively enqueue all children of `object`. It is slower when executed, but it is still correct, and it won't even keep more objects alive than the "snapshot at the beginning".
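The SATB case can be sketched as follows, assuming a toy heap modelled as a map from each object to its reference fields. The `Heap` alias and both function names are hypothetical and exist only for this illustration.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for MMTk's object reference type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ObjectReference(pub usize);

/// Toy heap: each object maps to its reference fields
/// (`None` marks a field holding a non-reference value).
pub type Heap = HashMap<ObjectReference, Vec<Option<ObjectReference>>>;

/// Precise SATB pre barrier: the VM supplies the old target of the written field.
pub fn satb_pre_precise(queue: &mut Vec<ObjectReference>, old_target: Option<ObjectReference>) {
    // Extending with an `Option` pushes zero or one element.
    queue.extend(old_target);
}

/// Conservative SATB pre barrier: the VM only says which object is being modified,
/// so every current child of that object is enqueued.
pub fn satb_pre_conservative(
    heap: &Heap,
    queue: &mut Vec<ObjectReference>,
    object: ObjectReference,
) {
    if let Some(fields) = heap.get(&object) {
        // `flatten` skips the `None` fields and keeps the real references.
        queue.extend(fields.iter().flatten().copied());
    }
}
```

The conservative version always enqueues a superset of what the precise version would, since the overwritten field is one of the object's fields.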
What about...
What about atomic swap and CAS?
This is a bit complicated. Due to concurrent access, we only know the actual "old value" after we do the swap or CAS. So if we do this:
let old_target = field.load();
let old_target2 = field.swap(new_target);
Then `old_target` may be different from `old_target2`.
It remains a question whether the write barrier actually needs the precise `old_target2` at all. For the SATB barrier, it doesn't, because we only need to record the snapshot at the beginning. So only the oldest target matters. But if we do naive RC (it sucks anyway), we'll need the precise `old_target2` to do the decrement.
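The naive RC case can be illustrated with `std::sync::atomic::AtomicUsize` standing in for a reference field, where `0` plays the role of NULL. This is only a sketch: `rc_swap` and `demo_race` are hypothetical names, the reference counts live in a plain `HashMap`, and the racing mutator is simulated by a sequential store so that the outcome is deterministic.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Swap the field and adjust reference counts using the value that was
/// actually replaced, i.e. the result of the swap rather than an earlier load.
pub fn rc_swap(field: &AtomicUsize, new_target: usize, rc: &mut HashMap<usize, isize>) -> usize {
    let old_target2 = field.swap(new_target, Ordering::SeqCst);
    if new_target != 0 {
        *rc.entry(new_target).or_insert(0) += 1;
    }
    if old_target2 != 0 {
        *rc.entry(old_target2).or_insert(0) -= 1;
    }
    old_target2
}

/// Returns `(old_target, old_target2)` for one simulated interleaving.
pub fn demo_race() -> (usize, usize) {
    let field = AtomicUsize::new(0x10);
    let mut rc = HashMap::from([(0x10, 1), (0x20, 1)]);
    // The value a pre barrier would see if it loaded the field first.
    let old_target = field.load(Ordering::SeqCst);
    // Stands in for another mutator storing to the same field in between.
    field.store(0x20, Ordering::SeqCst);
    // The swap observes the racing store, not the earlier load.
    let old_target2 = rc_swap(&field, 0x30, &mut rc);
    (old_target, old_target2)
}
```

Decrementing the count of `old_target` instead of `old_target2` here would adjust the wrong object, which is the reason naive RC needs the value returned by the swap.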
What about Sapphire?
Those forms don't cover the need to apply barriers for non-reference fields, or the need to write to two copies of the same object. It's not on our agenda.
What about other subsuming barriers?
They are discussed in #1038. The main idea is that making a general API for multiple operations (store, swap, CAS, with acquire/release/seqcst orders), multiple field layouts (fat, tagged, offsetted, compressed, handles, etc.) and non-reference values (tagged integers, true, false, multiple NULL flavors like nil, nothing, missing, undef, etc.) can be very complicated.