Description
In src/policy/marksweepspace/native_ms/global.rs:
fn trace_object<Q: ObjectQueue>(
    &self,
    queue: &mut Q,
    object: ObjectReference,
) -> ObjectReference {
    // ...
    if !VM::VMObjectModel::LOCAL_MARK_BIT_SPEC.is_marked::<VM>(object, Ordering::SeqCst) {
        VM::VMObjectModel::LOCAL_MARK_BIT_SPEC.mark::<VM>(object, Ordering::SeqCst);
        let block = Block::containing(object);
        block.set_state(BlockState::Marked);
        queue.enqueue(object);
    }
    object
}
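The marking is non-atomic: the is_marked check and the mark call are two separate steps. If two GC workers attempt to mark the same object at the same time, both can see it as unmarked, both will mark it, and both will enqueue it. This causes two problems.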
Firstly, enqueuing an object twice causes it to be scanned twice, so more work packets are generated to do redundant work.
Secondly, some VM bindings assume that each live object is scanned exactly once during a full-heap GC (or at most once during a nursery GC). Such assumptions usually stem from the behavior of the VM's existing GCs. One example is CRuby. It attempts to clean up some data inside objects during the mark phase of a GC. With the MMTk binding, it does so when scanning an object. But when using MMTk with the MarkSweep plan and multiple GC workers, two GC workers may attempt to clean up the same object. The data structure being cleaned is not thread-safe, so the VM crashes.
I think it is reasonable to assume that a GC enqueues and scans each object at most once, unless there is a compelling reason not to. MarkSweep is the only plan that marks objects non-atomically, and I think that is a mistake.
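To illustrate what atomic marking would look like, here is a minimal standalone sketch using only the Rust standard library. It is not mmtk-core code: MarkBit, attempt_mark, the free-standing trace_object, and trace_all_concurrently are hypothetical names, objects are modeled as plain indices, and a Mutex<Vec<usize>> stands in for the object queue. The point is that a single compare-and-exchange replaces the separate check and set, so exactly one worker wins the race and enqueues the object.

```rust
use std::sync::atomic::{AtomicU8, Ordering};
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for a per-object mark bit (not the mmtk-core type).
pub struct MarkBit(AtomicU8);

impl MarkBit {
    pub const fn new() -> Self {
        MarkBit(AtomicU8::new(0))
    }

    /// Atomically flip the bit from 0 (unmarked) to 1 (marked).
    /// Returns true only for the single caller that performed the flip.
    pub fn attempt_mark(&self) -> bool {
        self.0
            .compare_exchange(0, 1, Ordering::SeqCst, Ordering::SeqCst)
            .is_ok()
    }
}

/// Simplified analogue of trace_object: the object is enqueued only by the
/// worker that won the marking race. Objects are modeled as indices.
pub fn trace_object(marks: &[MarkBit], queue: &Mutex<Vec<usize>>, object: usize) -> usize {
    if marks[object].attempt_mark() {
        queue.lock().unwrap().push(object);
    }
    object
}

/// Spawn `workers` threads that all try to trace every object, then return
/// the contents of the shared queue.
pub fn trace_all_concurrently(num_objects: usize, workers: usize) -> Vec<usize> {
    let marks: Vec<MarkBit> = (0..num_objects).map(|_| MarkBit::new()).collect();
    let queue = Mutex::new(Vec::new());
    thread::scope(|s| {
        for _ in 0..workers {
            s.spawn(|| {
                for object in 0..num_objects {
                    trace_object(&marks, &queue, object);
                }
            });
        }
    });
    queue.into_inner().unwrap()
}
```

With this shape, no matter how many workers race on the same object, it appears in the queue once. Per-object side effects such as setting the block state or scanning would then also happen only on the winning path.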
Related discussions: https://mmtk.zulipchat.com/#narrow/channel/313365-mmtk-ruby/topic/MMTk.20and.20marking.20id_table.20entries.2E