Skip to content

Remove repr(C) from Mutator #1318

Open
Open
@wks

Description

@wks

The reason why the Mutator struct was made #[repr(C)] was to make it possible for bindings to embed Mutator into their own C/C++ structs in thread-local storage. This is supposed to improve performance by allowing inlined fast-path code to access important fields (such as BumpPointer) quickly from the TLS. However, embedding the entire Mutaotr is neither practical nor necessary.

It is impractical because

  • Because the Mutator struct is so big, it is beyond the addressable range of immediate addressing in some architectures, such as RISC-V. There is already a proposal to make Mutator smaller.
  • The C/C++ part of the VM binding has to precisely replicate the Mutator struct. This is tedious and error-prone, but both the OpenJDK and the Julia binding have done that. And the presence of Rust-specific data types, such as &dyn T, makes it hard to get the object layout right.
  • To make it repr(C), the Allocators sub-struct has to rely on fixed-size arrays [T; N]. By doing this, the number of allocators of each kind are limited and unknown at run time.
    • Changing the maximum number of each allocator will alter the layout of Mutator, resulting in a series of cascading changes in VM bindings that depend on the precise layout, such as this PR and its associated changes in the OpenJDK and Julia bindings.
    • Because the actual number of each kind of allocator is determined by the concrete Plan, the Allocators struct has to use MaybeUninit (like [MaybeUninit<BumpAllocator<VM>>; MAX_BUMP_ALLOCATORS]) so that array elements can remain uninitialized if the concrete plan doesn't need that many allocators of that kind. This is also the reason why the Allocators::get_allocator and Allocators::get_allocator_mut have to be made unsafe.

And it is not necessary to embed the entier Mutator into the TLS. Actually, the allocation fast path only needs the BumpPointer struct. The VM binding only needs to embed the BumpPointer struct. This practice is already documented in the Porting Guide, and there is an API function AllocatorInfo::new(selector) for getting the offset of the BumpPointer struct. The Ruby binding didn't embed the BumpPointer, but just lets the TLS keep a pointer to the BumpPointer inside the Mutator without depending on the concrete structure of Mutator, and the ART binding manually synchronizes the BumpPointer in the Mutator with the thread-local storage.

The proposal: Remove repr(C) from Mutator

We remove the #[repr(C)] annotation on Mutator. The C/C++ part of the VM binding can no longer assume the layout of Mutator. But we keep a bottom line that:

  1. For every data structure that needs to be accessed in fast paths (such as BumpPointer), we provide an API function for returning the pointer to (or offset of) that structure so that the VM binding can get the pointer without knowing the layout of Mutator. Currently, this can be achieved with either mutator_address + AllocatorInfo::new(selector).bump_pointer_offset or &mutator.allocator_impl<ConcreteAllocator>(selector).bump_pointer.
  2. We allow the related fast-path data structures (such as BumpPointer) to be taken out of the Mutator and put back later. This will allow the VM binding to keep the performance-sensitive parts in the TLS as long as possible because we assume they will be used by the inlined fast paths most of the time.

The Allocators struct

The struct Allocators no longer needs to be repr(C), either. It was repr(C) because it was part of Mutator.

Because it doesn't need to be repr(C), its layout no longer needs to be exposed to the VM binding. Now we can use Vec<T> instead of [T; N], for example,

-    pub bump_pointer: [MaybeUninit<BumpAllocator<VM>>; MAX_BUMP_ALLOCATORS],
+    pub bump_pointer: Vec<BumpAllocator<VM>>,

The constants MAX_BUMP_ALLOCATORS, MAX_IMMIX_ALLOCATORS, etc. will be removed. VM bindings no longer needs to ensure it matches the definition in mmtk-core.

When creating a Mutator, we still need to populate the Allocators struct according to the plan-specific SpaceMapping. But this time it can be easier because we can just use Vec::push instead of MaybeUninit::write.

The method Allocator::get_allocator and Allocator::get_allocator_mut can be made safe. It is currently unsafe because they use MaybeUninit::assume_init_ref() and MaybeUninit::assume_init_mut(), and neither of them have the ability to check whether a given MaybeUninit<T> is actually initialized or not. With the list of allocators becoming Vec, we can simply use the Vec::get() and Vec::get_mut() methods which return Option<&T> and Option<&mut T>, respectively. This will reduce up to 15 call sites in allocators and mutator prepare/release functions that involve the unsafe keyword. We'll discuss this in a separate issue.

Other fields

Other fields need no changes. But we no longer need to label them as #[repr(C)]. Concretely,

The Mutator::barrier field is the only one that needs some discussion. It is currently a Box<dyn Barrier>, and it can remain this way. Because the concrete barrier to use depends on the Plan which is selected at start-up time, some form of dynamic dispatch is always needed.

We previously discussed optimizing for write-barrier slow paths by embedding some data structures in TLS. Currently we only have an object-remembering barrier (and a field-remembering barrier in the lxr branch). To implement those kinds of barriers, we only need a bump-pointer, too. In this case, it is a cursor: *mut ObjectReference and a limit: *const ObjectReference, which is basically a bump-pointer into the ModBuf. Because this is strictly an optimization over the status quo, we may consider it as a separate issue.

Performance

The only thing that may affect performance is the Vec<BumpAllocator<VM>> and other vector of allocators in the Allocators struct. We consider the overhead of Vec reasonable because we will only access the complete allocators in slow paths. But we need to measure it anyway.

API changes

Because allocators are now held in Vec, the "offset from the start of Mutator to the concrete ImmixAllocator" and the "offset from the start of Mutator to a BumpPointer" no longer make sense. However, given a valid AllocatorSelector, we can still uniquely identify the address of the allocator. The Allocator::get_allocator method will still work, but AllocatorInfo won't work. We may refactor AllocatorInfo so that it returns references to the underlying BumpPointer instances rather than the offsets.

It is debatable whether we still allow a mutator thread's TLS to keep a pointer to the BumpPointer. Currently we give the VM binding full control over the Mutator as if it were a C struct. After this refactoring, the Box<Mutator> returned from memory_manager::bind_mutator() will be the only owning reference of the Mutator, and Rust doesn't like having another mutable reference into it (which violates the "unique reference" semantics of Box<T> and &mut T). Currently this only affects the Ruby binding. The ART binding is unaffected because it is moving the BumpPointer out of the allocator for fast-path allocation, and put it back when falling back to slow path.

Documentation

We should update our porting guide and remove the section about embedding the entire Mutator struct. The VM bindings should either not embed anything or embed only the BumpPointer struct.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions