Description
The reason why the Mutator
struct was made #[repr(C)]
was to make it possible for bindings to embed Mutator
into their own C/C++ structs in thread-local storage. This is supposed to improve performance by allowing inlined fast-path code to access important fields (such as BumpPointer
) quickly from the TLS. However, embedding the entire Mutaotr
is neither practical nor necessary.
It is impractical because
- Because the
Mutator
struct is so big, it is beyond the addressable range of immediate addressing in some architectures, such as RISC-V. There is already a proposal to makeMutator
smaller. - The C/C++ part of the VM binding has to precisely replicate the
Mutator
struct. This is tedious and error-prone, but both the OpenJDK and the Julia binding have done that. And the presence of Rust-specific data types, such as&dyn T
, makes it hard to get the object layout right. - To make it
repr(C)
, theAllocators
sub-struct has to rely on fixed-size arrays[T; N]
. By doing this, the number of allocators of each kind are limited and unknown at run time.- Changing the maximum number of each allocator will alter the layout of
Mutator
, resulting in a series of cascading changes in VM bindings that depend on the precise layout, such as this PR and its associated changes in the OpenJDK and Julia bindings. - Because the actual number of each kind of allocator is determined by the concrete Plan, the
Allocators
struct has to useMaybeUninit
(like[MaybeUninit<BumpAllocator<VM>>; MAX_BUMP_ALLOCATORS]
) so that array elements can remain uninitialized if the concrete plan doesn't need that many allocators of that kind. This is also the reason why theAllocators::get_allocator
andAllocators::get_allocator_mut
have to be madeunsafe
.
- Changing the maximum number of each allocator will alter the layout of
And it is not necessary to embed the entier Mutator
into the TLS. Actually, the allocation fast path only needs the BumpPointer
struct. The VM binding only needs to embed the BumpPointer
struct. This practice is already documented in the Porting Guide, and there is an API function AllocatorInfo::new(selector)
for getting the offset of the BumpPointer
struct. The Ruby binding didn't embed the BumpPointer
, but just lets the TLS keep a pointer to the BumpPointer
inside the Mutator
without depending on the concrete structure of Mutator
, and the ART binding manually synchronizes the BumpPointer
in the Mutator
with the thread-local storage.
The proposal: Remove repr(C)
from Mutator
We remove the #[repr(C)]
annotation on Mutator
. The C/C++ part of the VM binding can no longer assume the layout of Mutator
. But we keep a bottom line that:
- For every data structure that needs to be accessed in fast paths (such as
BumpPointer
), we provide an API function for returning the pointer to (or offset of) that structure so that the VM binding can get the pointer without knowing the layout of Mutator. Currently, this can be achieved with eithermutator_address + AllocatorInfo::new(selector).bump_pointer_offset
or&mutator.allocator_impl<ConcreteAllocator>(selector).bump_pointer
. - We allow the related fast-path data structures (such as
BumpPointer
) to be taken out of theMutator
and put back later. This will allow the VM binding to keep the performance-sensitive parts in the TLS as long as possible because we assume they will be used by the inlined fast paths most of the time.
The Allocators
struct
The struct Allocators
no longer needs to be repr(C)
, either. It was repr(C)
because it was part of Mutator
.
Because it doesn't need to be repr(C)
, its layout no longer needs to be exposed to the VM binding. Now we can use Vec<T>
instead of [T; N]
, for example,
- pub bump_pointer: [MaybeUninit<BumpAllocator<VM>>; MAX_BUMP_ALLOCATORS],
+ pub bump_pointer: Vec<BumpAllocator<VM>>,
The constants MAX_BUMP_ALLOCATORS
, MAX_IMMIX_ALLOCATORS
, etc. will be removed. VM bindings no longer needs to ensure it matches the definition in mmtk-core.
When creating a Mutator
, we still need to populate the Allocators
struct according to the plan-specific SpaceMapping
. But this time it can be easier because we can just use Vec::push
instead of MaybeUninit::write
.
The method Allocator::get_allocator
and Allocator::get_allocator_mut
can be made safe. It is currently unsafe because they use MaybeUninit::assume_init_ref()
and MaybeUninit::assume_init_mut()
, and neither of them have the ability to check whether a given MaybeUninit<T>
is actually initialized or not. With the list of allocators becoming Vec
, we can simply use the Vec::get()
and Vec::get_mut()
methods which return Option<&T>
and Option<&mut T>
, respectively. This will reduce up to 15 call sites in allocators and mutator prepare/release functions that involve the unsafe
keyword. We'll discuss this in a separate issue.
Other fields
Other fields need no changes. But we no longer need to label them as #[repr(C)]
. Concretely,
The Mutator::barrier
field is the only one that needs some discussion. It is currently a Box<dyn Barrier>
, and it can remain this way. Because the concrete barrier to use depends on the Plan which is selected at start-up time, some form of dynamic dispatch is always needed.
We previously discussed optimizing for write-barrier slow paths by embedding some data structures in TLS. Currently we only have an object-remembering barrier (and a field-remembering barrier in the lxr
branch). To implement those kinds of barriers, we only need a bump-pointer, too. In this case, it is a cursor: *mut ObjectReference
and a limit: *const ObjectReference
, which is basically a bump-pointer into the ModBuf. Because this is strictly an optimization over the status quo, we may consider it as a separate issue.
Performance
The only thing that may affect performance is the Vec<BumpAllocator<VM>>
and other vector of allocators in the Allocators
struct. We consider the overhead of Vec
reasonable because we will only access the complete allocators in slow paths. But we need to measure it anyway.
API changes
Because allocators are now held in Vec
, the "offset from the start of Mutator
to the concrete ImmixAllocator
" and the "offset from the start of Mutator
to a BumpPointer
" no longer make sense. However, given a valid AllocatorSelector
, we can still uniquely identify the address of the allocator. The Allocator::get_allocator
method will still work, but AllocatorInfo
won't work. We may refactor AllocatorInfo
so that it returns references to the underlying BumpPointer
instances rather than the offsets.
It is debatable whether we still allow a mutator thread's TLS to keep a pointer to the BumpPointer
. Currently we give the VM binding full control over the Mutator
as if it were a C struct. After this refactoring, the Box<Mutator>
returned from memory_manager::bind_mutator()
will be the only owning reference of the Mutator
, and Rust doesn't like having another mutable reference into it (which violates the "unique reference" semantics of Box<T>
and &mut T
). Currently this only affects the Ruby binding. The ART binding is unaffected because it is moving the BumpPointer
out of the allocator for fast-path allocation, and put it back when falling back to slow path.
Documentation
We should update our porting guide and remove the section about embedding the entire Mutator
struct. The VM bindings should either not embed anything or embed only the BumpPointer
struct.