108 changes: 82 additions & 26 deletions src/vm/slot.rs
@@ -11,44 +11,100 @@ use atomic::Atomic;
use crate::util::constants::{BYTES_IN_ADDRESS, LOG_BYTES_IN_ADDRESS};
use crate::util::{Address, ObjectReference};

/// A `Slot` represents a slot in an object (a.k.a. a field), on the stack (i.e. a local variable)
/// or any other places (such as global variables). A slot may hold an object reference. We can
/// load the object reference from it, and we can update the object reference in it after the GC
/// moves the object.
/// `Slot` is an abstraction for MMTk to load and update object references in memory.
///
/// For some VMs, a slot may sometimes not hold an object reference. For example, it can hold a
/// special `NULL` pointer which does not point to any object, or it can hold a tagged
/// non-reference value, such as small integers and special values such as `true`, `false`, `null`
/// (a.k.a. "none", "nil", etc. for other VMs), `undefined`, etc.
/// # Slots and the `Slot` trait
///
/// This intends to abstract out the differences of reference field representation among different
/// VMs. If the VM represent a reference field as a word that holds the pointer to the object, it
/// can use the default `SimpleSlot` we provide. In some cases, the VM need to implement its own
/// `Slot` instances.
/// In a VM, a slot can contain an object reference or a non-reference value. It can be in an
/// object (a.k.a. a field), on the stack (i.e. a local variable) or in any other place (such as
/// a global variable). It may have different representations in different VMs. Some VMs put a
/// direct pointer to an object into a slot, while others may use compressed pointers, tagged
/// pointers, offsetted pointers, etc. Some VMs (such as the JVM) have null references, and others
/// (such as CRuby and JavaScript engines) can also use tag bits to represent non-reference
/// values such as small integers, `true`, `false`, `null` (a.k.a. "none", "nil", etc.),
/// `undefined`, etc.
///
/// In MMTk, the `Slot` trait is intended to abstract out such different representations of
/// reference fields (compressed, tagged, offsetted, etc.) among different VMs. From MMTk's point
/// of view, **MMTk only cares about the object reference held inside the slot, but not
/// non-reference values**, such as `null`, `true`, etc. When the slot is holding an object
/// reference, we can load the object reference from it, and we can update the object reference in
/// it after the GC moves the object.
///
/// # The `Slot` trait has pointer semantics
///
/// A `Slot` value *points to* a slot, and is not the slot itself. In fact, the simplest
/// implementation of the `Slot` trait ([`SimpleSlot`], see below) can simply contain the address of
/// the slot.
///
/// A `Slot` can be [copied](std::marker::Copy), and the copied `Slot` instance points to the same
/// slot.
///
/// # How to implement `Slot`?
///
/// If a reference field of a VM is word-sized, holds the raw pointer to an object, and uses the
/// 0 word as the null pointer, the VM can use the default [`SimpleSlot`] we provide. It simply
/// contains a pointer to a memory location that holds an address.
///
/// ```rust
/// pub struct SimpleSlot {
///     slot_addr: *mut Atomic<Address>,
/// }
/// ```
///
/// In other cases, the VM needs to implement its own `Slot` instances.
///
/// For example:
/// - The VM uses compressed pointer (Compressed OOP in OpenJDK's terminology), where the heap
/// size is limited, and a 64-bit pointer is stored in a 32-bit slot.
/// - The VM uses tagged pointer, where some bits of a word are used as metadata while the rest
/// are used as pointer.
/// - A field holds a pointer to the middle of an object (an object field, or an array element,
/// or some arbitrary offset) for some reasons.
/// - The VM uses **compressed pointers** (Compressed OOPs in OpenJDK's terminology), where the
///   heap size is limited, and a 64-bit pointer is stored in a 32-bit slot.
/// - The VM uses **tagged pointers**, where some bits of a word are used as metadata while the
///   rest are used as the pointer.
/// - The VM uses **offsetted pointers**, i.e. the value of the field is an address at an offset
///   from the [`ObjectReference`] of the target object. Such offsetted pointers are usually used
///   to represent **interior pointers**, i.e. pointers to an object field, an array element, etc.
///
/// If needed, the implementation of `Slot` can contain not only the pointer, but also additional
/// information. The `OffsetSlot` example below also contains an offset which can be used when
/// decoding the pointer. See `src/vm/tests/mock_tests/mock_test_slots.rs` for more concrete
/// examples, such as `CompressedOopSlot` and `TaggedSlot`.
///
/// When loading, `Slot::load` shall decode its internal representation to a "regular"
/// `ObjectReference`. The implementation can do this with any appropriate operations, usually
/// shifting and masking bits or subtracting offset from the address. By doing this conversion,
/// MMTk can implement GC algorithms in a VM-neutral way, knowing only `ObjectReference`.
/// ```rust
/// pub struct OffsetSlot {
[Review comment, Collaborator]
Note that a `Slot` like this will not be word-sized, so it may not have the best performance. It should be flagged.

[Reply, Collaborator Author]
Yes. I think this is why the OpenJDK binding uses a tagged word instead of a Rust enum to distinguish between narrow and wide slots.

///     slot_addr: *mut Atomic<Address>,
///     offset: usize,
/// }
/// ```
///
/// When loading, `Slot::load` shall load the value from the slot and decode the value into a
/// regular `ObjectReference`. (Note that MMTk has specific requirements for `ObjectReference`,
/// such as being aligned, pointing inside an object, and being non-null. Please read the doc
/// comments of [`ObjectReference`] for details.) The decoding is VM-specific, but usually involves
/// removing tag bits and/or adding an offset to the word, and (in the case of compressed pointers)
/// extending the word size. This conversion lets MMTk implement GC algorithms in a VM-neutral
/// way, knowing only `ObjectReference`.
///
/// When the GC moves an object, `Slot::store` shall convert the updated `ObjectReference` back to
/// the slot-specific representation. Compressed pointers remain compressed; tagged pointers
/// preserve their tag bits; and offsetted pointers keep their offsets.
///
/// # Performance notes
///
/// The methods of this trait are called on hot paths. Please ensure they have high performance.
/// Use inlining when appropriate.
///
/// Note: this trait only concerns the representation (i.e. the shape) of the slot, not its
/// semantics, such as whether it holds strong or weak references. If a VM holds a weak reference
/// in a word as a pointer, it can also use `SimpleSlot` for weak reference fields.
/// The size of the data structure of the `Slot` implementation may affect the performance as well.
/// During GC, MMTk enqueues `Slot` instances, and their size affects the overhead of copying them.
/// If your `Slot` implementation has multiple fields or uses an `enum` for multiple kinds of
/// slots, it may have extra cost when copying or decoding. You should measure it. If the cost is
/// too high, you can implement `Slot` with a tagged word. For example, the [mmtk-openjdk] binding
/// uses the low order bit to encode whether the slot is compressed or not.
///
/// [mmtk-openjdk]: https://github.com/mmtk/mmtk-openjdk/blob/master/mmtk/src/slots.rs
///
/// # About weak references
///
/// This trait only concerns the representation (i.e. the shape) of the slot, not its semantics,
/// such as whether it holds strong or weak references. Therefore, one `Slot` implementation can be
/// used for both slots that hold strong references and slots that hold weak references.
pub trait Slot: Copy + Send + Debug + PartialEq + Eq + Hash {
/// Load object reference from the slot.
///
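To make the "tagged word" idea from the performance notes and the review thread concrete, here is a minimal, self-contained sketch. It does not use mmtk-core: `ObjRef`, `EnumSlot`, `TaggedWordSlot`, `NARROW_BIT`, `SHIFT` and `slot_sizes` are hypothetical names. The compression scheme (a narrow slot stores `address >> 3` in 32 bits) is an assumption chosen for illustration, as are the `load`/`store` signatures and a 64-bit target. It only shows how a single word can record whether a slot is narrow or wide, and how `load` decodes and `store` re-encodes the slot's own representation.

```rust
use std::mem::size_of;
use std::sync::atomic::{AtomicU32, AtomicUsize, Ordering};

/// Hypothetical stand-in for MMTk's `ObjectReference` (a non-null, aligned address).
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
struct ObjRef(usize);

/// Assumed compression scheme: a narrow slot holds `address >> SHIFT` in 32 bits.
const SHIFT: u32 = 3;

/// Low bit of the slot word; set when the slot is narrow (compressed).
/// Slot addresses are at least 4-byte aligned here, so this bit is otherwise unused.
const NARROW_BIT: usize = 1;

/// The straightforward alternative: an enum, which needs a discriminant next to the pointer.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
enum EnumSlot {
    Narrow(*const AtomicU32),
    Wide(*const AtomicUsize),
}

/// One word: the address of the slot, with `NARROW_BIT` recording the slot kind.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
struct TaggedWordSlot(usize);

impl TaggedWordSlot {
    fn narrow(slot: &AtomicU32) -> Self {
        Self((slot as *const AtomicU32 as usize) | NARROW_BIT)
    }

    fn wide(slot: &AtomicUsize) -> Self {
        Self(slot as *const AtomicUsize as usize)
    }

    /// Decode the slot content into an `ObjRef`, or `None` if the slot holds null (0).
    fn load(self) -> Option<ObjRef> {
        let raw = if self.0 & NARROW_BIT != 0 {
            let p = (self.0 & !NARROW_BIT) as *const AtomicU32;
            // SAFETY (sketch): the caller keeps the slot memory alive while the slot is used.
            let narrow = unsafe { (*p).load(Ordering::Relaxed) };
            (narrow as usize) << SHIFT // extend to word size and undo the compression
        } else {
            let p = self.0 as *const AtomicUsize;
            // SAFETY (sketch): as above.
            unsafe { (*p).load(Ordering::Relaxed) }
        };
        if raw == 0 { None } else { Some(ObjRef(raw)) }
    }

    /// Write the (possibly moved) object back in the slot's own representation.
    fn store(self, object: ObjRef) {
        if self.0 & NARROW_BIT != 0 {
            let p = (self.0 & !NARROW_BIT) as *const AtomicU32;
            // A narrow slot stays narrow: re-compress before storing.
            unsafe { (*p).store((object.0 >> SHIFT) as u32, Ordering::Relaxed) }
        } else {
            let p = self.0 as *const AtomicUsize;
            unsafe { (*p).store(object.0, Ordering::Relaxed) }
        }
    }
}

/// Sizes of the two designs, as (enum, tagged word), for comparison.
fn slot_sizes() -> (usize, usize) {
    (size_of::<EnumSlot>(), size_of::<TaggedWordSlot>())
}
```

With current rustc the enum version is typically two words while the tagged-word version is one, which matters when slots are enqueued and copied in bulk. The price is an extra mask and branch on every `load` and `store`, so the trade-off is worth measuring rather than assuming.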