Typed access to metadata #647
I pushed some draft code here. It is not yet ready for review.

@wks The PR is ready for review again. The focus of the PR has changed, and there have been some major changes since it was last reviewed.
The advantage of atomic bit-and and bit-or is that they do not need a cmpxchg loop to implement, so they can be more efficient than store or cmpxchg.
```rust
FromPrimitive::from_u8(self.fetch_ops_on_bits(
    data_addr,
    meta_addr,
    order,
    order,
    |x: u8| x & val.to_u8().unwrap(),
))
```
`fetch_and` doesn't need cmpxchg. Just set the irrelevant bits to 1 and they will not be affected.
```diff
-FromPrimitive::from_u8(self.fetch_ops_on_bits(
-    data_addr,
-    meta_addr,
-    order,
-    order,
-    |x: u8| x & val.to_u8().unwrap(),
-))
+let mask = meta_byte_mask(self) << lshift;
+let opnd = val | !mask;
+unsafe { T::fetch_and(meta_addr, opnd, order) };
```
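As a self-contained illustration of the mask trick, here is a minimal sketch on a plain `AtomicU8` (the field width and bit offset are made up for illustration):

```rust
use std::sync::atomic::{AtomicU8, Ordering};

fn main() {
    let byte = AtomicU8::new(0b1111_1111);
    let lshift = 4;
    // A hypothetical 2-bit metadata field living at bits 4..6 of the byte.
    let mask = 0b11u8 << lshift; // 0b0011_0000: the field's bits
    let val = 0b01u8 << lshift; //  0b0001_0000: the AND operand for the field
    let opnd = val | !mask; //      irrelevant bits set to 1, so AND keeps them
    byte.fetch_and(opnd, Ordering::SeqCst);
    // Only the field's bits changed; the other six bits are untouched.
    assert_eq!(byte.load(Ordering::SeqCst), 0b1101_1111);
}
```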
Sounds good. In case you are interested, the following is the assembly code of `fetch_and()` for 1-bit side metadata when using `fetch_ops_on_bits` (which internally uses `fetch_update`), and when directly using `fetch_and`:
With `fetch_ops_on_bits`:
```asm
mmtk::util::alloc_bit::unset_alloc_bit_unsafe:
        mov     rcx, rdi
        shr     rdi, 6
        shr     cl, 3
        movabs  rsi, 13194139533312
        mov     al, byte ptr [rdi + rsi]
        mov     dl, -2
        rol     dl, cl
.LBB225_1:
        mov     ecx, eax
        and     cl, dl
        lock    cmpxchg byte ptr [rdi + rsi], cl
        jne     .LBB225_1
        ret
```
With `fetch_and`:
```asm
mmtk::util::alloc_bit::unset_alloc_bit_unsafe:
        mov     rcx, rdi
        mov     rax, rdi
        shr     rax, 6
        shr     cl, 3
        mov     dl, -2
        rol     dl, cl
        movabs  rcx, 13194139533312
        lock    and byte ptr [rax + rcx], dl
```
With the introduction of `fetch_and` and `fetch_or`, some bit operations can be done more efficiently using them, especially setting and clearing bits. We may change all call sites of `store_atomic` into `fetch_and_atomic` or `fetch_or_atomic`.

Alternatively, we can add a special case to `store_atomic` if `bits_num_log == 1`: in that case, we can turn `store_atomic` internally into `fetch_and` or `fetch_or`, depending on whether it is storing 1 or 0. A sketch of that special case follows.
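A minimal sketch of the special case, written against a plain `AtomicU8` rather than the real side-metadata internals (the function name and parameters are invented for illustration):

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// Hypothetical: store a one-bit metadata value without a cmpxchg loop.
// Storing 1 is an atomic OR of the bit's mask; storing 0 is an atomic AND
// of its complement. Other bits in the byte are unaffected either way.
fn store_one_bit(meta_byte: &AtomicU8, bit_offset: u8, value: u8, order: Ordering) {
    let mask = 1u8 << bit_offset;
    if value != 0 {
        meta_byte.fetch_or(mask, order);
    } else {
        meta_byte.fetch_and(!mask, order);
    }
}
```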
```diff
@@ -7,6 +6,6 @@ use std::sync::atomic::Ordering;
 impl VMGlobalLogBitSpec {
     /// Mark the log bit as unlogged (1 means unlogged)
     pub fn mark_as_unlogged<VM: VMBinding>(&self, object: ObjectReference, order: Ordering) {
-        store_metadata::<VM>(self, object, 1, None, Some(order))
+        self.store_atomic::<VM, u8>(object, 1, None, order)
```
One-bit `store_atomic` has to use `fetch_update` or cmpxchg; `fetch_or_atomic` is more efficient for this.
src/policy/immortalspace.rs (outdated)
```diff
         );
         if old_value == value {
             return false;
         }

-        if compare_exchange_metadata::<VM>(
-            &VM::VMObjectModel::LOCAL_MARK_BIT_SPEC,
+        if VM::VMObjectModel::LOCAL_MARK_BIT_SPEC.compare_exchange_metadata::<VM, u8>(
```
This may be refactored to use `fetch_update`.
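A sketch of the suggested shape, using a plain `AtomicU8` in place of the metadata API (the names and orderings are invented for illustration):

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// fetch_update subsumes the explicit load / compare / compare_exchange loop:
// the closure returns None to give up (already marked) or Some(new) to try
// to install the new value; retrying on contention is handled internally.
fn attempt_mark(mark_byte: &AtomicU8, value: u8) -> bool {
    mark_byte
        .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |old| {
            if old == value {
                None // already marked: fail without writing
            } else {
                Some(value)
            }
        })
        .is_ok()
}
```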
src/policy/largeobjectspace.rs (outdated)
```diff
         );
         let mark_bit = old_value & mask;
         if mark_bit == value {
             return false;
         }
-        if compare_exchange_metadata::<VM>(
-            &VM::VMObjectModel::LOCAL_LOS_MARK_NURSERY_SPEC,
+        if VM::VMObjectModel::LOCAL_LOS_MARK_NURSERY_SPEC.compare_exchange_metadata::<VM, u8>(
```
Can be refactored to `fetch_update`.
src/policy/largeobjectspace.rs (outdated)
```diff
         );
         let new_val = old_val & !NURSERY_BIT;
-        if compare_exchange_metadata::<VM>(
-            &VM::VMObjectModel::LOCAL_LOS_MARK_NURSERY_SPEC,
+        if VM::VMObjectModel::LOCAL_LOS_MARK_NURSERY_SPEC.compare_exchange_metadata::<VM, u8>(
```
Consider refactoring to `fetch_update`.
src/policy/mallocspace/global.rs (outdated)
```diff
@@ -316,9 +316,10 @@ impl<VM: VMBinding> MallocSpace<VM> {
             address,
         );

-        if !is_marked::<VM>(object, None) {
+        // TODO: Why do we use non-atomic load here?
+        if !unsafe { is_marked_unsafe::<VM>(object) } {
```
I think it is a bug. If it is not atomic, it is guaranteed to cause a data race, because tracing is parallel.
I assume it is a benign race, and it should not cause any actual bug. There is some discussion here: #313
When @steveblackburn said "atomic operation" in #313 (comment), I think he meant atomic read-modify-write (RMW) operations (such as cmpxchg, swap, fetch_xxx, etc.), and atomic RMW operations do prevent duplicate edges. In Java, all writes to 32-bit variables and reference variables (even non-`volatile` ones) are atomic, in the sense that word tearing is disallowed, and if a read has no happens-before relationship with a write, the read still sees some benign value (written by an actual store). So we seldom use the phrases "atomic read" or "atomic write" in Java.

This is not true for C++ and Rust. In C++ and Rust, if a non-atomic load has no happens-before relationship with a store, the behaviour is undefined. The Rust counterpart of "ordinary" loads/stores in Java is `load_atomic` and `store_atomic` with the `Relaxed` order (the Java memory order is actually even weaker than `Relaxed`, but it must still prevent word tearing).
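For a concrete picture of the difference, a minimal Rust sketch (the function and names are invented for illustration):

```rust
use std::sync::atomic::{AtomicU8, Ordering};

fn read_racy_flag(flag: &AtomicU8) -> u8 {
    // A non-atomic read that races with a concurrent store is undefined
    // behaviour in Rust:
    //     let v = unsafe { *(flag as *const AtomicU8 as *const u8) };

    // A Relaxed atomic load is defined even under a race: no ordering
    // guarantees, but no word tearing and no UB -- the closest analogue
    // of an "ordinary" Java read.
    flag.load(Ordering::Relaxed)
}
```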
We can change this to a relaxed atomic load, which is more reasonable.

But just for the sake of argument, I don't think this causes any actual bug. Issue #313 is clear that the `if` block may be executed multiple times for the same object. So:

- If `is_marked()` returns false but the object is actually marked: we just execute the block again, and that is fine.
- If `is_marked()` returns true but the object is not actually marked: I don't think this can happen, as we monotonically set the bit from 0 to 1 at this stage.
I have changed this to a relaxed atomic load.
```diff
             ordering,
         );
 pub fn set_mark_bit<VM: VMBinding>(object: ObjectReference, ordering: Ordering) {
+    VM::VMObjectModel::LOCAL_MARK_BIT_SPEC.store_atomic::<VM, u8>(object, 1, None, ordering);
```
This is one-bit metadata. It may use `fetch_or_atomic`.
About the return value of cmpxchg...
src/util/metadata/global.rs (outdated)
```rust
/// * `success_order`: is the atomic ordering used if the operation is successful.
/// * `failure_order`: is the atomic ordering used if the operation fails.
///
/// # Returns `true` if the operation is successful, and `false` otherwise.
```
Cmpxchg should return the old value as well as whether it succeeded. The `AtomicU8::compare_exchange` method returns `Result<u8, u8>`; we should do the same here. Some algorithms can benefit from it and eliminate a subsequent load.
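For illustration, the std pattern this mirrors (the function and its semantics are invented):

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// Try to transition state 0 -> 1. Whether we win or lose the race, the
// Result<u8, u8> from compare_exchange already carries the current value,
// so no follow-up load is needed.
fn try_claim(state: &AtomicU8) -> (bool, u8) {
    match state.compare_exchange(0, 1, Ordering::SeqCst, Ordering::SeqCst) {
        Ok(old) => (true, old),   // succeeded: old was the expected 0
        Err(old) => (false, old), // failed: old is the value that beat us
    }
}
```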
I will change this to `Result<T, T>`. For its use sites, I will leave them as `compare_exchange().is_ok()`.
@wks My plan is to fix whatever concerns the implementation of the metadata. The usage of the metadata, and any possible optimizations there, I will leave to another PR.
fetch_and/or for bits.
Use fetch_and/or to implement fetch_and/or on bits.
This pull request is ready for another review. @wks
LGTM
This PR refactors metadata and makes all its access associated with a num type. It also greatly reduces code duplication in the metadata module, and adds tests for both header metadata and side metadata access.

Metadata in the following context refers to both side metadata and header metadata.

- Metadata access is provided by `HeaderMetadata`/`SideMetadata` as methods.
- A `MetadataValue` trait is introduced so we can handle `u8/16/32/64/size` uniformly.
- Fixed `u64` metadata on 32 bits. It should work properly now.
- Metadata access methods in `ObjectModel` now have a default implementation that uses our header metadata module.

Related PRs: