Description
Proposed API
- Atomic bit operations:
fetch_and_metadata(metadata_spec, object, val, order)
: atomically and- Maybe also provide
fetch_clear_metadata(m, o, v, order)
as an alias offetch_and(m, o, !v, order)
- Maybe also provide
fetch_or_metadata(metadata_spec, object, val, order)
: atomically or- Maybe also provide
fetch_set_metadata(m, o, v, order)
as a synonym
- Maybe also provide
fetch_xor_metadata(metadata_spec, object, val, order)
: atomically xor
fetch_update_metadata(metadata_spec, object, |old| { compute(old) }, order1, order2)
: Update the update atomically using a closure.
Problem with compare_exchange
It needs a loop, and is somewhat complex when modifying only some bits of a byte.
Compare-exchange is usually used in a loop:
- Load the old value
- Compute the new value
- Use
compare_exchange
to try to update it from the old value to the new value- If we only update some bits of a byte, we need another load here.
- Retry if failed
Some details can be seen in the discussion in #644
Alternatives
On the architecture level,
- Some platforms have instructions that does atomic add/sub/and/or/nor, but never fails (i.e. no need to use it in a loop).
- Some platforms (almost all platforms except x86) use LL-SC as the primitive instead of compare-exchange. On such platforms, compare-exchange itself is usually implemented with LL-SC.
In Rust,
AtomicU{8,16,32,64,size}
providefetch_{add,sub,and,nand,or,xor,min,max}
. (Obviously they are wrappers of LLVM'satomicrmw
instruction.)AtomicU{8,16,32,64,size}
providefetch_update
that takes a call-back function that computes the new value from the old.
Using non-failing fetch_xxx operations for bit operations
There are two advantages of using fetch_and
, fetch_or
and fetch_xor
instead of compare_exchange.
- They never fail, so we don't need to run in a loop.
- Unlike
add
andsub
, those bit operations do not affect unrelated bits, so we don't need to implement them with compare-exchange.
For example, if a metadata is one bit per object, and we want to set the n-th bit of an AtomicU8
, we just call x.fetch_or(1 << n, order)
.
If a metadata is two bits per object, and we want to AND bit n
and bit n+1
with a user-supplied value v
, we do:
assert_eq!(v & !0b11, 0, "v has more than 2 bits");
x.fetch_and(v << n, order);
As long as the input v
has all 0 bits above the lowest two bits, we are sure x.fetch_and
won't affect any other bits.
Using fetch_and
, the compare-exchange in the object remembering barrier
compare_exchange_metadata::<E::VM>(&self.meta, object, 1, 0, None, Ordering::SeqCst, Ordering::SeqCst)
can be re-written as
fetch_and_metadata::<E::VM>(&self.meta, object, 0, Ordering::SeqCst)
and it doesn't need a loop.
Using fetch_update for other atomic read-modify-write operations.
We can design the API similar to Rust's fetch_update
of AtomicU8
types.
One advantage is that the compiler can implement AtomicU8::fetch_update
as an LL-SC loop, without needing a separate CAS instruction or instruction sequence. I have seen the Rust compiler doing this on RISC-V.
The other is that we can omit a load. Because fetch_update
surrounds the closure, it can load the old value before running the closure, and reuse the loaded value after running the closure.
For example, with this API, even if we still use compare-exchange internally, we can have one less load.
fn fetch_update_bits(&x: AtomicU8, bits: u8, shift: u8, update: Fn(u8) -> u8)
loop {
let old_byte = x.load(); // We are going to reuse this.
let mask = (1 << bits) - 1;
let old_interesting_bits = (old_bytes >> shift) & mask;
let new_interesting_bits = update(old_interesting_bits);
assert_eq!(new_interesting_bits & !mask, 0, "new value has two many bits");
let new_byte = (old_byte & !(mask << shift)) | (new_interesting << shift); // Reusing old_byte here.
if x.compare_exchange(old_byte, new_byte) { break; }
}
}
We can also just use AtomicU8.fetch_update
.
fn fetch_update_bits(&x: AtomicU8, bits: u8, shift: u8, update: Fn(u8) -> u8)
x.fetch_update(|old_byte| {
let mask = (1 << bits) - 1;
let old_interesting_bits = (old_bytes >> shift) & mask;
let new_interesting_bits = update(old_interesting_bits);
assert_eq!(new_interesting_bits & !mask, 0, "new value has two many bits");
let new_byte = (old_byte & !(mask << shift)) | (new_interesting << shift); // Not loading again.
Some(new_byte)
}).unwrap();
}
Using this API, the object-remembering barrier can be implemented as
fetch_update_metadata::<E::VM>(&self.meta, object, |_old| { 0 }, Ordering::SeqCst, Ordering::SeqCst)
But fetch_and
should be preferred for better efficiency.