Prevent futex_wait from actually waiting if a concurrent waker was executed before us #2228
Conversation
```rust
// Together with the SeqCst fence in futex_wait, this makes sure that futex_wait
// will see the latest value on addr which could be changed by our caller
// before doing the syscall.
this.atomic_fence(&[], AtomicFenceOp::SeqCst)?;
```
So the issue is that previously, `futex_wake` didn't actually do anything on `addr`, so it was not ordered wrt other operations on `addr` -- and this extra fence here inserts it into the SeqCst order?
The futex wake syscall is not supposed to do anything on `addr`. The API user is supposed to change its value before invoking the wake syscall: when releasing a lock, a thread has to first reset the lock state to not-acquired and only then execute the futex wake operation.
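For illustration, a minimal sketch of that release pattern on Linux might look as follows (this is not code from the PR; the `futex_wake_one` helper is a hypothetical name, and the sketch assumes the `libc` crate, as the repo's own futex test does):

```rust
use std::sync::atomic::{AtomicI32, Ordering};

const FREE: i32 = 0;

static LOCK: AtomicI32 = AtomicI32::new(FREE);

// Hypothetical wrapper around the raw futex syscall (Linux only).
fn futex_wake_one(addr: &AtomicI32) {
    unsafe {
        libc::syscall(
            libc::SYS_futex,
            addr as *const AtomicI32,
            libc::FUTEX_WAKE,
            1,
        );
    }
}

fn unlock() {
    // 1. The user program changes the value at addr first...
    LOCK.store(FREE, Ordering::SeqCst);
    // 2. ...and only then invokes the wake syscall. The wake syscall
    //    itself never touches the value stored at addr.
    futex_wake_one(&LOCK);
}

fn main() {
    LOCK.store(1, Ordering::SeqCst); // pretend the lock is held
    unlock();
}
```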
Okay, so what is the correctness argument here for a racing wait and wake? We have two SC fences, so the thing we are allowed to do is a case distinction on which fence comes first.
- In case the wake fence goes first, the wait fence will establish a happens-before from that fence, so the read in wait will definitely pick up all writes before the wake. Therefore it will not sleep and we are good.
- In case the wait fence goes first... then what? Since all of wait is atomic and cannot be preempted, we know the waiting thread will be in the futex wait set, so the waking thread will see and wake it. But this relies on all of wait (fence, read, add to wait set) being non-preemptible.
Yes, your reasoning is right for both cases, except that the fences don't actually establish happens-before; it's a special property unique to SC fences (one that cannot be replicated with acquire/release fences):
```
T1:                            T2:
na = 10
M.store(42, Ordering.any)  // A
fence(sc)                  // X
                               fence(sc)             // Y
                               M.load(Ordering.any)  // B
                               r = na  // still a data race, does not happen after na = 10
```
The load will read 42 as long as T2's fence and load are interleaved after T1's fence, but it does not establish HB with anything, though this doesn't matter for our case.
> For atomic operations A and B on an atomic object M, where A modifies M and B takes its value, if there are memory_order_seq_cst fences X and Y such that A is sequenced before X, Y is sequenced before B, and X precedes Y in S, then B observes either the effects of A or a later modification of M in its modification order.
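A runnable Rust version of that litmus test might look like the sketch below (for illustration only; the non-atomic `na`/`r` part is omitted because it would be a data race, i.e. UB, in real Rust):

```rust
use std::sync::atomic::{fence, AtomicI32, Ordering};
use std::thread;

static M: AtomicI32 = AtomicI32::new(0);

fn main() {
    let t1 = thread::spawn(|| {
        M.store(42, Ordering::Relaxed); // A
        fence(Ordering::SeqCst); // X
    });
    let t2 = thread::spawn(|| {
        fence(Ordering::SeqCst); // Y
        // B: if X precedes Y in the total SeqCst order S, this load is
        // guaranteed to observe A (or a later modification of M), even
        // though no happens-before edge between the threads is created.
        let b = M.load(Ordering::Relaxed);
        println!("B read {b}");
    });
    t1.join().unwrap();
    t2.join().unwrap();
}
```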
Oh wow, that is wild. So it's a "definitely visible but not happens-before" edge? I have no idea how that makes any sense but it's good enough for our purposes here.
Can you please add a comment with this correctness argument to the code?
Also, just to be sure I understand -- earlier you said "It doesn't matter if there is a preemption between the fence and read", but in my correctness argument we crucially use that fence+read are atomic. So it does matter, after all?
You are right, the fence+read also have to be atomic. I've added the correctness argument as comments.
```diff
 let futex_val = this
     .read_scalar_at_offset_atomic(
         &addr.into(),
         0,
         this.machine.layouts.i32,
-        AtomicReadOp::SeqCst,
+        AtomicReadOp::Relaxed,
     )?
     .to_i32()?;
```
I don't think I understand why SeqCst fence + relaxed access is more correct than SeqCst access. Both should get inserted into the total SeqCst order, no?
OTOH, I seem to remember my weak memory colleagues saying that when mixing multiple orderings on one location, SeqCst accesses become ill-behaved very quickly, and only SeqCst fences still make sense.
Also, if this relies on the fact that the fence+load happen atomically and cannot be preempted, please add a comment explaining that.
We do not care about the synchronisation effect that accessing `addr` has on other memory locations; that is handled by the `validate_lock_acquire` call in `this.futex_wait()`. Futex API implementations are not allowed to write to `addr` since it's "owned" by the user program, so I doubt it's possible to create synchronisation using `addr` anyway.

The sole location we want the fence to affect is `addr` itself, therefore the ordering for the read can be anything (though it still has to be atomic, because the fences don't actually create an HB edge; it's just the special property of SC fences guaranteeing what the next atomic read can see).

Of course, API users can do extra accesses on `addr` and expect some ordering guarantees, but having an extra relaxed read here should not weaken them; otherwise that would be very surprising behaviour and a bug in the memory order.
It doesn't matter if there is a preemption between the fence and read. We must not preempt between the read and comparison.
> Futex API implementations are not allowed to write to `addr` since it's "owned" by the user program, so I doubt it's possible to create synchronisation using `addr` anyway.
That is a good argument.
> It doesn't matter if there is a preemption between the fence and read. We must not preempt between the read and comparison.
The comparison is completely thread-local, so I don't understand this comment. Or do you mean that it is important that the thread be put to sleep atomically with the read?
> Or do you mean that it is important that the thread be put to sleep atomically with the read?
Yes, otherwise there's a TOCTOU bug.
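Concretely, the interleaving that the atomicity prevents is the classic lost wakeup (an illustrative trace, not code from the PR):

```text
waiter (futex_wait)                        waker (futex_wake)
read addr            -> sees HELD
                                           caller stores FREE to addr
                                           wake: wait set is empty, nothing to do
compare: value == expected, go to sleep
-- the single wakeup already happened; the waiter sleeps forever --
```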
```rust
const HELD: i32 = 1;

fn concurrent_wait_wake() {
    static FUTEX: AtomicI32 = AtomicI32::new(0);

    for _ in 0..100 {
```
100 times is a lot.
What about reducing this (to make the test not take quite as long), and bumping up the preemption rate for this test?
FWIW, the test currently already fails in the 3rd iteration.
Is this even about the preemption rate? If it is about picking 1 out of 2 entries in the store buffer, then 20 rounds make the probability of accidentally passing the test 2^-20, which seems easily good enough.
We do need to get the wake to run before the wait to trigger this bug, but yeah 20 iterations should be enough
> We do need to get the wake to run before the wait
So preemption is actually harmful? Should we set the preemption rate to 0?
I think leaving it as-is is fine. Having preemption also demonstrates that the critical sections inside futex syscalls are happening atomically.
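For reference, the shape of the test under discussion is roughly the following (a sketch, not the exact code from the PR; it assumes the `libc` crate and direct futex syscalls, as the test file does):

```rust
use std::ptr;
use std::sync::atomic::{AtomicI32, Ordering};
use std::thread;

const FREE: i32 = 0;
const HELD: i32 = 1;

fn concurrent_wait_wake() {
    static FUTEX: AtomicI32 = AtomicI32::new(0);

    for _ in 0..20 {
        FUTEX.store(HELD, Ordering::Relaxed);
        let waker = thread::spawn(|| {
            // Change the value first, then wake, per the futex protocol.
            FUTEX.store(FREE, Ordering::Relaxed);
            unsafe {
                libc::syscall(
                    libc::SYS_futex,
                    &FUTEX as *const AtomicI32,
                    libc::FUTEX_WAKE,
                    1,
                );
            }
        });
        // If the waker has already stored FREE, this wait must see that
        // value and return immediately (EAGAIN); sleeping here would be
        // forever, since the single wake has already happened.
        unsafe {
            libc::syscall(
                libc::SYS_futex,
                &FUTEX as *const AtomicI32,
                libc::FUTEX_WAIT,
                HELD,
                ptr::null::<libc::timespec>(),
            );
        }
        waker.join().unwrap();
    }
}

fn main() {
    concurrent_wait_wake();
}
```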
Thanks for fixing this nasty issue!
📌 Commit 737a5b3 has been approved by
☀️ Test successful - checks-actions
Fixes #2223
Two SC fences were placed in `futex_wake` (after the caller has changed `addr`) and in `futex_wait` (before we read `addr`). This guarantees that `futex_wait` sees the value written to `addr` before the last `futex_wake` call, should one exist, and avoids going to sleep with no one else to wake us up.

(See miri/src/concurrency/weak_memory.rs, lines 324 to 326 in ada7b72.)
Earlier I proposed using `fetch_add(0)` to read the latest value in the modification order, though this isn't the proper way to do it and it breaks aliasing: the syscall caller may pass in a `*const` derived from a `&`, and Miri complains about a write to a `SharedReadOnly` location, causing this test to fail.

(See miri/tests/pass/concurrency/linux-futex.rs, lines 56 to 68 in ada7b72.)
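A minimal standalone sketch of that aliasing problem (an illustration of the general issue, not code from this PR; `AtomicI32::from_ptr` stands in for the syscall implementation treating `addr` atomically):

```rust
use std::sync::atomic::{AtomicI32, Ordering};

fn main() {
    let x: i32 = 0;
    // The caller passes a *const derived from a shared reference, so the
    // pointee is tagged SharedReadOnly under Stacked Borrows.
    let addr: *const i32 = &x;

    // fetch_add(0) is a read-modify-write, i.e. a *write* access: Miri
    // reports undefined behaviour for writing through this pointer, even
    // though the stored value never changes.
    let atomic = unsafe { AtomicI32::from_ptr(addr.cast_mut()) };
    atomic.fetch_add(0, Ordering::SeqCst);
}
```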