-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Data race / use after free reported by Miri #76
Comments
Alright. Seems like it's time to go over the proof again, trying to find the missing edge :-|. |
Just letting everyone know. This one is probably going to take a while, and I'm getting around to it in spare half an hour now and then. I hope to find out what's happening eventually. |
No result yet, but some preliminary findings if someone wants to have a look too, or for me to have something to go back to if I forget.
The feeling is that it is related to:
Miri seems to suggest there is a way for the increment either not happen or to get past taking the debt out or something like that. That part needs further investigation. |
It might be worth checking if the issue reproduces with The key is to run Miri with many different seeds, via something like
|
|
Here are the outdated atomic loads running up to the UAF. There are only 3. (I'm unconditionally cutting off the top two stack frames to show the actual
|
In that diagnostics trace there were two outdated reads getting the same value in this function arc-swap/src/strategy/hybrid.rs Lines 42 to 61 in f7f192d
I'm completely unfamiliar with |
Hello. Sorry for letting this one aside for a while, no free time lately :-|. Just came here to answer, I'll try to prioritize this as much as possible. I'll have to go over the internals myself, over all the proofs there and such and find if there's an error there. Nevertheless, big thanks for the help, this really narrows the space where to look for and what to verify/fix. Maybe I did something wrong with the previous experiments, but I've tried to replace all orderings with SeqCst which didn't seem to help. I'll need to find out why :-O. |
I think I'm starting to understand the problem (not the solution yet; really going over all the orderings in the library and changing them to The library is inspired by hazard pointers. If there's a pointer that shouldn't be done bad things to, such pointer is stored somewhere (in a slot) and other threads can check if their pointer is there. I call it a debt (because the reference count on that Arc is smaller than it should be). This is what's happening in the usual case when there are no collisions, corner cases, etc: A writer thread does these operations:
A reader thread:
Because both the swap and the store of the debt are I suspect no today HW neither code-gen can make the distinction between release->acquire on the same variable and on two different variables, it simply makes sure everything is written out to main memory on any release and read from memory on acquire ‒ tracking the individual variables would be impractical. That would explain why I have never seen the problem manifest even on weak platforms like multi-core ARM. Not that it would be an excuse to have an UB 😇. So, can someone confirm my understanding of the problem makes sense and this could be the cause? Does miri make this kind of tracking of which variable the Release & Acquire happens on? If so, my plan would be:
|
Thanks for the explanation. The summary of writer and reader actions are very helpful to understand the core things at play here, although I'm still not entirely certain where the bug is.
If (and only if) the acquiring load reads the value the releasing store has written, then yes it creates a synchronisation edge between them.
It does, although SeqCst does not provide any extra synchronisation opportunities than Acquire/Released does. There are two things at play: synchronisation ("happens-before" edge) and modification order ("reads-from" edge). A SeqCst load guarantees that you will always see the latest SeqCst store (should one exists) and nothing earlier. Since SeqCst also provides Acquire/Release semantics, the load creates a synchronisation edge with the earlier store, but not with any other thread, even if the same (or different) variable has been modified in that other thread before the last SeqCst store.
(sorry for the terrible alignment) Whereas if T3's read of x is Acquire (and the two writes remain SeqCst), then it could read 1 or 2, and creates a synchronisation edge with T1 or T2 respectively, but again, not both. There is no way to create a synchronisation edge with two threads simultaneously (edit: with accesses. Fences can create multiple edges). In the formalisations of the memory model, all relationships are binary.
Yes, although a synchronisation edge affects everything before the releasing action in the releasing thread, and everything after the acquiring action in the acquiring thread. There is no synchronisation edge that only provides guarantees on a single location.
You cannot create a release->acquire relationship across two different locations (whom did the load acquire a value from?). But as I mentioned above, if you do have a synchronisation edge from a paired up release->acquire on one memory location, all actions before the release in one thread happen before all actions after the acquire in the other thread. |
This isn't directly relevant to the issue so I won't be long-winded about it, but this assumption is not correct. Microarchitectures in extremely common use today absolutely make this distinction. Atomic acquire and release are not typically implemented by flushing caches all the way to main memory; that would make atomic operations outrageously slow. Actual implementations are beyond the scope of this comment, but suffice it to say that they rely on communication and coordination between the caches associated with each core, with granularity much smaller than "all of memory". (Typically, a single cache line.) With this in mind, to reemphasize a point from the previous comment: Release/Acquire only does something when the Acquire reads the value written by the Release. That is, it's not even enough that they happen to be on the same variable; they have to agree on a particular write of that variable. SeqCst often doesn't help because it is subject to the same limitation. |
And on x86 (or other well-ordered architectures) that means all memory operations would be outrageously slow. You'd need probably 112 more registers to make that usable to the extent its used today. |
Thank you, you seem to have confirmed my suspicion even though I apparently didn't explain very well what I was mistaken about. I'll try again just for the record, but I think I understand what goes on in there right now.
Yesterday reading through the spec and your answer confirm that such understanding was wrong. I've wrongly counted on a synchronization edge from Sorry for the oversimplification with the flushing to memory. I know it's much more complex, I meant it as kind of conceptual flush of all changes so they are visible + conceptual checking for changes from others. That is, I was trying to figure out why the bug didn't manifest when stress-testing on ARM even though ARM is a weak platform. What I suspect is that ARM is still a bit stronger platform than the memory model and it in fact does provide the above edge. I think so because otherwise the HW would somehow have to make distinction between two So, now I get back to the paper and figure out how do I get the missing edge in there. |
... TODO: Describe rationale ... ... TODO: Is that all that's needed? ... Maybe a solution for #76.
... TODO: Describe rationale ... ... TODO: Is that all that's needed? ... Maybe a solution for #76.
... TODO: Describe rationale ... ... TODO: Is that all that's needed? ... Maybe a solution for #76.
I've released a fix. That is, this definitely fixes the issues reported by miri. I still will want to go over the proofs few more times and check if I made the wrong assumption in some more places. So I'll keep this open for now. |
Add missed trophy vorner/arc-swap#76 appears to have been missed
Add missed trophy vorner/arc-swap#76 appears to have been missed
I have seen #71, but Miri reports the same issue. In Miri, it exhibits itself through a UAF, though I assume the cause to be the same. Sometimes, Miri reports this as a data race, but I was only able to reproduce the UAF.
Log of the data race: https://miri.saethlin.dev/no-sb/ub?crate=arc-swap&version=1.5.0
To reproduce the UAF:
MIRIFLAGS='-Zmiri-disable-stacked-borrows' cargo miri test load_parallel
Stacked borrows has to be disabled because this crate contains stacked borrows violations.
Miri backtrace
If this is a false positive, you should open an issue on rust-lang/miri.
The text was updated successfully, but these errors were encountered: