-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ThreadId for comparing threads #36341
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @aturon (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
if THREAD_ID.0 == 0 { | ||
THREAD_ID.0 = 1 + THREAD_ID_COUNT.fetch_add(1, Ordering::SeqCst); | ||
} | ||
THREAD_ID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation allows for two distinct threads to have a same ID in some cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What case is that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least
let main_thread_id = ThreadId::current();
loop {
let thrad2 = thread::spawn(|| { if ThreadId::current() == main_thread_id { println!("it happened") } });
}
To explain it, consider a usize = u16
system. Main thread gets ThreadId(1)
, then the threads in the loop receive IDs ThreadId(2)
...ThreadId(!0)
. The next thread will wrap the counter assigning ThreadId(1)
to the thread, making the still-running main thread and new thread to have colliding IDs. For larger usize systems, the issue will only take longer to manifest itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the only possibility for collision that I thought of. Would it be better to use a u64? If the number of threads you are creating is overflowing a 64 bits (highly unlikely, IMO), then there's not much we can really do, is there? What possible representation could we even use for thread IDs in that case? I can't imagine an OS being able to even handle up to u64-1 threads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that Java is something I look up to, but here's how Java implements unique thread IDs:
private static synchronized long nextThreadID() {
return ++threadSeqNumber;
}
My gut intuition (without doing any calculating) is that It would take thousands of years of running for a loop like that spawning threads to overflow 64 bits. Highly unlikely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the number of threads you are creating is overflowing a 64 bits (highly unlikely, IMO), then there's not much we can really do, is there? … I can't imagine an OS being able to even handle up to u64-1 threads.
You’re assuming that all the threads should be alive at the same time in order for the collision to happen, which is simply not true.
Would it be better to use a u64?
Yes it would. At least then it wouldn’t take a usize = u32
target less than a week of spawning threads continuously to arrive at colliding the thread IDs.
What possible representation could we even use for thread IDs in that case?
Using the address of something unique to a thread in the address space common to all threads would be one way to get guarantee that thread IDs would not collide. (see #29448 (comment) though)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A similar issue exists for reference counted pointers. As pointed out in that discussion, you don't actually need to run a googol increment instructions to overflow, if LLVM optimizes the relevant loop (this is certainly possible for Rc
and Arc
, I'm not so sure about thread creation, that probably is an observable effect in practice). However, memory safety isn't on the line here, so it's probably okay to slap a footnote on the docs and ignore it. (Using u64
might be better for robustness but you sadly can't use atomics for that.)
The approach in #29447 seems robust against collisions. Would it be ok to resubmit that branch? |
Thanks for the PR @sagebind! On the technical side of things, I agree with @nagisa that we'll want to handle the overflowing case here somehow if we take this strategy, especially for An alternative, implementation, however, would be to back this implementation from The only caveat I know of with this implementation, however, is that we'd have to clearly document that equality does not mean the same thread if one of the threads has exited. That is, if two thread ids are equal, they're probably only actually equal if the thread is currently running. I suspect that both Windows and Unix would recycle thread ids after they exit. |
@alexcrichton That was the primary reason I thought about avoiding using the native function calls for thread IDs, since the IDs are not guaranteed to be unique during an entire program lifetime; at least that is the case on Windows:
I think the bigger problem is
We could use If we want to use the approach in this PR, is it possible to use |
This seems like a desirable restriction though, no? If we could never recycle thread ids then that seems like hard limit to the lifetime of all rust programs in terms of how many threads they can ever produce.
Yeah I was under the impression we'd use struct ThreadId(libc::pthread_t);
impl PartialEq for ThreadId { ... }
That's a possible strategy, yeah, although I don't think
Awesome! And of course thanks again, we're willing to help out wherever necessary! |
Discussed at libs triage today the decision was that these seem like good methods to stabilize, with the strategy of using the OS primitives and perhaps exposing them later as well (not necessary as part of this PR). We were also thinking, @sagebind could you add an |
Using operating system thread primitives may be problematic for other reasons In practice it may not be that bad, given that usual implementation of |
#[unstable(feature = "thread_id", issue = "21507")] | ||
pub fn current() -> ThreadId { | ||
static THREAD_ID_COUNT: AtomicUsize = ATOMIC_USIZE_INIT; | ||
#[thread_local] static mut THREAD_ID: ThreadId = ThreadId(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thread_local!
?
isn't #[thread_local]
unsupported on some platforms?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding was that the thread_local!
macro was implemented using #[thread_local]
, but that may not be the case.
Since the value is atomic already, the complicated mutexes, keys, and locking provided by thread_local!
I felt were unnecessary overhead.
I'm still working on this PR and how to solve it. Using the system thread IDs seems more robust, since they are managed by the OS and can be re-used forever, but @tmiasko has a good point about lifetimes. Using a The struct issue mentioned earlier is not a problem, since interestingly enough
Windows is much easier (for Vista+); |
On MacOS X it looks like struct _opaque_pthread_t {
long __sig;
struct __darwin_pthread_handler_rec *__cleanup_stack;
char __opaque[__PTHREAD_SIZE__];
};
typedef struct _opaque_pthread_t *__darwin_pthread_t; (though you could interpret the pointer as a |
@ranma42 Good catch, on the Rust side pub type pthread_t = ::uintptr_t; A pointer type, so definitely issues if we call |
On further experimentation, I don't think we can use the native thread ID facilities and have the flexibility that we want with a Suppose we define pub struct ThreadId {
thread: libc::pthread_t,
}
impl PartialEq for ThreadId {
fn eq(&self, other: &ThreadId) -> bool {
unsafe {
libc::pthread_equal(self.thread, other.thread) != 0
}
}
} We want two ways to acquire a pub fn current_id() -> ThreadId {
ThreadId {
thread: libc::pthread_self(),
}
} As long as we make But now we have a problem with the lifetime of let join_handle = thread::spawn(|| {
thread::current()
});
let dead_thread = join_handle.join().unwrap();
println!("Thread name: {:?}", dead_thread.name());
assert!(thread::current().id() != dead_thread.id()); What should this code do? If we simply use the So I am currently of the opinion that we should avoid using the system thread IDs for this. By the time we change our resource lifetimes for the I attempted to study the source for other languages to see how they solve this problem, and I generally saw two approaches:
C# actually provides both ID types, and I think that might be the strategy we may want to take (in separate PRs). Provide an opaque ID for comparison's sake in this PR, and provide a way to return the Now to solve the integer overflow problem, I'm thinking the most efficient way that doesn't leak is with atomics, though I'm not a veteran in Rust's atomics. Is it reasonable to assume that most platforms have support for Or maybe a spinlock? |
Is it really undefined behavior to call |
Though, I am yet to see an implementation that is different from following one: int pthread_equal(pthread_t t1, pthread_t t2) {
return (t1 == t2);
} Edit: Another data point, implementation for Python just casts thread IDs as integers /* XXX This implementation is considered (to quote Tim Peters) "inherently
hosed" because:
- It does not guarantee the promise that a non-zero integer is returned.
- The cast to long is inherently unsafe.
- It is not clear that the 'volatile' (for AIX?) are any longer necessary.
*/
long
PyThread_get_thread_ident(void)
{
volatile pthread_t threadid;
if (!initialized)
PyThread_init_thread();
threadid = pthread_self();
return (long) threadid;
} |
FreeBSD at least defines // freebsd/sys/sys/_pthreadtypes.h
typedef struct pthread *pthread_t; // freebsd/lib/libthr/thread/thr_equal.c
int
_pthread_equal(pthread_t t1, pthread_t t2)
{
/* Compare the two thread pointers: */
return (t1 == t2);
} In theory we would be safe by using |
I've been using the
|
@tmiasko As far as I know the only pthread implementation that doesn't use a pointer for pthread_t is pthread-win32 (https://github.com/GerHobbelt/pthread-win32/blob/master/pthread.h#L575). (Note that nowdays most people use winpthreads instead of pthread-win32, which uses a simple pointer for pthread_t) |
Thanks for the investigation @tmiasko and @sagebind! The Python implementation is interesting as it seems to violate a super strict reading of the spec, but then again the fact that all but one implementation just uses integers makes me think that it's the pragmatic thing to do anyway. Given all that I'm tempted to ignore a pedantic interpretation of the spec (especially given the precedent) and go ahead with an implementation detail of |
Ok, got a chance to talk about this during libs triage today. Our conclusion was that due to these limitations and implementations elsewhere we should just go with a @sagebind would you be up for implementing this? |
@alexcrichton Sounds like a plan. I can implement this in my next iteration and will push it to this PR (I've got like 3 versions now 😀). I've also developed an algorithm for an atomic incrementing |
).is_err() { | ||
// Give up the rest of our thread quantum if another thread is | ||
// using the counter. This is the slow_er_ path. | ||
yield_now(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of a busy loop, could this just use a mutex for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the best way would be to set up a mutex statically. Isn't StaticMutex
gone now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the context of libstd you can reach into sys_common::mutex::Mutex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like more time would be spent in the mutex syscalls than actually using the counter -- doesn't seem worth it. Using a spinlock, at least the normal case of no contention uses no syscalls.
There's no defined time where we could call Mutex::destroy()
, since a thread could be created at any time.
ONCE
uses a similar atomics approach as this if I recall for the same reasons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's ok to in general not call Mutex::destroy
, and otherwise I don't think there's going to be any real impact here on a mutex vs a spinlock. While Once
uses atomics, it doesn't use spinlocks internally.
In general we try to avoid spinlocks wherever possible as there are generally more suitable synchronization primitives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like more time would be spent in the mutex syscalls than actually using the counter
Most of the relevant systems have fast userspace implementations which get used in low-contention conditions. Please use Mutex
or whatever based on Mutex
in order to avoid rolling out your own.
// If we somehow use up all our bits, panic so that we're not | ||
// covering up subtle bugs of IDs being reused. | ||
if COUNTER == ::u64::MAX { | ||
panic!("failed to generate unique thread ID: bitspace exhausted"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This'll want to unlock the lock before the thread panics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I'll fix that in my next commit.
@bors: r+ |
📌 Commit 032bffa has been approved by |
Add ThreadId for comparing threads This adds the capability to store and compare threads with the current calling thread via a new struct, `std::thread::ThreadId`. Addresses the need outlined in issue #21507. This avoids the need to add any special checks to the existing thread structs and does not rely on the system to provide an identifier for a thread, since it seems that this approach is unreliable and undesirable. Instead, this simply uses a lazily-created, thread-local `usize` whose value is copied from a global atomic counter. The code should be simple enough that it should be as much reliable as the `#[thread_local]` attribute it uses (however much that is). `ThreadId`s can be compared directly for equality and have copy semantics. Also see these other attempts: - #29457 - #29448 - #29447 And this in the RFC repo: rust-lang/rfcs#1435
@bors: retry force clean |
1 similar comment
@bors: retry force clean |
This passed on most platforms and then buildbot got stuck, merging manually. |
I'm confused about using a |
How long would you estimate it take to create 19 quintillion threads? |
Put more concretely, even creating one thread every nanosecond, it would take 584 years to overflow the thread ID. |
ah yes, I should have done the math. If you created and destroyed 1000 threads per second it would take 584 million years to overflow the value:
That would be quite the rust program! Or, as you said one per nanosecond would take 584 years |
This could be a problem when we have 10 million core computers that run programs for hundreds of years. I think we can safely say we are far enough away from that. |
Also note that this is just an implementation detail currently, we don't expose the u64 itself. |
Rayon uses a neat trick to get a non-zero |
@alexcrichton the overflow is an implementation detail that would be trivial to change, but the current implementation also guarantees that no id will be re-used ever. Users of the API might end up relying on this even though the documentation guarantees that the id is only unique among running threads. |
@ranma42 I think maybe the docs should change to also guarantee that IDs will not be reused; that's a strong guarantee, and one that makes thread IDs more useful in my opinion. |
I disagree: I think the implementation should allow thread IDs to be reused since this allows a more efficient implementation of thread IDs in the future. This matches the way every other thread ID implementation works (Windows thread IDs, pthread_t with pthread_equals, Java Thread.getId(), etc all do not make this guarantee). |
This adds the capability to store and compare threads with the current calling thread via a new struct,
std::thread::ThreadId
. Addresses the need outlined in issue #21507.This avoids the need to add any special checks to the existing thread structs and does not rely on the system to provide an identifier for a thread, since it seems that this approach is unreliable and undesirable. Instead, this simply uses a lazily-created, thread-local
usize
whose value is copied from a global atomic counter. The code should be simple enough that it should be as much reliable as the#[thread_local]
attribute it uses (however much that is).ThreadId
s can be compared directly for equality and have copy semantics.Also see these other attempts:
And this in the RFC repo: rust-lang/rfcs#1435