-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Use pthread_join from musl #15604
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use pthread_join from musl #15604
Conversation
Note: this commit is no longer needed once PR emscripten-core#15604 lands.
Note: this commit is no longer needed once PR emscripten-core#15604 lands.
4a99292
to
41d1373
Compare
hidden long __cancel(), __syscall_cp_asm(), __syscall_cp_c(); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this patch should be needed .. declaring extra stuff is petty harmless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, GitHub does not allow me to stack PRs when I use my own fork to base my branch on (instead of pushing the branches to the main repository). Hence, this PR is currently easiest to review per commit.
#ifdef __EMSCRIPTEN__ | ||
// Attempt to detach a thread which does not point to a valid thread, or | ||
// does not exist anymore. | ||
if (!t || t->self != t) return ESRCH; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the test for NULL here needed somewhere? Most other pthread function don't have this kind of check IIRC.
If it is needed perhaps this can be its own change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for this change was that the previous JS code did check for this, see e.g:
emscripten/src/library_pthread.js
Lines 906 to 909 in 85d58ff
if (!thread) { | |
err('pthread_detach attempted on a null thread pointer!'); | |
return {{{ cDefine('ESRCH') }}}; | |
} |
I removed these NULL checks with commit 80a3e11 as it does indeed looks redundant.
@@ -187,10 +187,6 @@ int __pthread_cond_timedwait(pthread_cond_t *restrict c, pthread_mutex_t *restri | |||
__pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, 0); | |||
} | |||
|
|||
#ifdef __EMSCRIPTEN__ | |||
pthread_testcancel(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed, or just generally part of a cleanup? If it not needed as part of this change can you split it out into its own change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #15603 (comment) for the reason for this change, I just moved this to PR #15657.
41d1373
to
80a3e11
Compare
80a3e11
to
64257a6
Compare
6872d66
to
fc636df
Compare
fc636df
to
8cf9c8b
Compare
I wonder if we could split the testcancel change into its own PR? It seems unrelated to join, and a lot more straight forward, unless I'm missing something? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I'm excited to see this land, but maybe we can split it one more time
return ESRCH; | ||
#ifdef __EMSCRIPTEN__ | ||
// Attempt to detach a thread which does not point to a valid thread, or | ||
// does not exist anymore. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not keep the old comment.. its seems more useful in that it says why we are doing it. Is it not true?
The new comment does say why emscripten needs this patch..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for the doc comment change was that the "(already joined)"-sentence wasn't quite right. The check is there only for invalid or non-existent threads (AFAIK).
I reverted it partially with commit 1e1761b, so that it indicates why the patch is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh BTW, sometimes there are 2 spaces after a sentence, I've seen that a few times now in comments. Do I have to follow that convention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I believe the check for validity is to protect specifically again detaching already joined thread (which the testsuite does IIRC). So its probably worth continuing to mention why we might care specifically about a validity here, even though its undefined behaviour and we technically should not need to care.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I understand now. Commit 7806443 brings the comment to its original state. I was misinterpreting that comment as "already-detached", which is checked below.
emscripten/system/lib/libc/musl/src/thread/pthread_detach.c
Lines 11 to 12 in 1e1761b
/* If the cas fails, detach state is either already-detached | |
* or exiting/exited, and pthread_join will trap or cleanup. */ |
Already joined threads are just threads that are invalid (non-existent).
tests/pthread/main_thread_join.cpp
Outdated
#else | ||
// Delay a bit to force the main thread to actually block, as exited | ||
// threads can be joined without blocking. | ||
emscripten_thread_sleep(1000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change actually needed? What happens if you don't make this change? Does the test fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I could remember, the test did indeed fail, because exited threads can be joined without any blocking now. Let me re-check that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we should be calling emscripten_check_blocking_allowed
even when a thread can be joined without blocking. The idea is to warn folks, even in the cases when they got lucky with the timing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i.e. the warning/error should still show even when joining already exited threads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted that with commit 44776e8. PR description has been updated accordingly.
Ah, it looks like it can independently land now. I reverted the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Thanks for your patience on this one. I'm excited to see this land.
// does not exist anymore. | ||
if (t->self != t) return ESRCH; | ||
// Thread is attempting to join to itself. Already detached threads are | ||
// handled below by returning EINVAL instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe mention that these extra checks are because we want to pass the posixtest suite? IIRC both these cases count as undefined behaviour so are not strictly necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean the extra t->detach_state != DT_DETACHED
check? AFAIK, that was needed for our test_main_pthread_join_detach
and test_proxy_pthread_join_detach
tests, and not needed for the posix tests.
I could also remove this check and update that test exception by partially reverting commit ca90a4b, if you want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets leave that as a possible followup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can confirm that the t->detach_state != DT_DETACHED
check is only needed for the other.test_{proxy,main}_pthread_join_detach
tests, see for e.g. work-in-progress commit 493f2e5 that updates its test asserts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to block is this PR on that change or should we co ahead and land with a TODO here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a TODO with commit 5e65f91, since removing this check would also mean that it causes an EDEADLK
error when already-detached threads are attempting to detach itself (as pthread_detach
re-uses code from pthread_join
).
898c613
to
5eade80
Compare
I opened issue #16136 to investigate the flakiness of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the change ready to go?
Regarding the flaky test.. does this change make it more flaky?
tests/pthread/main_thread_join.cpp
Outdated
@@ -33,7 +33,7 @@ void *ThreadMain(void *arg) { | |||
// succeeding. | |||
while (tries.load() < EXPECTED_TRIES) {} | |||
#endif | |||
pthread_exit((void*)0); | |||
pthread_exit((void*)0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert this to keep this PR pure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weak_alias(__pthread_detach, emscripten_builtin_pthread_detach); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert this file to keep this PR pure. Perhaps we can discuss/followup about using ifdef consistently in musl patches?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5e65f91, I'll prepare a PR that unifies all these ifdef
's in musl.
// does not exist anymore. | ||
if (t->self != t) return ESRCH; | ||
// Thread is attempting to join to itself. Already detached threads are | ||
// handled below by returning EINVAL instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to block is this PR on that change or should we co ahead and land with a TODO here?
I think it's ready to land.
I'm not sure about this, I didn't fully investigate that flakiness. All tests succeeded prior to the rebase (see commit 898c613), let's see if it passes now. |
This broke the emscripten-releases roller as well as recent test runs on emscripten CI. I have no idea how it actually passed in #15604. That is rather worrying.
Split from PR #13007.