-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Port kleisauke's pthreads+printf fix #13837
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -24,12 +25,28 @@ EM_JS(void, initPthreadsJS, (void), { | |||
PThread.initRuntime(); | |||
}) | |||
|
|||
static void init_file_lock(FILE *f) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment along that lines of "This code is copied from musl/src/thread/pthread_create.c. Keep in sync"
Same below where init_file_lock is called.
#include <stdlib.h> | ||
#include "../../system/lib/libc/musl/include/features.h" | ||
#include "../../system/lib/libc/musl/src/internal/stdio_impl.h" | ||
#include "../../system/lib/libc/musl/src/internal/libc.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC including these musl-internal headers from code other than musl-internal code is not supported an no longer works with the musl upgrade I'm looking at landing soon.
Any way we can avoid this? I guess not.. I can look into a workaround when I next loop back around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be avoided by removing those printf
statements at the beginning of main()
, not sure if those are useful.
Another testcase that appears to do this is tests/pthread/test_pthread_locale.c
:
#include "../../system/lib/libc/musl/src/internal/pthread_impl.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Test suite probably fails due to a possible deadlock in test_pthread_exit_process
, which can be avoided by including commit ffe2696.
printf("stdin lock: %d\n", stderr->lock); | ||
printf("stdout lock: %d\n", stdin->lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
printf("stdin lock: %d\n", stderr->lock); | |
printf("stdout lock: %d\n", stdin->lock); | |
printf("stdin lock: %d\n", stdin->lock); | |
printf("stdout lock: %d\n", stdout->lock); |
Reading ffe2696 , I'm not sure I understand it. Why does it say (It's only the syscalls that are proxied to the main thread, and so the main thread is special there. But If that commit prevents a deadlock, then that seems to confirm that another thread is holding the lock, and that there is a possible race? |
Ah yes, you're right. I probably got confused, with that I meant to check if the current "process" is single-threaded (i.e. when all threads have exited), similar to this musl check: The Details
Interpreting that, it looks like the file lock is still held by the proxied thread when the main thread is shutdown. I could workaround this by inheriting the |
Interesting, thanks @kleisauke - looks like what's happening in your traces is that I don't think we can share the tid between the main runtime thread and the proxied main. The proxied main is just a normal pthread, and it and the main runtime thread need to lock if they access the same things.
A deadlock seems to indicate more than one lock being involved, then. Like one thread holds a lock to stdout and one to stderr, or some other lock. Does the new code that is enabled by |
Indeed, sharing the tid between the main thread and the proxied main is not a good solution. I could resolve the above mentioned test hang with commit 8c1fd43. This is probably because the proxied main thread is shutdown while still holding the lock, causing the Cherry-picking that commit on this PR doesn't resolve the test hang in |
Thanks @kleisauke ! What do we think of this PR, given that information? Is there still value in cherry-picking those changes before the full musl upgrade? That is, originally this PR was meant to be a small forward port, but since we need to forward port multiple things, this feels less and less clean and more and more risky to me... |
I just opened PR #14481 to resolve the test hang. Indeed, it feels a bit risky to cherry-pick those commits, so it's probably better to wait for the full musl upgrade. |
Thanks @kleisauke , that makes sense. Closing this then. |
This is second attempt at landing a version of d5d5f69 The first time we tried it #13837 we ran into issues with test_pthread_exit_process deadlocking which I tracked down to an issue with `system/lib/libc/musl/src/thread/__wait.c` where it was blocking the main thread forever rather then looping and calling `emscripten_main_thread_process_queued_calls`. Fixes: #13194
This is second attempt at landing a version of d5d5f69 The first time we tried it #13837 we ran into issues with test_pthread_exit_process deadlocking which I tracked down to an issue with `system/lib/libc/musl/src/thread/__wait.c` where it was blocking the main thread forever rather then looping and calling `emscripten_main_thread_process_queued_calls`. Includes a version of #14481 so that should land before this does. Fixes: #13194
I tracked down the deadlock to an issues with |
Great catch! I think that was probably the reason why musl commit |
This is second attempt at landing a version of d5d5f69 The first time we tried it #13837 we ran into issues with test_pthread_exit_process deadlocking which I tracked down to an issue with `system/lib/libc/musl/src/thread/__wait.c` where it was blocking the main thread forever rather then looping and calling `emscripten_main_thread_process_queued_calls`. Includes a version of #14481 so that should land before this does. Fixes: #13194
This is second attempt at landing a version of d5d5f69 The first time we tried it #13837 we ran into issues with test_pthread_exit_process deadlocking which I tracked down to an issue with `system/lib/libc/musl/src/thread/__wait.c` where it was blocking the main thread forever rather then looping and calling `emscripten_main_thread_process_queued_calls`. Includes a version of #14481 so that should land before this does. Fixes: #13194
This is second attempt at landing a version of d5d5f69 The first time we tried it #13837 we ran into issues with test_pthread_exit_process deadlocking which I tracked down to an issue with `system/lib/libc/musl/src/thread/__wait.c` where it was blocking the main thread forever rather then looping and calling `emscripten_main_thread_process_queued_calls`. Includes a version of #14481 so that should land before this does. Fixes: #13194
See d5d5f69 and #13194 (comment)
The key fixes are to initialize multithreading in musl properly, and to
also build
libc_rt_wasm
with pthreads support (for the latter, we couldalso move the locking code out of that library and into libc, but this
is safer).
Test is verified as failing before, and passing with this change, both
in the musl internals printed and also with high probability the test
shows interleaved output without the fix, but never with it, in my
testing.
Fixes #13194