Skip to content

Move __cxa_thread_atexit to native code #14489

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 4, 2021
Merged

Conversation

sbc100
Copy link
Collaborator

@sbc100 sbc100 commented Jun 19, 2021

Use libc++abi's version of __cxa_thread_atexit.

Followup to #14484 and #14464 which both move more of the cleanup
handling for threads and processes onto the native side.

Depends on #15210.

@sbc100 sbc100 force-pushed the cxa_thread_atexit_native branch 2 times, most recently from e338f1f to a6a9edc Compare June 19, 2021 03:23
@sbc100 sbc100 force-pushed the cxa_thread_atexit_native branch from a6a9edc to 11fbd46 Compare June 22, 2021 21:24
@sbc100 sbc100 requested a review from kleisauke June 22, 2021 21:33
@sbc100
Copy link
Collaborator Author

sbc100 commented Jun 22, 2021

Partially split out into #14512

Copy link
Collaborator

@kleisauke kleisauke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comments

}

static pthread_key_t key;
atomic_bool key_created = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be a thread_local instead of an atomic bool.

Suggested change
atomic_bool key_created = false;
static thread_local bool key_created = false;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(this seems to fix posixtest.test_pthread_exit_3_2 locally)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC pthread_key_create should only be called once from a single thread.. so this guard needs to be shared between thread, no?

Also see the comment above for why we specifically can't use TLS variables here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it should be shared across threads or if it should only be initialized once for a single thread. Given that the dtors_alive variable in libcxxabi was also thread local, I thought the latter approach was correct here (but I might be wrong).

Also see the comment above for why we specifically can't use TLS variables here.

Ah yes, I read that comment to fast. Could the C extension __thread help here? AFAIK, that doesn't trigger the run_dtors destructor in libcxxabi and would correspond to the dtors_alive variable.

// True if the destructors are currently scheduled to run on this thread
__thread bool dtors_alive = false;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically we are using pthread_key_create instead of the TLS variables used by emscripten/system/lib/libcxxabi/src/cxa_thread_atexit.cpp.

There are essentially two ways to achieve TLS: You can either use C/C++ __thread builtin, or you can do it at runtime with pthread_key_create. We are using the pthread_key_create method here because we cannot use any or C++ TLS storage here because the space needed by the TLS variables themselves is freed using a __cxa_thead_atexit handler.. even if the TLS is just plain old data.

When one uses the pthread_key_create one needs to make sure that pthread_key_create is only run once. Normally this is done using pthread_once, but that also can't be used in this case because it uses pthread_cleanup_push/pop which itself uses __cxa_thead_atexit. I've tried to explain more in the commands above.

For an example of how pthread_once is normally used to call pthread_key_create see: https://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create.html

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. Thanks for the clarification! fwiw, I tried to rebase this branch locally on origin/main, but it seems that posixtest.test_pthread_exit_3_2 still fails. Do you have any idea what could be causing this?

@sbc100 sbc100 force-pushed the cxa_thread_atexit_native branch from 3c9eb7b to 1bdace4 Compare June 23, 2021 14:21
@sbc100 sbc100 changed the base branch from main to main_thread_pthread_key_dtor June 23, 2021 14:21
@sbc100 sbc100 force-pushed the main_thread_pthread_key_dtor branch from 048326a to e8269a4 Compare June 23, 2021 14:21
Base automatically changed from main_thread_pthread_key_dtor to main June 23, 2021 15:43
@sbc100 sbc100 force-pushed the cxa_thread_atexit_native branch 4 times, most recently from d47447c to eae80e0 Compare June 30, 2021 06:14
sbc100 added a commit that referenced this pull request Sep 21, 2021
We were using `__cxa_thread_atexit` to register the funtion for free'ing
TLS data.  And because were were running the exit handlers
`__pthread_exit_run_handlers` before `__pthread_tsd_run_dtors` it meant
that any pthread_key descructor functions would not be able to access
TLS variables.

Instead we really need to delay free'ing of TLS data until all exit
handlers and pthread_key descructors have been run.

This was causing failure of `lsan.test_stdio_locking` due to the fact
that lsan uses TLS during pthread key destruction.  Specifically
`thread_finalize` (a pthread_key desctructor function) calls
`ThreadFinish` which in turn calls `GetCurrentThread` which uses a
thread local variable to store the current thread ID.

This change should allow #14489 to be dramatically simplified because
it will allow the use of TLS variables to implement the atexit queue.
sbc100 added a commit that referenced this pull request Sep 21, 2021
We were using `__cxa_thread_atexit` to register the funtion for free'ing
TLS data.  And because were were running the exit handlers
`__pthread_exit_run_handlers` before `__pthread_tsd_run_dtors` it meant
that any pthread_key descructor functions would not be able to access
TLS variables.

Instead we really need to delay free'ing of TLS data until all exit
handlers and pthread_key descructors have been run.

This was causing failure of `lsan.test_stdio_locking` due to the fact
that lsan uses TLS during pthread key destruction.  Specifically
`thread_finalize` (a pthread_key desctructor function) calls
`ThreadFinish` which in turn calls `GetCurrentThread` which uses a
thread local variable to store the current thread ID.

This change should allow #14489 to be dramatically simplified because
it will allow the use of TLS variables to implement the atexit queue.
sbc100 added a commit that referenced this pull request Sep 22, 2021
We were using `__cxa_thread_atexit` to register the funtion for free'ing
TLS data.  And because were were running the exit handlers
`__pthread_exit_run_handlers` before `__pthread_tsd_run_dtors` it meant
that any pthread_key descructor functions would not be able to access
TLS variables.

Instead we really need to delay free'ing of TLS data until all exit
handlers and pthread_key descructors have been run.

This was causing failure of `lsan.test_stdio_locking` due to the fact
that lsan uses TLS during pthread key destruction.  Specifically
`thread_finalize` (a pthread_key desctructor function) calls
`ThreadFinish` which in turn calls `GetCurrentThread` which uses a
thread local variable to store the current thread ID.

This change should allow #14489 to be dramatically simplified because
it will allow the use of TLS variables to implement the atexit queue.
sbc100 added a commit that referenced this pull request Sep 22, 2021
We were using `__cxa_thread_atexit` to register the funtion for free'ing
TLS data.  And because were were running the exit handlers
`__pthread_exit_run_handlers` before `__pthread_tsd_run_dtors` it meant
that any pthread_key descructor functions would not be able to access
TLS variables.

Instead we really need to delay free'ing of TLS data until all exit
handlers and pthread_key descructors have been run.

This was causing failure of `lsan.test_stdio_locking` due to the fact
that lsan uses TLS during pthread key destruction.  Specifically
`thread_finalize` (a pthread_key desctructor function) calls
`ThreadFinish` which in turn calls `GetCurrentThread` which uses a
thread local variable to store the current thread ID.

This change should allow #14489 to be dramatically simplified because
it will allow the use of TLS variables to implement the atexit queue.
sbc100 added a commit that referenced this pull request Sep 22, 2021
We were using `__cxa_thread_atexit` to register the funtion for free'ing
TLS data.  And because were were running the exit handlers
`__pthread_exit_run_handlers` before `__pthread_tsd_run_dtors` it meant
that any pthread_key descructor functions would not be able to access
TLS variables.

Instead we really need to delay free'ing of TLS data until all exit
handlers and pthread_key descructors have been run.

This was causing failure of `lsan.test_stdio_locking` due to the fact
that lsan uses TLS during pthread key destruction.  Specifically
`thread_finalize` (a pthread_key desctructor function) calls
`ThreadFinish` which in turn calls `GetCurrentThread` which uses a
thread local variable to store the current thread ID.

This change should allow #14489 to be dramatically simplified because
it will allow the use of TLS variables to implement the atexit queue.
sbc100 added a commit that referenced this pull request Sep 22, 2021
We were using `__cxa_thread_atexit` to register the funtion for free'ing
TLS data.  And because were were running the exit handlers
`__pthread_exit_run_handlers` before `__pthread_tsd_run_dtors` it meant
that any pthread_key descructor functions would not be able to access
TLS variables.

Instead we really need to delay free'ing of TLS data until all exit
handlers and pthread_key descructors have been run.

This was causing failure of `lsan.test_stdio_locking` due to the fact
that lsan uses TLS during pthread key destruction.  Specifically
`thread_finalize` (a pthread_key desctructor function) calls
`ThreadFinish` which in turn calls `GetCurrentThread` which uses a
thread local variable to store the current thread ID.

This change should allow #14489 to be dramatically simplified because
it will allow the use of TLS variables to implement the atexit queue.
sbc100 added a commit that referenced this pull request Sep 22, 2021
We were using `__cxa_thread_atexit` to register the funtion for free'ing
TLS data.  And because were were running the exit handlers
`__pthread_exit_run_handlers` before `__pthread_tsd_run_dtors` it meant
that any pthread_key descructor functions would not be able to access
TLS variables.

Instead we really need to delay free'ing of TLS data until all exit
handlers and pthread_key descructors have been run.

This was causing failure of `lsan.test_stdio_locking` due to the fact
that lsan uses TLS during pthread key destruction.  Specifically
`thread_finalize` (a pthread_key desctructor function) calls
`ThreadFinish` which in turn calls `GetCurrentThread` which uses a
thread local variable to store the current thread ID.

This change should allow #14489 to be dramatically simplified because
it will allow the use of TLS variables to implement the atexit queue.
sbc100 added a commit that referenced this pull request Sep 22, 2021
We were using `__cxa_thread_atexit` to register the funtion for free'ing
TLS data.  And because were were running the exit handlers
`__pthread_exit_run_handlers` before `__pthread_tsd_run_dtors` it meant
that any pthread_key descructor functions would not be able to access
TLS variables.

Instead we really need to delay free'ing of TLS data until all exit
handlers and pthread_key descructors have been run.

This was causing failure of `lsan.test_stdio_locking` due to the fact
that lsan uses TLS during pthread key destruction.  Specifically
`thread_finalize` (a pthread_key desctructor function) calls
`ThreadFinish` which in turn calls `GetCurrentThread` which uses a
thread local variable to store the current thread ID.

This change should allow #14489 to be dramatically simplified because
it will allow the use of TLS variables to implement the atexit queue.
@sbc100 sbc100 force-pushed the cxa_thread_atexit_native branch from eae80e0 to 9f49acf Compare October 4, 2021 15:14
sbc100 added a commit that referenced this pull request Oct 4, 2021
These is no need to use `__cxa_thread_atexit` here, we can just call
these functions during thread exit (which is what musl does).

This also avoids references `__cxa_thread_atexit` which is normally a
libc++abi symbol from C programs.

Needed by (split out from) #14489.
@sbc100 sbc100 force-pushed the cxa_thread_atexit_native branch from 9f49acf to 0fb0306 Compare October 4, 2021 15:50
@sbc100 sbc100 changed the base branch from main to avoid___cxa_thread_atexit October 4, 2021 15:51
@sbc100
Copy link
Collaborator Author

sbc100 commented Oct 4, 2021

OK this change is way nicer now. Since #15086 landed we can just use the libc++abi version of this function.

@sbc100 sbc100 requested a review from kleisauke October 4, 2021 15:52
sbc100 added a commit that referenced this pull request Oct 4, 2021
These is no need to use `__cxa_thread_atexit` here, we can just call
these functions during thread exit (which is what musl does).

This also avoids references `__cxa_thread_atexit` which is normally a
libc++abi symbol from C programs.

Needed by (split out from) #14489.
@sbc100 sbc100 force-pushed the avoid___cxa_thread_atexit branch from 1f479cf to 2cdcb71 Compare October 4, 2021 15:59
sbc100 added a commit that referenced this pull request Oct 4, 2021
These is no need to use `__cxa_thread_atexit` here, we can just call
these functions during thread exit (which is what musl does).

This also avoids references `__cxa_thread_atexit` which is normally a
libc++abi symbol from C programs.

Needed by (split out from) #14489.
@sbc100 sbc100 force-pushed the avoid___cxa_thread_atexit branch from 2cdcb71 to baac32a Compare October 4, 2021 17:15
Base automatically changed from avoid___cxa_thread_atexit to main October 4, 2021 18:09
sbc100 added a commit that referenced this pull request Oct 4, 2021
These is no need to use `__cxa_thread_atexit` here, we can just call
these functions during thread exit (which is what musl does).

This also avoids references `__cxa_thread_atexit` which is normally a
libc++abi symbol from C programs.

Needed by (split out from) #14489.
@sbc100 sbc100 force-pushed the cxa_thread_atexit_native branch 3 times, most recently from 5af35f2 to 48d0d34 Compare October 4, 2021 18:18
Rather than using a separate JS array use pthread TLS key which get
cleaned during `__pthread_tsd_run_dtors`.

Followup to #14484 and #14464 which both move more of the cleanup
handling for threads and processes onto the native side.
@sbc100 sbc100 force-pushed the cxa_thread_atexit_native branch from 48d0d34 to 272673b Compare October 4, 2021 19:52
@sbc100 sbc100 requested review from dschuff and kripken October 4, 2021 21:46
@sbc100 sbc100 merged commit b55a3c3 into main Oct 4, 2021
@sbc100 sbc100 deleted the cxa_thread_atexit_native branch October 4, 2021 22:38
sbc100 added a commit that referenced this pull request Nov 15, 2021
Back in #14489 we started using the implemenation of
`__cxa_thread_atexit` from libcxxabi so this is stub should never be
needed.

This change also involves renaming the test from `.c` to `.cpp` since
C programs don't generally have access to `__cxa_thread_atexit` (its
part of libcxxabi).

I also removed the call to `__cxa_thread_atexit_impl` since we don't
define that anywhere in emscripten.

Also run this tests with threads enabled which would have resulted
in an undefined reference to `__cxa_thread_atexit_impl` previously.
sbc100 added a commit that referenced this pull request Nov 16, 2021
Back in #14489 we started using the implemenation of
`__cxa_thread_atexit` from libcxxabi so this is stub should never be
needed.

This change also involves renaming the test from `.c` to `.cpp` since
C programs don't generally have access to `__cxa_thread_atexit` (its
part of libcxxabi).

I also removed the call to `__cxa_thread_atexit_impl` since we don't
define that anywhere in emscripten.

Also run this tests with threads enabled which would have resulted
in an undefined reference to `__cxa_thread_atexit_impl` previously.
mmarczell-graphisoft pushed a commit to GRAPHISOFT/emscripten that referenced this pull request Jan 5, 2022
Back in emscripten-core#14489 we started using the implemenation of
`__cxa_thread_atexit` from libcxxabi so this is stub should never be
needed.

This change also involves renaming the test from `.c` to `.cpp` since
C programs don't generally have access to `__cxa_thread_atexit` (its
part of libcxxabi).

I also removed the call to `__cxa_thread_atexit_impl` since we don't
define that anywhere in emscripten.

Also run this tests with threads enabled which would have resulted
in an undefined reference to `__cxa_thread_atexit_impl` previously.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants