Skip to content

Remove special case JS function proxying from library_pthread.c #14912

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

sbc100
Copy link
Collaborator

@sbc100 sbc100 commented Aug 19, 2021

The normal __proxy method for JS functions works fine in these cases
and avoiding the special cases reduces the dependies of
library_pthread.c which is included in all threaded programs.

@sbc100 sbc100 force-pushed the remove_special_case_proxying branch from 45f4468 to de7c458 Compare August 19, 2021 18:36
The normal `__proxy` method for JS functions works fine in these cases
and avoiding the special cases reduces the dependies of
`library_pthread.c` which is included in all threaded programs.
@juj
Copy link
Collaborator

juj commented Aug 23, 2021

Why call the builtin_malloc directly and not malloc? Doing that prevents people from providing their own malloc wrappers/implementations?

@sbc100
Copy link
Collaborator Author

sbc100 commented Aug 23, 2021

Why call the builtin_malloc directly and not malloc? Doing that prevents people from providing their own malloc wrappers/implementations?

The the _builtin aliases are already used in several places, so I don't think this PR changes any requirements. These function (aliases normally) are used in asan/lsan builds when we want to make allocations that are not tracked by the santizers. IIRC, if you want to replace malloc/free in emscripten today, you already need to provide these aliases (we had to add them to both dlmalloc and emmalloc for example).

Maybe we could provide default aliases that are used when the santizers are not enabled, that way folks would only need to include them in their custom allocator if they also want to use the santizers?

@juj
Copy link
Collaborator

juj commented Aug 24, 2021

These function (aliases normally) are used in asan/lsan builds when we want to make allocations that are not tracked by the santizers. IIRC, if you want to replace malloc/free in emscripten today, you already need to provide these aliases (we had to add them to both dlmalloc and emmalloc for example).

Hmm, it looks like there has been an overloaded meaning added to the emscripten_builtin_malloc symbol names.

Originally those _builtin_ names were added so that end users could create a frontend/passthrough to malloc/free e.g. for their own memory allocation tracking purposes, but still keep linking in the original malloc & free. See #4603.

This way end users could implement malloc() and free() in their code, and then from those functions do some memory alloc tracking, and then call to emscripten_builtin_malloc() and emscripten_builtin_free() to use the underlying dlmalloc/emmalloc.

However when asan & lsan support was added, it looks like they have "stolen" this use case from the user, and now malloc() means asan-tracked call, and emscripten_builtin_malloc() means a non-asan tracked call.

Though even with this mutated meaning, it is not obvious to me why these calls would want to bypass asan&lsan? These all look like general memory allocation that would benefit from being taken through memory sanitizers as well?

@sbc100
Copy link
Collaborator Author

sbc100 commented Aug 24, 2021

These function (aliases normally) are used in asan/lsan builds when we want to make allocations that are not tracked by the santizers. IIRC, if you want to replace malloc/free in emscripten today, you already need to provide these aliases (we had to add them to both dlmalloc and emmalloc for example).

Hmm, it looks like there has been an overloaded meaning added to the emscripten_builtin_malloc symbol names.

Originally those _builtin_ names were added so that end users could create a frontend/passthrough to malloc/free e.g. for their own memory allocation tracking purposes, but still keep linking in the original malloc & free. See #4603.

This way end users could implement malloc() and free() in their code, and then from those functions do some memory alloc tracking, and then call to emscripten_builtin_malloc() and emscripten_builtin_free() to use the underlying dlmalloc/emmalloc.

However when asan & lsan support was added, it looks like they have "stolen" this use case from the user, and now malloc() means asan-tracked call, and emscripten_builtin_malloc() means a non-asan tracked call.

Though even with this mutated meaning, it is not obvious to me why these calls would want to bypass asan&lsan? These all look like general memory allocation that would benefit from being taken through memory sanitizers as well?

Isn't asan using the _builtin_ function exactly as you describe. It wants to wrap malloc and then call the underlying version, just as you describe. The only potential problem here is that there can only be a single interceptor/wrapper installed at a given time and so asan is not compatible with other layers of wrapping/intercepting being active at the same time.

@sbc100
Copy link
Collaborator Author

sbc100 commented Aug 24, 2021

I think these two comments (one from compiler-rt and one from our code) describe why we don't/can't detect leaks in the low level pthread code:

{
// Ignore all allocations made by pthread_create: thread stack/TLS may be
// stored by pthread for future reuse even after thread destruction, and
// the linked list it's stored in doesn't even hold valid pointers to the
// objects, the latter are calculated by obscure pointer arithmetic.
#if CAN_SANITIZE_LEAKS
__lsan::ScopedInterceptorDisabler disabler;
#endif
result = REAL(pthread_create)(thread, attr, asan_thread_start, t);
}

#ifdef HAS_SANITIZER
// ASan wraps the emscripten_builtin_pthread_create call in __lsan::ScopedInterceptorDisabler.
// Unfortunately, that only disables it on the thread that made the call.
// This is sufficient on the main thread.
// On non-main threads, pthread_create gets proxied to the main thread, where LSan is not
// disabled. This makes it necessary for us to disable LSan here, so that it does not detect
// pthread's internal allocations as leaks.
// If/when we remove all the allocations from __pthread_create_js we could also remove this.
__lsan_disable();
#endif

@juj
Copy link
Collaborator

juj commented Aug 26, 2021

Isn't asan using the _builtin_ function exactly as you describe. It wants to wrap malloc and then call the underlying version, just as you describe.

Yes it is, but by doing that, it prevents the user from doing that. So asan & lsan are incompatible with users wrapping.

The only potential problem here is that there can only be a single interceptor/wrapper installed at a given time and so asan is not compatible with other layers of wrapping/intercepting being active at the same time.

We can work around that problem by saying "asan/lsan" is not available for codebases that wrap to malloc/free on their own.

But the bigger problem that results from this is that all these call sites that bypass regular calls to malloc/free to appease asan/lsan now also bypass calls to user-provided malloc & free. As result, it means that user-provided malloc & free cannot be wrapped when building with multithreading enabled, even if asan/lsan are disabled.

I think these two comments (one from compiler-rt and one from our code) describe why we don't/can't detect leaks in the low level pthread code:

Hmm, I am not sure I understand the comments in those blocks. It seems that they are describing a cleanup issue rather than a tracking issue(?) E.g. since the worker lingers around in a pool, it might keep the thread stack alive? Although that sounds like a bug - the pthread stack should really be freed when a pthread quits, even if the worker is kept alive in the pool.

@sbc100
Copy link
Collaborator Author

sbc100 commented Aug 26, 2021

Isn't asan using the _builtin_ function exactly as you describe. It wants to wrap malloc and then call the underlying version, just as you describe.

Yes it is, but by doing that, it prevents the user from doing that. So asan & lsan are incompatible with users wrapping.

The only potential problem here is that there can only be a single interceptor/wrapper installed at a given time and so asan is not compatible with other layers of wrapping/intercepting being active at the same time.

We can work around that problem by saying "asan/lsan" is not available for codebases that wrap to malloc/free on their own.

But the bigger problem that results from this is that all these call sites that bypass regular calls to malloc/free to appease asan/lsan now also bypass calls to user-provided malloc & free. As result, it means that user-provided malloc & free cannot be wrapped when building with multithreading enabled, even if asan/lsan are disabled.

I think these two comments (one from compiler-rt and one from our code) describe why we don't/can't detect leaks in the low level pthread code:

Hmm, I am not sure I understand the comments in those blocks. It seems that they are describing a cleanup issue rather than a tracking issue(?) E.g. since the worker lingers around in a pool, it might keep the thread stack alive? Although that sounds like a bug - the pthread stack should really be freed when a pthread quits, even if the worker is kept alive in the pool.

I don't fully understand the details either but it seems clear the asan/lsan are not expecting to be able to track memory leaks that occur within pthread_create itself.

Its also the case the emscripten already goes to some lengths to avoid this too. See the existing code we have today in emscripten:

#ifdef HAS_SANITIZER
// ASan wraps the emscripten_builtin_pthread_create call in __lsan::ScopedInterceptorDisabler.
// Unfortunately, that only disables it on the thread that made the call.
// This is sufficient on the main thread.
// On non-main threads, pthread_create gets proxied to the main thread, where LSan is not
// disabled. This makes it necessary for us to disable LSan here, so that it does not detect
// pthread's internal allocations as leaks.
// If/when we remove all the allocations from __pthread_create_js we could also remove this.
__lsan_disable();
#endif
q->returnValue.i =
__pthread_create_js(q->args[0].vp, q->args[1].vp, q->args[2].vp, q->args[3].vp);
#ifdef HAS_SANITIZER
__lsan_enable();
#endif

I don't think this PR is ready to land yet at least now without more research...

@sbc100
Copy link
Collaborator Author

sbc100 commented Aug 26, 2021

But the bigger problem that results from this is that all these call sites that bypass regular calls to malloc/free to appease asan/lsan now also bypass calls to user-provided malloc & free. As result, it means that user-provided malloc & free cannot be wrapped when building with multithreading enabled, even if asan/lsan are disabled.

I don't think its true that it prevents wrapping. What it means is that the wrapper interceptors won't see those particular allocations. Wrapping is still possible, and all the normal allocations/frees will still go through the wrappers.

@sbc100
Copy link
Collaborator Author

sbc100 commented Aug 26, 2021

But the bigger problem that results from this is that all these call sites that bypass regular calls to malloc/free to appease asan/lsan now also bypass calls to user-provided malloc & free. As result, it means that user-provided malloc & free cannot be wrapped when building with multithreading enabled, even if asan/lsan are disabled.

I don't think its true that it prevents wrapping. What it means is that the wrapper interceptors won't see those particular allocations. Wrapping is still possible, and all the normal allocations/frees will still go through the wrappers.

As an alternative, if you think its worth it, we could invent a new wrapping mechanism specifically for the sanitizers. But I think this would make wrapping more complicated.

@stale
Copy link

stale bot commented Sep 21, 2022

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant.

@stale stale bot added the wontfix label Sep 21, 2022
@sbc100 sbc100 closed this Jul 13, 2024
@sbc100 sbc100 deleted the remove_special_case_proxying branch July 13, 2024 00:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants