feat: Add before crash hook to allow modifying other callbacks before crash #1366

mujacica · 2025-09-09T10:28:49Z

Add before crash hook to allow modifying other callbacks before crash
Hooks needed for game engines to prevent calling internal engine code while handling crashes
differs from on_crash_hook by being the first thing to get called when handling a crash (which is necessary to avoid calling SENTRY_X with an internal engine logger for our in-crash-handling log statements).

… crash

github-actions · 2025-09-09T10:29:15Z

	Messages
📖	Do not forget to update Sentry-docs with your feature once the pull request gets approved.

Generated by 🚫 dangerJS against f53c146

seer-by-sentry · 2025-09-11T07:26:59Z

src/backends/sentry_backend_inproc.c

+            options->before_crash_func(options->before_crash_data);
+        }


Potential bug: The before_crash_func is called from a signal handler before the SDK's signal-safety mechanisms are enabled via sentry__enter_signal_handler(), bypassing existing protections.

Description: The new before_crash_func is invoked from within a signal handler context in backends like inproc, breakpad, and crashpad. However, it is called before the SDK's own signal-safety mechanisms are enabled via sentry__enter_signal_handler(). This function is responsible for setting up spinlocks and a special page allocator to ensure subsequent operations are async-signal-safe. By calling the user-provided function before this setup, the SDK bypasses its own protections. This is inconsistent with existing callbacks like on_crash_func, which are called after the safety mechanisms are active. If a user's before_crash_func performs any non-async-signal-safe operations, it will likely lead to a deadlock or memory corruption, preventing the original crash from being reported.

Suggested fix: Move the invocation of before_crash_func to be after the call to sentry__enter_signal_handler() in all relevant backends (sentry_backend_inproc.c, sentry_backend_breakpad.cpp, and sentry_backend_crashpad.cpp). This ensures the SDK's signal-safety infrastructure is active before executing user-provided code, consistent with other crash callbacks.
_{severity: 0.85, confidence: 0.9}

_{Did we get this right? 👍 / 👎 to inform future reviews.}

This reverts commit b0ed4fb.

This reverts commit 38f71ca.

supervacuus

Thanks @mujacica. I am sorry to be that guy, because you've already detailed the prose quite a bit, but I am still not clear on the purpose of this hook.

What do you mean by "allow modifying other callbacks before crash"?
How does this new hook "prevent calling internal engine code while handling crashes" where the on_crash/before_send hooks fail to do so?

If this:

which is necessary to avoid calling SENTRY_X with an internal engine logger for our in-crash-handling log statements

is the core issue (perhaps also addressed in point 2 above), I propose an alternative approach that is orthogonal to the hooks and more broadly applicable:

expose an option sentry_options_set_handler_logging_enabled (or whatever), which signals whether logging should be active while the crash handler is active (i.e., while we process a crash)
introduce a new (atomic) global in the logger module logging_enabled (or ...)
set this global via a sentry__logger_enable()/sentry__logger_disable() first and last thing from every crash handler if the option says the logger should be disabled

There are multiple use cases where you want to prevent a client-provided logger function from being invoked when a crash is in progress. A dynamic logger-enable/disable mechanism could be helpful in other parts of the code.

Of course, if it goes beyond the logging callbacks, that won't suffice.

mujacica · 2025-09-11T16:12:51Z

Hey @supervacuus, thanks for a thorough review!

I agree regarding having dynamic logger enable/disable (and I don't have a problem extending the PR to include it).

However, we also need a way, in downstream SDKs, to adapt the behavior during crash handling. Two examples (that I am working on supporting):

Engine memory allocator is corrupted - any memory allocations during Sentry event processing (Logging, Attachments, basically any memory allocation using engine data types FString etc.) would cause Sentry to crash too
Engine stack overflow - trying to allocate anything further on the stack while processing event would cause Sentry to crash too (not as bad as 1, but still needs different behavior)

Reason why console logs are mentioned here is because the console logs from the native backends always caused secondary crashes, while the other potential issues are a little bit more hidden inside of downstream SDKs (e.g. before_send handling).

If you have an idea on how to support both use-cases (logging vs. rest), I am happy to implement it. A possibility would be to:

Use the proposal you made for enable/disable logging
Ensure that no callbacks are being triggered between start of crash handling and on_crash callback call
Use the on_crash callback to "notify" downstream SDKs about crash being processed

vaind · 2025-09-15T10:11:47Z

I also think the logging part could be handled as a configuration, other than the reasons above, also because it's cleaner to know what's happening and that this must be checked first in the handler to configure logging. BTW, in sentry-playstation, I've done something similar WRT to logging - the handler replaces the logger: sentry__logger_set_global({ nullptr, nullptr, SENTRY_LEVEL_ERROR });

The logging changes could be done in a separate PR

mujacica · 2025-09-15T10:32:01Z

I will start working on logging changes in separate PR, and we can continue discussion here on how to handle notifying downstream SDKs about crash handling.

supervacuus · 2025-09-15T11:14:44Z

Maybe as a premise before my response to your examples:

Having an early hook exposed for all handlers we provide can be a good idea. I would probably call it early or even unsafe-on_crash rather than before crash, to clearly signal that it still happens inside our handlers but without any synchronization with SDK internals, thus making interactions with the SDK unsafe (but for your use-cases, that is the idea since you don't need safety wrt the SDK).

It is also not the same as having a hook into the raw OS-level handlers because a lot of the backend machinery already happened when the handler of the Native SDK runs (for instance, breakpad already gathered the snapshot of the crashed application, and not all threads might be correctly running anymore).

Therefore, I wanted to narrow down the potential uses downstream, where applying changes here makes more sense (as opposed to downstream in a hook that a native provides). Of course, this was also the case because I was not aware of any downstream drivers.

However, we also need a way, in downstream SDKs, to adapt the behavior during crash handling. Two examples (that I am working on supporting):

Engine memory allocator is corrupted - any memory allocations during Sentry event processing (Logging, Attachments, basically any memory allocation using engine data types FString etc.) would cause Sentry to crash too

But the Native SDK itself does not allocate using any specific outside allocator (however, we do switch to a page-allocator when POSIX signal handling is involved). So any FString usage should be confined to either of the two existing crash hooks or the already mentioned logger callback.

So on_crash should be early enough for any rectification, considering logging is disabled during the handler execution, so as not to trigger any calls into the Unreal allocator. Am I getting this right, or am I missing something?

Attachments should be entirely shielded from allocations of the Unreal allocator as well, even when you pass in buffers from engine types, since we copy all incoming buffers into memory that we internally allocated.

It also depends on what the cure for a corrupted engine memory allocator would be and how you identify this. Would you check the crash context and set a flag that prevents the other hooks from relying on any allocation triggering code (or do nothing)? Then I think the on_crash hook, under the premise of not calling into the engine allocator in the code path before, as the logger would, would still be early enough?

Engine stack overflow - trying to allocate anything further on the stack while processing event would cause Sentry to crash too (not as bad as 1, but still needs different behavior)

We already introduced additional handling of stack overflows into the Native SDK (sigaltstack on POSIX systems and Windows stack guarantees). If upstreaming is technically and legally feasible, additional handling is more than welcome in the Native SDK. I am not sure how an additional early hook would prevent any issues in the stack overflow case, though, since

a) it itself would require stack space to execute
b) our handler is called from backend code, which also requires stack space to execute, so at the point your early handler gets invoked, we are relatively late in the handler stack (even if early/first in our callback)

Reason why console logs are mentioned here is because the console logs from the native backends always caused secondary crashes, while the other potential issues are a little bit more hidden inside of downstream SDKs (e.g. before_send handling).

If you have an idea on how to support both use-cases (logging vs. rest), I am happy to implement it. A possibility would be to:

Use the proposal you made for enable/disable logging

Ensure that no callbacks are being triggered between start of crash handling and on_crash callback call

I am not aware of any other callbacks (besides the logger) that would be invoked before the on_crash callback gets invoked. But if there are, then we should handle them similarly to the logging callback.

Use the on_crash callback to "notify" downstream SDKs about crash being processed

That would be its intended use. However, I am very aware that we have some quite intertwined machinery that can result in the issues you are seeing. It is worth it to find a solution for each issue before we introduce a very broad workaround (even if we end up with that in the end).

From my POV, there are three ways to go forward (not at all mutually exclusive):

expose the hook (with sufficient documentation in the header docs that highlights the expected interaction profile, lack of guarantees, and limitations mentioned above). Since this is a public API, it should also clarify to users that they most likely want to use on_crash.
expose options to disable the logger once the crash handler is active (+ identify other active hooks that could conflict during the handler, and thus could be disabled, but I am unaware of any)
upstream specific infrastructure that makes sense for a broader audience (an additional mechanism for handling stack usage during crash, configuration of external allocators, opening the crash allocator).

supervacuus · 2025-09-15T11:17:10Z

I also think the logging part could be handled as a configuration, other than the reasons above, also because it's cleaner to know what's happening and that this must be checked first in the handler to configure logging. BTW, in sentry-playstation, I've done something similar WRT to logging - the handler replaces the logger: sentry__logger_set_global({ nullptr, nullptr, SENTRY_LEVEL_ERROR });

That is also a perfectly valid solution to disable the logger.

The logging changes could be done in a separate PR

Yes, the logging changes are orthogonal

feat: Add before crash hook to allow modifying other callbacks before…

4bb7111

… crash

mujacica added 6 commits September 9, 2025 12:30

Update changelog

b878cb0

Add unit test

9ed2bea

Simplify the test

6e92d72

Refactor the test, update CMakeLists

09aaa2f

Refactor vol 3

13749ed

Fix user_data

5af77a2

mujacica marked this pull request as ready for review September 11, 2025 07:21

mujacica requested review from JoshuaMoelans and supervacuus September 11, 2025 07:21

seer-by-sentry bot reviewed Sep 11, 2025

View reviewed changes

mujacica added 5 commits September 11, 2025 09:30

Fix type conversion

38f71ca

Fix includes

b0ed4fb

Revert test changes

4a99e82

This reverts commit b0ed4fb.

Revert "Fix type conversion"

bb43805

This reverts commit 38f71ca.

Update pointers

f53c146

supervacuus reviewed Sep 11, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

feat: Add before crash hook to allow modifying other callbacks before crash #1366

feat: Add before crash hook to allow modifying other callbacks before crash #1366

Uh oh!

mujacica commented Sep 9, 2025 •

edited by JoshuaMoelans

Loading

Uh oh!

github-actions bot commented Sep 9, 2025 •

edited

Loading

Uh oh!

seer-by-sentry bot Sep 11, 2025

Uh oh!

supervacuus left a comment

Uh oh!

mujacica commented Sep 11, 2025

Uh oh!

vaind commented Sep 15, 2025 •

edited

Loading

Uh oh!

mujacica commented Sep 15, 2025

Uh oh!

supervacuus commented Sep 15, 2025 •

edited

Loading

Uh oh!

supervacuus commented Sep 15, 2025

Uh oh!

Uh oh!

Uh oh!

feat: Add before crash hook to allow modifying other callbacks before crash #1366

Are you sure you want to change the base?

feat: Add before crash hook to allow modifying other callbacks before crash #1366

Uh oh!

Conversation

mujacica commented Sep 9, 2025 • edited by JoshuaMoelans Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Sep 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

seer-by-sentry bot Sep 11, 2025

Choose a reason for hiding this comment

Uh oh!

supervacuus left a comment

Choose a reason for hiding this comment

Uh oh!

mujacica commented Sep 11, 2025

Uh oh!

vaind commented Sep 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mujacica commented Sep 15, 2025

Uh oh!

supervacuus commented Sep 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

supervacuus commented Sep 15, 2025

Uh oh!

Uh oh!

mujacica commented Sep 9, 2025 •

edited by JoshuaMoelans

Loading

github-actions bot commented Sep 9, 2025 •

edited

Loading

vaind commented Sep 15, 2025 •

edited

Loading

supervacuus commented Sep 15, 2025 •

edited

Loading