
Enable coredump for containerized applications #419

Merged (1 commit) on Nov 22, 2021

Conversation

hoditohod
Contributor

Overview

This PR contains the proposed fix for issue #418.

It unblocks the fault signal handler after the log/stacktrace has been flushed. The mechanism is reused from the Windows crash handler, where this unblocking was already implemented. The modification does not affect non-containerized native processes (non-PID1): for those, the final step after the flush is re-emitting the fatal signal (kill), and process execution stops there. For containerized (PID1) processes the kill is ignored, so execution passes through it and the sleep in the fault signal handler is unblocked instead. This enables the containerized process to exit and produce a core dump.

No testing or documentation changes are included.

  • TDD

New/modified code must be backed up with unit tests (preferably TDD-style development)

  • Documentation

All new/modified functionality should be backed up with API documentation (API.markdown or README.markdown)

Cross-Platform Testing

  • Travis-CI (Linux, OSX) + AppVeyor-CI (Windows)
  • Optional: Local/VM testing: Windows
  • Optional: Local/VM testing: OSX
  • Optional: Local/VM testing: Linux

Testing Advice

mkdir build; cd build; cmake -DADD_G3LOG_UNIT_TEST=ON ..

Run Test Alternatives:

  • Cross-Platform: ctest
  • or ctest -V for verbose output
  • Linux: make test

// When running as PID1 the above kill doesn't have any effect (execution simply passes through it, contrary
// to a non-PID1 process where execution stops at kill and switches over to signal handling). Also as PID1
// we must unblock the thread that received the original signal otherwise the process will never terminate.
gBlockForFatal = false;
Owner


Good comments that explain this small but non-trivial system insight.

@KjellKod KjellKod merged commit c51128f into KjellKod:master Nov 22, 2021
GergoTot added a commit to GergoTot/g3log that referenced this pull request Mar 3, 2023
…er with PID 1 aborted

Our service (running in Docker as PID 1) crashed with a SIGABRT signal. After the SIGABRT, infinite SIGSEGV signals unfortunately started to arrive, and the loop never terminated, since the kill signal does not stop it when running in a Docker container as PID 1. We used a solution similar to the one in this PR: KjellKod#419. We also had to restore the saved signal handlers; without that, SIGSEGV signals kept arriving in a cycle, and the process hung when running in a Docker container as PID 1.
@GergoTot GergoTot mentioned this pull request Mar 3, 2023