BugSnag failed to notify on OOM with celery #372

Open
ay2456 opened this issue Jan 24, 2024 · 1 comment
Labels
backlog (We hope to fix this feature/bug in the future), feature request (Request for a new feature)

Comments


ay2456 commented Jan 24, 2024

Describe the bug

I'm running BugSnag with Celery on Kubernetes pods. I've noticed that when a worker process is killed with signal 9 (SIGKILL) because the pod runs out of memory, BugSnag fails to report the error:

Error logs:

[2024-01-24 21:45:01,221: ERROR/MainProcess] Process 'ForkPoolWorker-4' pid:341 exited with 'signal 9 (SIGKILL)'
[2024-01-24 21:45:01,232: ERROR/MainProcess] Signal handler <function failure_handler at 0x7fd8291ae830> raised: AttributeError("'str' object has no attribute 'tb_frame'")
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost
    raise WorkerLostError(
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL) Job: 5.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/celery/utils/dispatch/signal.py", line 276, in send
    response = receiver(signal=self, sender=sender, **named)
  File "/opt/conda/lib/python3.10/site-packages/bugsnag/celery/__init__.py", line 14, in failure_handler
    bugsnag.auto_notify(exception, traceback=traceback,
  File "/opt/conda/lib/python3.10/site-packages/bugsnag/legacy.py", line 95, in auto_notify
    default_client.notify(
  File "/opt/conda/lib/python3.10/site-packages/bugsnag/client.py", line 84, in notify
    event = Event(
  File "/opt/conda/lib/python3.10/site-packages/bugsnag/event.py", line 107, in __init__
    stacktrace = self._generate_stacktrace(
  File "/opt/conda/lib/python3.10/site-packages/bugsnag/event.py", line 327, in _generate_stacktrace
    trace = traceback.extract_tb(tb)
  File "/opt/conda/lib/python3.10/traceback.py", line 72, in extract_tb
    return StackSummary.extract(walk_tb(tb), limit=limit)
  File "/opt/conda/lib/python3.10/traceback.py", line 364, in extract
    for f, lineno in frame_gen:
  File "/opt/conda/lib/python3.10/traceback.py", line 329, in walk_tb
    yield tb.tb_frame, tb.tb_lineno
AttributeError: 'str' object has no attribute 'tb_frame'
[2024-01-24 21:45:01,234: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 9 (SIGKILL) Job: 5.')
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost
    raise WorkerLostError(
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL) Job: 5.
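
For reference, the AttributeError happens because billiard reports a SIGKILLed worker with a string "traceback", which BugSnag's celery failure_handler passes straight to traceback.extract_tb. Below is a rough, untested workaround sketch that at least reports the WorkerLostError from the main process instead of crashing the signal handler; the disconnect/connect details and keyword arguments are my assumptions based on the traceback above.

# Workaround sketch (assumptions noted above): only forward real traceback
# objects to BugSnag, since billiard supplies a string for WorkerLostError.
from types import TracebackType

import bugsnag
from bugsnag.celery import failure_handler
from celery.signals import task_failure


def safe_failure_handler(sender=None, task_id=None, exception=None,
                         traceback=None, **kwargs):
    # For a SIGKILLed worker the "traceback" argument is a string; drop it so
    # BugSnag does not call traceback.extract_tb on it and crash.
    if not isinstance(traceback, TracebackType):
        traceback = None
    bugsnag.auto_notify(exception, traceback=traceback,
                        context=getattr(sender, "name", None))


# Swap BugSnag's stock celery handler for the guarded one.
task_failure.disconnect(failure_handler)
task_failure.connect(safe_failure_handler, weak=False)

This only papers over the handler crash; the OOM itself still cannot be reported from inside the killed worker process.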

Environment

  • Bugsnag version: 4.6.1
  • Python version: 3.10.12
  • Integration framework version:
    • Celery: 5.2.7
clr182 added the feature request (Request for a new feature) and backlog (We hope to fix this feature/bug in the future) labels on Feb 8, 2024

clr182 commented Feb 8, 2024

Hi @ay2456

Thanks for reaching out.

This is currently the expected behaviour: we have no mechanism in place to pre-allocate memory for handling an Out of Memory condition, nor do we currently have any error persistence in place.

I should note that we now have an item on our backlog aimed at pre-allocating this memory when the server starts, so that when it does go down there is still some memory free to build and send the report.

We currently have no ETA for this functionality. Once we have an update, we will be sure to share it here.

Are you seeing this issue often? If so, it may be worth increasing the available memory for the pod as a temporary workaround.
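
On the memory side, another mitigation worth considering is Celery's worker_max_memory_per_child setting, which recycles a worker process once its resident memory crosses a threshold. A minimal sketch, assuming memory grows gradually across tasks (the broker URL and the 350 000 KiB limit below are placeholders, not recommendations):

# Mitigation sketch: let Celery replace workers before the kernel OOM-kills them.
from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")  # placeholder broker URL

# Replace a worker after the current task finishes if it is using more than
# ~350 MB of resident memory (value is in kilobytes). This does not help if a
# single task exceeds the pod limit mid-run.
app.conf.worker_max_memory_per_child = 350_000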
