Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sending many signals to a Python process crashes the interpreter #118143

Open
vanschelven opened this issue Apr 22, 2024 · 9 comments
Open

Sending many signals to a Python process crashes the interpreter #118143

vanschelven opened this issue Apr 22, 2024 · 9 comments
Labels
type-bug An unexpected behavior, bug, or error

Comments

@vanschelven
Copy link

vanschelven commented Apr 22, 2024

Bug report

Bug description:

Consider the following simple program (test-breakage.py):

from time import sleep
import os
import signal


class Foreman:
    def __init__(self):
        signal.signal(signal.SIGUSR1, self.handle_sigusr1)
        pid = os.getpid()
        print("Test script created, my pid is %s", pid)
        with open("/tmp/test-breakage.pid", "w") as f:
            f.write(str(pid))

    def handle_sigusr1(self, sig, frame):
        print("handle_sigusr1")

    def run_forever(self):
        sleep(3600)


Foreman().run_forever()

Now, a tool to send "a lot" of signal to it:

import os
import sys
import signal


pid = int(sys.argv[1])
iterations = int(sys.argv[2])

for i in range(iterations):
    os.kill(pid, signal.SIGUSR1)

Or, if you want the signals to be sent more quickly, implemented in c:

#include <stdlib.h>
#include <signal.h>


int main(int argc, char*argv[]) {
    long pid, iterations;
    if (argc > 2) {
        pid = strtol(argv[1], NULL, 0);
        iterations = strtol(argv[2], NULL, 0);

        for (int i = 0; i < iterations; i++) {
            kill(pid, SIGUSR1);
        }
   }
   return 0;
}

Running these scripts with "sufficiently high" numbers results in various crashes in test-breakage

  • Exit with "User defined signal 1" (shouldn't happen, we've defined a handler)
  • RuntimeError without any further information
  • Exit without any printed string
  • RuntimeError: reentrant call inside <_io.BufferedWriter name=''>

(In all cases, after repeating "handle_sigusr1" a number of times first)

tested on Linux w. Python 3.10

CPython versions tested on:

3.10

Operating systems tested on:

Linux

@vanschelven vanschelven added the type-bug An unexpected behavior, bug, or error label Apr 22, 2024
@terryjreedy
Copy link
Member

What do you realistically expect to happen? Which behaviors do you think indicate fixable bugs in Python? What possible responses would not be a bug?

@vanschelven
Copy link
Author

I'm no expert on signals, so my expectations may be weird ("unrealisitic").

Based on the various manuals (not least of which Python's own), I would expect them to be something processes can send to each other to signal something 😄 That such signals may be lost if many are sent is part of my expectations, and is documented for e.g. the Linux platform.

However, I do not expect signals to be means of crashing other processes if this is not the explicit goal of the signal (i.e. SIGKILL killing a process is expected).

Especially given the fact that the Python manual documents the existence of a Python signal handler, I would expect setting up such a handler would make it so that signals are handled by it. Even if there are many. Not handling the signal (and crashing) is a bug (given my current set of expectations) then.

I am not sure whether this is is a "fixable" bug in Python. Let's first determine whether it is a bug at al 😄

@vanschelven
Copy link
Author

Another way of saying this is: I expect the program to behave as it behaves when sending "just a few" signals. This to me is an indication that this is a bug: when the load increases, the program fails. In a way that is not predicted by the documentation.

@terryjreedy
Copy link
Member

@gpshead signals issue

@vanschelven
Copy link
Author

Curiously, the small C program I wrote to handle "many" of these signals is also not able to deal with them successfully:

#include <signal.h>
#include <stdlib.h>
#include <stdio.h>

static void sig_handler(int _)
{
    (void)_;
    puts("Received signal, doing nothing...");
}

int main(void)
{
    puts("Running");
    signal(SIGUSR1, sig_handler);

    while (1)
        puts("Still running...");

    return EXIT_SUCCESS;
}

(based on https://stackoverflow.com/questions/4217037/catch-ctrl-c-in-c)

though this is somewhat more stable on my system than the Python version, bombarding it with many signals I am able to let this program "hang" (show no more output and use no more CPU) though not crash. Not sure what to make of this though.

@vanschelven
Copy link
Author

But taking puts (which is presumably not reentrant, I'm not a C programmer) out of the handler, I am indeed able to write a program that runs as expected on my system:

#include <signal.h>
#include <stdlib.h>
#include <stdio.h>

static volatile int seen_signal = 0;

static void sig_handler(int _)
{
    (void)_;
    seen_signal += 1;
}

int main(void)
{
    puts("Running");
    signal(SIGUSR1, sig_handler);

    while (1)
        printf("Signals received: %d\n", seen_signal);

    return EXIT_SUCCESS;
}

So I'm chalking this one up to Python after all

@gpshead
Copy link
Member

gpshead commented Apr 22, 2024

from C you cannot do IO in a signal handler. In pure python code we run the signal handler code later outside of C signal handler context, thus it may technically be able to do more at times, but I'd still rank "doing IO" even in a python signal handler (which is effectively possible to be executed between any given bytecode and between some operations using the C API that check the internal signal context) as a bad idea.

Regardless, if we can prevent 'Exit with "User defined signal 1" (shouldn't happen, we've defined a handler)' from happening that'd be good.

If this can be reproduced in a simple handler that does no IO that'd be much more interesting.

As for why this may not be a significant problem as is in practice: Signals are generally not frequent.

@vanschelven
Copy link
Author

I have indeed not been able to reproduce the problem when I adapted the signal handler like so:

import os
import signal


i = 0


class Foreman:
    def __init__(self):
        signal.signal(signal.SIGUSR1, self.handle_sigusr1)
        pid = os.getpid()
        print("Test script created, my pid is %s", pid)
        with open("/tmp/test-breakage.pid", "w") as f:
            f.write(str(pid))

    def handle_sigusr1(self, sig, frame):
        global i
        i += 1

    def run_forever(self):
        while True:
            print("I have seen", i, "signals")


Foreman().run_forever()

This would make me say that this is indeed not a (Python) bug.

@vanschelven
Copy link
Author

but I'd still rank "doing IO" even in a python signal handler [..] as a bad idea.

I have recently come to agree with this pov, but perhaps the documentation should be clarified somewhat. Example of a serious project that does quite some IO in its signal handlers:

https://github.com/benoitc/gunicorn/blob/master/gunicorn/arbiter.py#L252 (and lines below)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

3 participants