ProcessPoolExecutor worker processes stay alive after parent process is killed #111873

Open
zpincus opened this issue Nov 9, 2023 · 3 comments
Labels: topic-multiprocessing, type-bug

zpincus commented Nov 9, 2023

Bug report

Bug description:

If a Python process using concurrent.futures.ProcessPoolExecutor dies in a way that prevents cleanup (e.g. SIGKILL), the worker processes do not exit. With a forking multiprocessing context, this can lead to significant resource leaks, as the forked children inherit the parent's resources but never exit.

Here is a minimal reproduction:

import multiprocessing
import os
import time
from concurrent import futures

def task(i):
    print('Task', i)
    time.sleep(1)

def queue_and_die():
    print(pid := os.getpid())
    with futures.ProcessPoolExecutor(max_workers=4) as executor:
        for i in range(20):
            executor.submit(task, i)
        time.sleep(1)
        os.kill(pid, 9)  # SIGKILL ourselves mid-run, so no cleanup code gets to run

if __name__ == '__main__':
    import sys
    multiprocessing.set_start_method(sys.argv[1])
    queue_and_die()

Then run it in a terminal, e.g. as python test.py fork (spawn and forkserver give the same result), and note the PID on the first line of output. After the parent process dies, use e.g. pgrep -ag <PID> to observe that the worker processes (whose process group ID corresponds to the parent's PID) are still alive. kill -9 -<PID> will clean them right up.

This behavior has been noted in a few places, e.g. https://stackoverflow.com/questions/71300294/how-to-terminate-pythons-processpoolexecutor-when-parent-process-dies. However, the suggested workaround (a thread in the worker process that polls whether the parent PID is still alive) can fail if the parent PID gets reused. (IIUC the likelihood of this varies by OS...)

A better solution might be to hold on to e.g. the read end of a pipe that the parent keeps open for writing, and then use poll/select on the child side to determine whether the write end has been closed (i.e. poll will report the fd as readable, but a read will then hit EOF).

In ProcessPoolExecutor this could be done in a separate thread that periodically polls the pipe, or right before the blocking call_queue.get call that retrieves the next work item.
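
For illustration, here is a minimal sketch of that check, assuming the parent creates the pipe before starting workers and that each child closes its inherited copy of the write end (parent_is_dead and read_fd are made-up names for this example, not part of the actual implementation):

import os
import select

def parent_is_dead(read_fd):
    # Zero-timeout poll on the read end of the pipe. The parent never
    # writes to it, so the fd becoming readable can only mean the write
    # end was closed, i.e. the parent has exited.
    readable, _, _ = select.select([read_fd], [], [], 0)
    # On a raw pipe fd, EOF shows up as an empty read rather than EOFError.
    return bool(readable) and os.read(read_fd, 1) == b''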

CPython versions tested on:

3.10.8, 3.12

Operating systems tested on:

Linux, macOS

zpincus added the type-bug label Nov 9, 2023

zpincus commented Nov 9, 2023

Oops, I hadn't realized that multiprocessing.Process objects have a sentinel property (specifically, one end of a pipe, as described above) to detect whether the process has died, and an is_alive() method that does just that. Here's an example of checking whether a parent process is dead:

import multiprocessing
import os
import time

def exit_when_orphaned():
    # parent_process() gives a handle on the parent whose is_alive()
    # check is backed by a pipe sentinel, not a PID.
    parent = multiprocessing.parent_process()
    while parent.is_alive():
        print('waiting')
        time.sleep(1)

def run():
    print(pid := os.getpid())
    process = multiprocessing.Process(target=exit_when_orphaned)
    process.start()
    time.sleep(3)
    os.kill(pid, 9)  # SIGKILL the parent; the child notices and exits


if __name__ == "__main__":
    import sys
    multiprocessing.set_start_method(sys.argv[1])
    run()

So I think that if folks agree that the above is in fact a bug, and that ProcessPoolExecutor workers should die when their parent goes away, then the patch is easy: just do a parent.is_alive() check before call_queue.get in the worker process.
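
Roughly like the following sketch (illustrative only; worker_loop and the call_item attributes are stand-ins, not the actual code in Lib/concurrent/futures/process.py). Note that the get needs a timeout so the check re-runs while a worker sits idle:

import multiprocessing
import queue

def worker_loop(call_queue):
    parent = multiprocessing.parent_process()
    while True:
        # Exit if the parent died, instead of lingering as an orphan.
        if parent is not None and not parent.is_alive():
            return
        try:
            call_item = call_queue.get(block=True, timeout=1.0)
        except queue.Empty:
            continue  # nothing queued yet; recheck the parent and wait again
        if call_item is None:
            return  # normal shutdown sentinel from the executor
        call_item.fn(*call_item.args, **call_item.kwargs)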

zpincus commented Nov 13, 2023

For anyone playing along at home, you can get this effect yourself by passing the following initializer to ProcessPoolExecutor:

from concurrent import futures

def start_orphan_checker():
    import threading

    def exit_if_orphaned():
        import multiprocessing
        import os
        multiprocessing.parent_process().join()  # block until the parent dies; may never happen
        os._exit(-1)

    threading.Thread(target=exit_if_orphaned, daemon=True).start()

with futures.ProcessPoolExecutor(initializer=start_orphan_checker) as executor:
    ...
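
(Unlike PID polling, parent_process().join() waits on the parent's sentinel, i.e. one end of a pipe, rather than checking a PID, so it sidesteps the PID-reuse problem mentioned earlier.)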

Yaro1 commented Nov 29, 2023

Hi, I would like to take this issue.
