Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-47029: Fix BrokenPipeError in multiprocessing.Queue at garbage collection and explicit close #31913

Merged
merged 4 commits into from
May 3, 2022

Conversation

geryogam
Copy link
Contributor

@geryogam geryogam commented Mar 15, 2022

Program:

import multiprocessing

def main():
    q = multiprocessing.Queue()
    q.put(0)

if __name__ == '__main__':
    main()

Output:

Traceback (most recent call last):
  File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/queues.py", line 251, in _feed
    send_bytes(obj)
  File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/connection.py", line 205, in send_bytes 
    self._send_bytes(m[offset:offset + size])
  File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes 
    self._send(header + buf)
  File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/connection.py", line 373, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe

A BrokenPipeError exception is raised when the function Queue._feed in the queue thread writes buffered data to the file descriptor self._writer referring to the write end of the pipe after the garbage collector closes the file descriptor self._reader referring to the read end of the pipe in the main thread. Note that the garbage collector does not close self._writer because it is referenced by Queue._feed in the queue thread.

This PR fixes the problem by passing a reference to self._reader to Queue._feed in the queue thread in order to keep the reader end of the pipe alive until the queue thread stops writing buffered data to self._writer.

https://bugs.python.org/issue47029

@geryogam geryogam requested a review from rhettinger as a code owner March 15, 2022 18:24
@geryogam geryogam changed the title Fix a BrokenPipeError when a multiprocessing.Queue is garbage collected bpo-47029: Fix a BrokenPipeError when a multiprocessing.Queue is garbage collected Mar 15, 2022
@malsyned
Copy link

I tried this pull request and didn't fix the problem for me. Might you be getting tricked by how racy this issue is into thinking that an unrelated change is resolving it?

I posted this on the issue ticket, but I'll include it here as well. This program pretty reliably reproduces the issue, both on your branch and on upstream:

import multiprocessing
import time

def main():
    q = multiprocessing.Queue()
    q.put(0)
    #time.sleep(0.001) # Issue goes away when this line is uncommented
    q.close()
    q.join_thread()

if __name__ == '__main__':
    main()

I have tested this with two Linux/x86_64 machines, one running Ubuntu 21.10 with kernel 5.13.0-37-lowlatency and one running Ubuntu 20.04 with kernel 5.4.0-100-generic.

@geryogam
Copy link
Contributor Author

geryogam commented Apr 26, 2022

I tried this pull request and didn't fix the problem for me. Might you be getting tricked by how racy this issue is into thinking that an unrelated change is resolving it?

Thanks for the code sample @malsyned. I have updated the PR and it now fixes your issue too.

The problem in your case was that the function Queue.close in the main thread explicitly closed the file descriptor self._reader referring to the read end of the pipe, then notified the function Queue._feed in the queue thread to stop writing buffered data to the file descriptor self._writer referring to the write end of the pipe. In other words, the main thread closed self._reader before the queue thread stopped writing to self._writer. I solved this by closing self._reader in Queue._feed instead of in Queue.close, so after the queue thread stops writing to self._writer.

The root of the problem is explained in this paragraph from the Linux manual page of pipe (bold emphasis mine):

If all file descriptors referring to the write end of a pipe have been closed, then an attempt to read(2) from the pipe will see end-of-file (read(2) will return 0). If all file descriptors referring to the read end of a pipe have been closed, then a write(2) will cause a SIGPIPE signal to be generated for the calling process. If the calling process is ignoring this signal, then write(2) fails with the error EPIPE. An application that uses pipe(2) and fork(2) should use suitable close(2) calls to close unnecessary duplicate file descriptors; this ensures that end-of-file and SIGPIPE/EPIPE are delivered when appropriate.

@geryogam geryogam changed the title bpo-47029: Fix a BrokenPipeError when a multiprocessing.Queue is garbage collected bpo-47029: Fix BrokenPipeError in multiprocessing.Queue at garbage collection and explicit close Apr 26, 2022
MaxwellDupre

This comment was marked as outdated.

@geryogam
Copy link
Contributor Author

geryogam commented May 1, 2022

Thanks for the review @MaxwellDupre!

@rhettinger rhettinger removed their request for review May 3, 2022 06:19
@geryogam
Copy link
Contributor Author

geryogam commented May 3, 2022

@JelleZijlstra can I request your review of this PR since version 3.11 beta 1 is about to be released? It is a bug fix, not a new feature, but I am not sure it can be released in next betas.

@JelleZijlstra
Copy link
Member

I can try but the multiprocessing module is mostly black magic to me. I'll spend a little time trying to understand what's going on here.

Is it possible to add a unit test for this behavior? And do you think it's safe to backport to the 3.9 and 3.10 branches?

@geryogam
Copy link
Contributor Author

geryogam commented May 3, 2022

Is it possible to add a unit test for this behavior?

I could add #31913 (comment) as unit test but the problem is that it is subject to a race condition so the test would not be 100% reliable.

And do you think it's safe to backport to the 3.9 and 3.10 branches?

I tested this PR on Python 3.9.12 on my MacBook Pro so I think so.

@JelleZijlstra JelleZijlstra self-assigned this May 3, 2022
@JelleZijlstra JelleZijlstra added needs backport to 3.9 only security fixes needs backport to 3.10 only security fixes labels May 3, 2022
@JelleZijlstra JelleZijlstra merged commit dfb1b9d into python:main May 3, 2022
@miss-islington
Copy link
Contributor

Thanks @maggyero for the PR, and @JelleZijlstra for merging it 🌮🎉.. I'm working now to backport this PR to: 3.9, 3.10.
🐍🍒⛏🤖 I'm not a witch! I'm not a witch!

@bedevere-bot
Copy link

GH-92276 is a backport of this pull request to the 3.10 branch.

@bedevere-bot bedevere-bot removed needs backport to 3.10 only security fixes needs backport to 3.9 only security fixes labels May 3, 2022
@bedevere-bot
Copy link

GH-92277 is a backport of this pull request to the 3.9 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request May 3, 2022
…llection and explicit close (pythonGH-31913)

(cherry picked from commit dfb1b9d)

Co-authored-by: Géry Ogam <gery.ogam@gmail.com>
@geryogam
Copy link
Contributor Author

geryogam commented May 3, 2022

Thanks for the review and your amazing reactivity @JelleZijlstra!

@geryogam geryogam deleted the patch-27 branch May 3, 2022 23:55
miss-islington added a commit that referenced this pull request May 4, 2022
…llection and explicit close (GH-31913)

(cherry picked from commit dfb1b9d)

Co-authored-by: Géry Ogam <gery.ogam@gmail.com>
miss-islington added a commit that referenced this pull request May 4, 2022
…llection and explicit close (GH-31913)

(cherry picked from commit dfb1b9d)

Co-authored-by: Géry Ogam <gery.ogam@gmail.com>
hello-adam pushed a commit to hello-adam/cpython that referenced this pull request Jun 2, 2022
…llection and explicit close (pythonGH-31913)

(cherry picked from commit dfb1b9d)

Co-authored-by: Géry Ogam <gery.ogam@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants