Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-121313: Limit the reading size from pipes to their default buffer size on Unix systems #121315

Merged
Merged
Show file tree
Hide file tree
Changes from 12 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 11 additions & 1 deletion Lib/multiprocessing/connection.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
import time
import tempfile
import itertools
import stat
Copy link
Contributor

@cmaloney cmaloney Jul 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personal nitpick, PEP-8 doesn't seem to specify (https://peps.python.org/pep-0008/#imports), but I like imports to be alphabetical. itertools, time, and tempfile which were already in the code just above this are also out of order (although time and tempfile only slightly). Rest are in order. Not sure if it matters for Python core developer acceptance

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done



from . import util
Expand Down Expand Up @@ -391,8 +392,17 @@ def _recv(self, size, read=_read):
buf = io.BytesIO()
handle = self._handle
remaining = size
is_pipe = False
page_size = 0
if not _winapi:
page_size = os.sysconf(os.sysconf_names['SC_PAGESIZE'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than do the if not _winapi here, which has to be run/interpreted per _recv call, can you add the "calculate max size for a fifo" like https://github.com/python/cpython/blob/main/Lib/multiprocessing/connection.py#L370-L379 does to choose/define the standard read function? Code here will still need to do the min logic + "is this a fifo", but at least reduces overhead work a little bit further.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've shifted fetching the base page size and calculating the default pipe size to the existing if _winapi block above. Is this what you meant?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, looking good

if size > 16 * page_size:
mode = os.fstat(handle).st_mode
is_pipe = stat.S_ISFIFO(mode)
limit = 16 * page_size if is_pipe else remaining
while remaining > 0:
chunk = read(handle, remaining)
to_read = min(limit, remaining)
chunk = read(handle, to_read)
n = len(chunk)
if n == 0:
if remaining == size:
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Limit reading size in os.read for pipes to default pipe size in order to avoid memory overallocation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be updated from os.read -> multiprocessing to follow the logic location change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Loading