GH-120754: Make PY_READ_MAX smaller than max byteobject size by cmaloney · Pull Request #121633 · python/cpython

cmaloney · 2024-07-11T20:55:22Z

Currently, if code calls os.read with a size larger than the maximum bytes object length, the size is capped to _PY_READ_MAX, and the code then tries to allocate a PyBytes of that size, which fails with an OverflowError because the size exceeds what is allocatable.

Since os.read already caps the maximum size, cap it to a size that is always allocatable as a PyBytes.

This changes the behavior introduced in bpo-21932 and enables the large-file os.read test on 32-bit platforms, since the read should now be capped to a size the platform can accept.
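A rough Python sketch of the capping idea follows. It is only an illustration, not the C code in this PR: `HYPOTHETICAL_MAX_BYTES_LEN`, `capped_read_size`, and `read_up_to` are made-up names, and the cap value is an assumed stand-in for the real limit, which is computed in C. Because a capped read may return fewer bytes than requested, the sketch also shows a caller looping until it has what it needs.

```python
import os
import sys

# Assumption: a stand-in for the largest allocatable bytes object.
# The real limit lives in C and is somewhat below PY_SSIZE_T_MAX.
HYPOTHETICAL_MAX_BYTES_LEN = sys.maxsize - sys.getsizeof(b"")


def capped_read_size(requested, cap=HYPOTHETICAL_MAX_BYTES_LEN):
    """Return the size that would actually be passed to read()."""
    if requested < 0:
        raise ValueError("negative read size")
    return min(requested, cap)


def read_up_to(fd, size, chunk_size=1 << 20):
    """Read up to `size` bytes from `fd`, tolerating short reads."""
    chunks = []
    remaining = size
    while remaining > 0:
        # os.read may return fewer bytes than requested.
        chunk = os.read(fd, min(remaining, chunk_size))
        if not chunk:  # empty bytes means end of file
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Callers that need an exact byte count should loop as `read_up_to` does rather than rely on a single os.read call returning everything.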

Issue: Speed up open().read() pattern by reducing the number of system calls #120754


cmaloney · 2024-07-15T23:57:19Z

Misc/NEWS.d/next/Core and Builtins/2024-07-11-13-57-50.gh-issue-120754.C1HedA.rst

+Cap read size to smaller than the max BytesObject size. read() in POSIX
+returns at most the number of requested bytes; this updates Python ``os.read``
+to behave similarly and, rather than raising an OverflowError in this case,
+return a smaller-than-requested bytes object.


cmaloney · 2024-07-17T05:49:33Z

Lib/test/test_os.py

-    # Py_ssize_t type
-    @unittest.skipUnless(INT_MAX < PY_SSIZE_T_MAX,
-                         "needs INT_MAX < PY_SSIZE_T_MAX")
    @support.bigmemtest(size=INT_MAX + 10, memuse=1, dry_run=False)


The issue here is that this will result in a memory allocation error on 32-bit machines, and this bigmemtest effectively prevents the test from running on 32-bit machines (they are unlikely to have that much RAM).
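The property under test can be sketched at a much smaller scale, without needing gigabytes of RAM. This is a hypothetical helper (`read_with_oversized_request` is not part of test_os.py): it asks os.read for more bytes than a file holds and returns whatever comes back.

```python
import os
import tempfile


def read_with_oversized_request(payload, request_size=1 << 16):
    """Write `payload` to a temp file, then request more bytes than it holds."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.lseek(fd, 0, os.SEEK_SET)  # rewind before reading
        # The request exceeds the file size; only the available bytes come back.
        return os.read(fd, request_size)
    finally:
        os.close(fd)
        os.unlink(path)
```

The real test uses a request of INT_MAX + 10 bytes, which is why it is gated by bigmemtest.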

bedevere-app bot mentioned this pull request Jul 11, 2024

Speed up open().read() pattern by reducing the number of system calls #120754

Closed

Add news blurb

c60ae27

cmaloney changed the title ~~gh-120754: Make PY_READ_MAX smaller than max byteobject size~~ GH-120754: Make PY_READ_MAX smaller than max byteobject size Jul 11, 2024

cmaloney commented Jul 15, 2024


cmaloney added 2 commits July 16, 2024 22:39

tweak news

da8b55c

tweak news more

1f4d053

cmaloney commented Jul 17, 2024


cmaloney closed this Jul 17, 2024

cmaloney wants to merge 4 commits into python:main from cmaloney:cmaloney/os_read_large_32bit
