IOBasePayload reads remaining_content_length bytes at once instead of chunking #11138

@Eijebong

Describe the bug

After upgrading aiohttp from 3.11.18 to 3.12.7, some of our workloads started running out of memory (OOM). After investigating, it looks like aiohttp now reads the whole file into memory when a file object is passed as data=fd.

After writing a test case similar to our code that reproduced the issue, I bisected it to the following commit:

commit 0c161025b5f0f15f13e66cc1cba906e2428cc276 (HEAD)
Author: J. Nick Koston <nick@koston.org>
Date:   Thu May 22 10:22:46 2025 -0500

    [PR #10915/545783b backport][3.12] Fix connection reuse for file-like data payloads (#10947)

To Reproduce

  • Get the two files below
  • mkdir -p data && for i in $(seq 0 4); do dd if=/dev/urandom of=data/${i}.dat bs=1M count=1000; done
  • Run server.py. It only provides an endpoint to talk to and is otherwise irrelevant to the issue
  • valgrind --tool=massif --massif-out-file=massif.out python client.py && massif-visualizer massif.out

client.py

import aiohttp
import asyncio

PUT_URL = "http://127.0.0.1:8080"


async def create_artifact(path, session):
    with open(path, "rb") as fh:
        print(f"Started upload for {path}")
        async with session.put(
            PUT_URL,
            data=fh,
            compress=False,
        ) as resp:
            response_text = await resp.text()
            print(response_text)
            if resp.status not in (200, 204):
                raise RuntimeError("Bad status {}".format(resp.status))


async def upload_all(paths):
    async with aiohttp.ClientSession() as session:
        def post_one(path):
            return asyncio.ensure_future(create_artifact(path, session))

        tasks = list(map(post_one, paths))
        await raise_future_exceptions(tasks)


async def raise_future_exceptions(tasks):
    succeeded_results = []

    if tasks:
        await asyncio.wait(tasks)
        for task in tasks:
            exc = task.exception()
            if exc:
                raise exc
            else:
                succeeded_results.append(task.result())

    return succeeded_results

asyncio.run(upload_all([f"data/{i}.dat" for i in range(0, 5)]))

server.py

from aiohttp import web

async def hello(request):
    data = await request.read()
    print(len(data))
    return web.Response(text="Hello, world")

app = web.Application(client_max_size=2**40)
app.add_routes([web.put('/', hello)])
web.run_app(app)

Expected behavior

I'd expect the files to be sent in chunks rather than read into memory in full.
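To illustrate the difference, here is a minimal sketch of the two read patterns. It is not aiohttp's actual code: the function names and CHUNK_SIZE are hypothetical, and `remaining` simply stands in for the remaining content length mentioned in the title. The first function holds the whole remainder of the file in memory at once, while the second never holds more than one chunk.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, chosen for illustration only


def read_all_at_once(fh, remaining):
    """Single read of everything that is left: peak memory is about `remaining` bytes."""
    return [fh.read(remaining)]


def read_in_chunks(fh, remaining, chunk_size=CHUNK_SIZE):
    """Yield pieces of at most `chunk_size` bytes until `remaining` bytes were read."""
    while remaining > 0:
        # Cap each read so that a large `remaining` never turns into one huge read.
        chunk = fh.read(min(chunk_size, remaining))
        if not chunk:  # the file ended earlier than expected
            break
        remaining -= len(chunk)
        yield chunk


if __name__ == "__main__":
    payload = bytes(256 * 1024)  # small in-memory stand-in for one of the .dat files
    sizes = [len(c) for c in read_in_chunks(io.BytesIO(payload), len(payload))]
    print("largest chunked read:", max(sizes))
    print("single read:", len(read_all_at_once(io.BytesIO(payload), len(payload))[0]))
```

With five 1000 MiB files uploaded concurrently, the first pattern would need roughly the combined file size in memory, whereas the second stays at a few chunks.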

Massif output with 3.11.18

Image

Massif output with 0c16102 and onwards

Image

Logs/tracebacks

See the screenshots above. I'd attach the massif output here, but GitHub isn't letting me for some reason.

Python Version

$ python --version

Python 3.11.11

aiohttp Version

$ python -m pip show aiohttp

The first bad commit for aiohttp is 0c161025b5f0f15f13e66cc1cba906e2428cc276

Nothing between 3.11.18 and that commit exhibits this behavior. Everything from that commit onwards does.

multidict Version

$ python -m pip show multidict

Name: multidict
Version: 6.4.4
Location: /home/eijemoz/test-aiohttp/.venv/lib/python3.11/site-packages
Requires:
Required-by: aiohttp, yarl

propcache Version

$ python -m pip show propcache


Name: propcache
Version: 0.3.1
Location: /home/eijemoz/test-aiohttp/.venv/lib/python3.11/site-packages
Requires:
Required-by: aiohttp, yarl

yarl Version

$ python -m pip show yarl


Name: yarl
Version: 1.20.0
Location: /home/eijemoz/test-aiohttp/.venv/lib/python3.11/site-packages
Requires: idna, multidict, propcache
Required-by: aiohttp

OS

Arch Linux

Related component

Client

Additional context

No response

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
