SSL session content bleeds into stdout with lots of threads #118138

Open
@sterwill

Bug report

Bug description:

I've been trying to track down a rare segfault I'm seeing with Python 3 (several versions) in AWS Lambda and AWS CodeBuild. It's been very hard to reproduce, but I have a few stack traces that show the crash happens in OpenSSL's certificate verification code below Python's ssl module. While trying to isolate the problem in OpenSSL, I stumbled on a different issue that might be related, which I'm reporting here.

The program below uses the standard library to execute a lot of HTTPS requests concurrently with threads. When run on my Linux desktop in gnome-terminal, the program doesn't exhibit any weird behavior. It prints one line of 200 X characters for each request it completes. However, if I pipe its output through a program like less or tee, I see additional output: raw bytes from the SSL network session bleeding into stdout. It feels like OpenSSL is writing into memory that Python is using to prepare strings or print to stdout, perhaps because of missing synchronization or refcounting, but I don't know OpenSSL or Python internals well, so this is just a guess.

Running python3 test-program.py | tee /dev/null makes it happen every time on my workstation. It never crashes; it just bleeds SSL contents into the output.
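For anyone who wants to check for the stray bytes without eyeballing the terminal, here is a minimal sketch that runs the program with stdout attached to a pipe (similar to piping through tee) and flags every line that is not the expected marker line. The helper names `find_unexpected_lines` and `run_and_check` are made up for this sketch, and the script name is assumed to be test-program.py as in the command above.

```python
import subprocess
import sys

# The line the test program prints once per completed request.
EXPECTED_LINE = b" " + b"X" * 200


def find_unexpected_lines(output: bytes) -> list:
    """Return every non-empty line of captured stdout that is not the marker line."""
    return [line for line in output.split(b"\n") if line and line != EXPECTED_LINE]


def run_and_check(script: str = "test-program.py") -> list:
    """Run the script with stdout connected to a pipe and return the odd lines."""
    # stdout=PIPE makes the child's stdout a pipe rather than a terminal,
    # which is the condition under which the report sees the problem.
    result = subprocess.run([sys.executable, script], stdout=subprocess.PIPE, check=False)
    return find_unexpected_lines(result.stdout)
```

Calling `run_and_check()` and printing `repr()` of the first few returned lines should show whether they look like TLS records. Note that output from many threads calling print can also interleave, so a flagged line is only a hint and still needs to be inspected.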

It happens on Ubuntu 22.04 x86_64 with Python 3.12 and OpenSSL 3.0.2, and on AWS CloudShell with Python 3.9.16 and OpenSSL 3.0.8.
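To compare environments when trying to reproduce this, a small sketch like the following prints the Python version and the OpenSSL version that the ssl module is linked against. The function name `environment_report` is hypothetical; the attributes it reads are from the standard library.

```python
import platform
import ssl
import sys


def environment_report() -> dict:
    """Collect the version details that matter for this report."""
    return {
        "python": sys.version.split()[0],   # interpreter version, e.g. 3.12.x
        "openssl": ssl.OPENSSL_VERSION,     # OpenSSL build used by the ssl module
        "platform": platform.platform(),    # OS and architecture summary
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```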

Feel free to run this program as often as you want with the URL I put in there (it's a small static web page on my personal site that won't mind a few thousand hits).

#!/usr/bin/env python3
import http.client
import logging
from concurrent.futures import ThreadPoolExecutor
from ssl import create_default_context, Purpose
from threading import Barrier

logger = logging.getLogger(__name__)

# Point this at your system's CA cert store.
CERT_FILE = '/tmp/cacert.pem'

# The problem reproduces reliably with 100 threads for me, and
# with more threads it happens even more times per execution.
#
# It also happens with a small number of threads (like 10) if
# you make do_one_request process lots of requests instead of one.
NUM_THREADS = 200

# The barrier is helpful for getting all the requests lined up
# to start together.  The problem still happens without a barrier
# if you run lots of requests in each thread instead of one.
b = Barrier(NUM_THREADS)


def do_one_request():
    try:
        # SSLContext is supposed to be thread-safe, but I allocate a new
        # one each time to rule out concurrent access to it being the problem.
        ssl_context = create_default_context(purpose=Purpose.SERVER_AUTH, cafile=CERT_FILE)
        b.wait()

        # Do a simple request and close the connection.  This URL is a small
        # static page behind AWS CloudFront owned by the bug reporter, so
        # don't feel bad about sending a few thousand requests there.  The
        # bug seems to happen with any valid HTTPS request.
        c = http.client.HTTPSConnection('tinfig.com', context=ssl_context)
        c.request('GET', 'https://tinfig.com/')
        _ = c.getresponse()
        c.close()

        # Print enough output so a pager or reading process might have to buffer.
        print(' ' + ('X' * 200))
    except Exception as e:
        logger.exception(e)


with ThreadPoolExecutor(max_workers=NUM_THREADS) as executor:
    for _ in range(NUM_THREADS):
        executor.submit(do_one_request)
    # Leaving the with block calls executor.shutdown(wait=True),
    # so every request has finished by the time the program exits.
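CERT_FILE in the program above has to point at a CA bundle that exists on the machine running it. As a rough aid, assuming the local OpenSSL build has usable defaults, this sketch lists candidate bundle paths using ssl.get_default_verify_paths(). The helper name `candidate_ca_files` is made up here, and the list may be empty on some systems.

```python
import ssl


def candidate_ca_files() -> list:
    """Return CA bundle paths that the local OpenSSL build knows about."""
    paths = ssl.get_default_verify_paths()
    # cafile is None when the default file does not exist;
    # openssl_cafile is the path compiled into OpenSSL, which may not exist either.
    candidates = []
    for path in (paths.cafile, paths.openssl_cafile):
        if path and path not in candidates:
            candidates.append(path)
    return candidates


for path in candidate_ca_files():
    print(path)
```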

CPython versions tested on:

3.9, 3.12

Operating systems tested on:

Linux
