typing.py: builtin LRU caches worsen leaks that exist in other code #98253

Closed
@wjakob

Bug report

I would like to report a refleak issue involving typing.py. The module internally uses LRU caches to cache certain type-related lookups, and these caches are not cleared when the Python interpreter shuts down. This causes leaks that impede software development and the debugging of refleaks in general.

This specific part of typing.py has been identified as a source of refleaks once before by @gvanrossum (context: https://bugs.python.org/issue28649).

The following provides a small reproducer via a trivial package (https://github.com/wjakob/typing_repro) that exposes a class named A using nanobind. Why nanobind? It is extremely paranoid about any leaks involving bound types, functions, and instances, and prints warning messages to tell the user about this after the interpreter has shut down (it performs checks following finalization using Py_AtExit()).

Preparation:

$ pip install git+https://github.com/wjakob/typing_repro

Reproducer:

from typing_repro import A

import pandas
import typing

def test(t: typing.Optional[A] = None):
    print(t)

Running this yields

nanobind: leaked 1 types!
 - leaked type "A"
nanobind: leaked 2 functions!
 - leaked function "add"
 - leaked function "__init__"
nanobind: this is likely caused by a reference counting issue in the binding code.

Note the import of pandas, which serves the role of a bigger package that uses the typing module and thereby populates the LRU caches. torch (PyTorch) and tensorflow also trigger the issue, as does markupsafe; other packages are likely affected as well.

EDIT: The problem that is common to all of these packages is that they leak some of their own types, for example by Py_INCREFing references to heap types within extension modules. Because these types use typing.py and thereby reference the LRU caches (which are never cleaned up), a single leaked type causes a flurry of refleaks that cascade into other packages.

Removing the test() function or removing the type annotation fixes the issue. The problem is that the declaration causes cache entries to be created that are never cleaned up, even when the interpreter finalizes.
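The effect of such a cache entry can be observed without nanobind. The following is a minimal sketch, not part of the original reproducer: it uses a plain Python class (also named A here, purely for illustration) and a weak reference to check whether the class is still alive after every ordinary reference to it is gone. It subscripts typing.Optional directly, which is what an eagerly evaluated annotation does.

```python
import gc
import typing
import weakref


class A:
    """Stand-in for the bound type; a plain Python class for illustration."""


ref = weakref.ref(A)

# Subscripting goes through typing's internal LRU cache, which stores the
# arguments (including A) and the resulting alias object.
alias = typing.Optional[A]

# Drop every reference this script holds.
del alias, A
gc.collect()

# The weak reference still resolves: the cache inside typing.py keeps A alive.
kept_alive = ref() is not None
print("A still alive after del:", kept_alive)
```

On CPython this is expected to report that the class is still alive, since the only remaining strong reference lives inside the typing.py cache.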

There is another way to avoid the issue: at the bottom of the script, insert

for f in typing._cleanups:
    f()

which clears the LRU caches in typing.py. Poof, errors gone. This leads me to suggest the following simple fix, to be added at the end of typing.py:

def _cleanup_handler():
    for f in _cleanups:
        f()

import atexit as _atexit
_atexit.register(_cleanup_handler)

This will clear the caches and ensure that interpreter finalization can avoid those type annotation-related leaks.
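Until such a change lands in typing.py itself, the same idea can be approximated from user code. The sketch below is an assumption-laden workaround rather than the proposed patch: the helper name clear_typing_caches and the class B are hypothetical, and typing._cleanups is a private attribute that may change between versions, so it is accessed defensively.

```python
import atexit
import gc
import typing
import weakref


def clear_typing_caches():
    """Run every cache-clearing callback that typing.py has registered."""
    # typing._cleanups is private; fall back to an empty list if it is absent.
    for clear in getattr(typing, "_cleanups", []):
        clear()


# Clear the caches when the interpreter exits, mirroring the proposed fix.
atexit.register(clear_typing_caches)


class B:
    """Hypothetical class used only to observe the effect."""


ref = weakref.ref(B)
typing.Optional[B]  # creates a cache entry that references B
del B
gc.collect()
held = ref() is not None  # the cache still holds B

clear_typing_caches()
gc.collect()
released = ref() is None  # B is freed once the caches are cleared
print("held before clearing:", held, "| released after clearing:", released)
```

Handlers registered with atexit run in reverse order of registration, so registering this one early means it runs late, after most other exit handlers have finished.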

Your environment

  • CPython versions tested on: 3.8.10 and 3.10.7
  • Operating system and architecture: Linux and macOS
