Description
Bug report
I would like to report a refleak issue involving typing.py. The issue is that it internally uses LRU caches to cache certain type-related lookups, and these caches are not cleaned up when the Python interpreter shuts down. This causes leaks that impede software development and debugging of refleaks in general.
This specific part of typing.py has already once been identified as a source of refleaks by @gvanrossum (context: https://bugs.python.org/issue28649).
The following provides a small reproducer via a trivial package (https://github.com/wjakob/typing_repro) that exposes a class named A using nanobind. Why nanobind? It is extremely paranoid about any leaks involving bound types, functions, and instances, and prints warning messages to tell the user about this after the interpreter has shut down (it performs checks following finalization using Py_AtExit()).
Preparation:
$ pip install git+https://github.com/wjakob/typing_repro
Reproducer:
from typing_repro import A
import pandas
import typing
def test(t: typing.Optional[A] = None):
    print(t)
Running this yields
nanobind: leaked 1 types!
- leaked type "A"
nanobind: leaked 2 functions!
- leaked function "add"
- leaked function "__init__"
nanobind: this is likely caused by a reference counting issue in the binding code.
Note the import of pandas, which serves the role of a bigger package that uses the typing module and thereby populates the LRU caches. torch (PyTorch) or tensorflow also cause the issue, as does markupsafe, and others are likely affected as well.
EDIT: The problem that is common to all of these packages is that they leak some of their own types, for example by Py_INCREFing references to heap types within extension modules. Because these types use typing.py and thereby reference the LRU caches (which are never cleaned up), this causes a flurry of refleaks that cascade into other packages.
Removing the test() function or removing the type annotation fixes the issue. The problem is that the declaration causes cache entries to be created that are never cleaned up, even when the interpreter finalizes.
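The effect of the annotation can be observed without nanobind. The following is a minimal sketch, not part of the original reproducer: it uses a plain Python class as a stand-in for the bound type A, evaluates typing.Optional[A] directly (the same expression as the annotation above), and uses a weak reference to check whether the class is still alive once every user-visible reference is gone. The helper name class_survives_after_subscript is hypothetical.

```python
import gc
import typing
import weakref


def class_survives_after_subscript():
    """Return True if a class is still alive after its last user reference is dropped."""

    class A:  # plain-Python stand-in for the nanobind type from the report
        pass

    # Same expression as the annotation on test(); evaluating it goes
    # through the cached subscription path in typing.py.
    hint = typing.Optional[A]
    ref = weakref.ref(A)

    # Drop every reference this code holds, then collect reference cycles.
    del hint, A
    gc.collect()

    # If the class is still reachable, something else (here: a typing cache)
    # is holding on to it.
    return ref() is not None


print(class_survives_after_subscript())
```

If the class is reported as still alive, the only remaining owner is a cache entry inside typing.py, which is the situation the nanobind leak check complains about at shutdown.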
There is another way to avoid the issue: at the bottom of the script, insert
for f in typing._cleanups:
    f()
which clears the LRU caches in typing.py. Poof, errors gone. This leads me to suggest the following simple fix, to be added at the end of typing.py:
def _cleanup_handler():
    for f in _cleanups:
        f()

import atexit as _atexit
_atexit.register(_cleanup_handler)
This will clear the caches and ensure that interpreter finalization can avoid those type annotation-related leaks.
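Until such a change lands in typing.py, the same idea can be applied from user code. The sketch below is an assumption-laden workaround rather than the proposed patch itself: it relies on the private typing._cleanups list mentioned above (guarded with getattr in case it is absent), and the names clear_typing_caches and released_after_clear are hypothetical. The second function checks, with a plain Python class standing in for the bound type, that the class becomes collectable once the caches are cleared.

```python
import atexit
import gc
import typing
import weakref


def clear_typing_caches():
    """Call every cache-clearing hook that typing.py has registered."""
    # typing._cleanups is a private attribute; fall back to an empty
    # tuple if a given Python version does not provide it.
    for clear in getattr(typing, "_cleanups", ()):
        clear()


# atexit handlers run at the start of interpreter finalization, i.e. before
# callbacks registered through Py_AtExit() are invoked.
atexit.register(clear_typing_caches)


def released_after_clear():
    """Return True if a class used in typing.Optional[...] is freed after clearing."""

    class A:  # plain-Python stand-in for the bound type
        pass

    hint = typing.Optional[A]  # populates the typing caches
    ref = weakref.ref(A)

    del hint, A
    clear_typing_caches()
    gc.collect()  # the class is part of reference cycles, so collect them

    return ref() is None


print(released_after_clear())
```

Registering the handler from the application only helps scripts that import it, which is why placing the equivalent code in typing.py itself is the more general fix.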
Your environment
- CPython versions tested on: 3.8.10 and 3.10.7
- Operating system and architecture: Linux and macOS