
Fatal Python error: _PyMem_DebugFree: Python memory allocator called without holding the GIL #325

Closed
@jamadden


We fixed something similar earlier, but this has a distinct message and backtrace.

This is a rare, threading-based issue at shutdown, and only when development mode is enabled.
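For anyone trying to match the CI configuration locally: development mode is what installs the debug hooks on the memory allocators, and those hooks are what perform the "called without holding the GIL" check. The sketch below only confirms that a child interpreter actually sees development mode, either through PYTHONDEVMODE=1 or through -X dev. The helper name child_dev_mode is made up for illustration; sys.flags.dev_mode is the standard flag.

```python
import os
import subprocess
import sys


def child_dev_mode(env_overrides=None, extra_args=()):
    """Run a child interpreter and report whether it sees development mode.

    Hypothetical helper for illustration; not part of greenlet or gevent.
    """
    env = dict(os.environ)
    env.pop("PYTHONDEVMODE", None)  # start from a known state
    env.update(env_overrides or {})
    out = subprocess.run(
        [sys.executable, *extra_args, "-c",
         "import sys; print(sys.flags.dev_mode)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return out.stdout.strip() == "True"


if __name__ == "__main__":
    print("env var:", child_dev_mode({"PYTHONDEVMODE": "1"}))
    print("-X dev: ", child_dev_mode(extra_args=("-X", "dev")))
    print("neither:", child_dev_mode())
```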

Occasionally, when CI runs python -W ignore -u dns_mass_resolve.py (usually launched from python -m gevent.tests.test__examples, itself run via python -m gevent.tests), the Ubuntu builds (which run with PYTHONDEVMODE=1) crash:

  Fatal Python error: _PyMem_DebugFree: Python memory allocator called without holding the GIL
    Python runtime state: finalizing (tstate=0x55b67891e5b0)
    
    Thread 0x00007f2cf58ca740 (most recent call first):
    <no Python frame>
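Because the crash is rare, one way to chase it outside CI is to rerun the command in a loop and stop at the first failure. This is a rough sketch, not what CI actually does; the function name run_until_failure and its parameters are assumptions for illustration.

```python
import os
import subprocess
import sys


def run_until_failure(cmd, attempts=50, timeout=300, env=None):
    """Run cmd repeatedly; return the first failing result, or None.

    A run counts as failing if it exits non-zero or if its stderr
    contains the fatal-error banner.
    """
    for _ in range(attempts):
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout, env=env
        )
        if result.returncode != 0 or "Fatal Python error" in result.stderr:
            return result
    return None


if __name__ == "__main__":
    # Assumes the current directory contains dns_mass_resolve.py.
    env = dict(os.environ, PYTHONDEVMODE="1")
    cmd = [sys.executable, "-W", "ignore", "-u", "dns_mass_resolve.py"]
    failed = run_until_failure(cmd, env=env)
    if failed is not None:
        print(failed.returncode)
        print(failed.stderr)
```

On Linux, an abort like the one above shows up as a negative return code (the signal number negated), so the non-zero check catches it even if stderr is lost.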

That test runs multiple greenlets in multiple threads, and exits the main thread at an arbitrary time, when many of the other threads are still busy.
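The shape of that shutdown can be sketched with the standard library alone. This is only an illustration of "main thread exits while other threads are still busy" under development mode. It does not use greenlet or gevent, it assumes daemon threads as a stand-in for threads the interpreter does not join, and it is not expected to reproduce the crash.

```python
import subprocess
import sys
import textwrap

# Child program: start workers the interpreter will not wait for, then
# let the main thread fall off the end while they are still sleeping.
CHILD = textwrap.dedent("""
    import threading, time

    def worker():
        # Stand-in for a thread blocked in a socket call.
        time.sleep(60)

    for _ in range(4):
        threading.Thread(target=worker, daemon=True).start()
    time.sleep(0.1)  # give the workers time to start, then exit
""")


def run_child():
    # -X dev enables development mode, like PYTHONDEVMODE=1.
    return subprocess.run(
        [sys.executable, "-X", "dev", "-c", CHILD],
        capture_output=True,
        text=True,
        timeout=60,
    )


if __name__ == "__main__":
    result = run_child()
    print(result.returncode)
    print(result.stderr.strip())
```

With plain threads this should exit cleanly, which is the point of comparison: the fatal error only appears once greenlets are being torn down in those threads during finalization.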

In one core dump, I observe the main thread finalizing the interpreter. It's manipulating a dict, so it should be holding the GIL:

(gdb) thread 7
[Switching to thread 7 (Thread 0x7f11e4a9b000 (LWP 47501))]
#0  0x00000000004f62a2 in unicode_get_hash (o='GREENLET_USE_STANDARD_THREADING') at ../Objects/dictobject.c:288
288	../Objects/dictobject.c: No such file or directory.
(gdb) bt
#0  0x00000000004f62a2 in unicode_get_hash (o='GREENLET_USE_STANDARD_THREADING') at ../Objects/dictobject.c:288
#1  _PyDict_Next (op=<optimized out>, ppos=0x7ffc92d9c6c8, pkey=0x7ffc92d9c6c0, pvalue=0x7ffc92d9c6b8, phash=phash@entry=0x0) at ../Objects/dictobject.c:2155
#2  0x00000000004f637b in PyDict_Next (op=<optimized out>, ppos=<optimized out>, pkey=<optimized out>, pvalue=<optimized out>) at ../Objects/dictobject.c:2202
#3  0x0000000000512327 in _PyModule_ClearDict (
    d={'__name__': 'greenlet._greenlet', '__doc__': None, '__package__': 'greenlet', '__loader__': <ExtensionFileLoader(name='greenlet._greenlet', path='/greenlet/src/greenlet/_greenlet.cpython-311d-x86_64-linux-gnu.so') at remote 0x7f11e45aa680>, '__spec__': <ModuleSpec(name='greenlet._greenlet', loader=<...>, origin='/greenlet/src/greenlet/_greenlet.cpython-311d-x86_64-linux-gnu.so', loader_state=None, submodule_search_locations=None, _uninitialized_submodules=[], _set_fileattr=True, _cached=None, _initializing=False) at remote 0x7f11e45aa5e0>, 'getcurrent': <built-in method getcurrent of module object at remote 0x7f11e45b07d0>, 'settrace': <built-in method settrace of module object at remote 0x7f11e45b07d0>, 'gettrace': <built-in method gettrace of module object at remote 0x7f11e45b07d0>, 'set_thread_local': <built-in method set_thread_local of module object at remote 0x7f11e45b07d0>, 'get_pending_cleanup_count': <built-in method get_pending_cleanup_count of module object at remote 0x7f11e45b07d0>, 'get_total_ma...(truncated)) at ../Objects/moduleobject.c:602
#4  0x00000000005126c8 in _PyModule_Clear (m=<optimized out>) at ../Objects/moduleobject.c:582
#5  0x0000000000624aaf in finalize_modules_clear_weaklist (interp=interp@entry=0xb2a1d8 <_PyRuntime+58936>,
    weaklist=weaklist@entry=[('sys', <weakref.ReferenceType at remote 0x7f11e06d7bd0>), ('builtins', <weakref.ReferenceType at remote 0x7f11e06d7cb0>), ('_frozen_importlib', <weakref.ReferenceType at remote 0x7f11e06d7d20>), ('_imp', <weakref.ReferenceType at remote 0x7f11e06d7850>), ('_thread', <weakref.ReferenceType at remote 0x7f11e06d77e0>), ('_warnings', <weakref.ReferenceType at remote 0x7f11e06d7f50>), ('_weakref', <weakref.ReferenceType at remote 0x7f11e06d7c40>), ('_io', <weakref.ReferenceType at remote 0x7f11e06d4d70>), ('marshal', <weakref.ReferenceType at remote 0x7f11e06a7a80>), ('posix', <weakref.ReferenceType at remote 0x7f11e06a7bd0>), ('_frozen_importlib_external', <weakref.ReferenceType at remote 0x7f11e06a41a0>), ('time', <weakref.ReferenceType at remote 0x7f11e06a7af0>), ('zipimport', <weakref.ReferenceType at remote 0x7f11e47c81a0>), ('faulthandler', <weakref.ReferenceType at remote 0x7f11e408a820>), ('_codecs', <weakref.ReferenceType at remote 0x7f11e408a5f0>), ('codecs', <weakref.ReferenceType at remote 0x7f11e408...(truncated), verbose=verbose@entry=0) at ../Python/pylifecycle.c:1497
#6  0x0000000000624c6c in finalize_modules (tstate=tstate@entry=0xb44558 <_PyRuntime+166328>) at ../Python/pylifecycle.c:1579
#7  0x0000000000625b4a in Py_FinalizeEx () at ../Python/pylifecycle.c:1831
#8  0x0000000000650c6e in Py_RunMain () at ../Modules/main.c:682
#9  0x0000000000650cbe in pymain_main (args=args@entry=0x7ffc92d9c7f0) at ../Modules/main.c:710
#10 0x0000000000650d42 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at ../Modules/main.c:734
#11 0x00000000004248ef in main (argc=<optimized out>, argv=<optimized out>) at ../Programs/python.c:15

The thread that crashed is deallocating an object at the exit of a greenlet. src/greenlet/greenlet.cpp:1296 is the closing brace of g_initialstub, so we are destructing stack-based C++ objects; in this case, probably the reference to the run() function:

(gdb) thread 1
[Switching to thread 1 (Thread 0x7f11e3fb5640 (LWP 47503))]
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=139714816071232) at ./nptl/pthread_kill.c:44
44	in ./nptl/pthread_kill.c
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=139714816071232) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=139714816071232) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=139714816071232, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007f11e4ade476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007f11e4ac47f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x0000000000625f20 in fatal_error_exit (status=status@entry=-1) at ../Python/pylifecycle.c:2624
#6  0x000000000062c18f in fatal_error (fd=<optimized out>, header=header@entry=1, prefix=prefix@entry=0x78b030 <__func__.13.lto_priv.1> "_PyMem_DebugFree", msg=msg@entry=0x78a8d0 "Python memory allocator called without holding the GIL", status=status@entry=-1) at ../Python/pylifecycle.c:2735
#7  0x000000000062c2b4 in _Py_FatalErrorFunc (func=0x78b030 <__func__.13.lto_priv.1> "_PyMem_DebugFree", msg=0x78a8d0 "Python memory allocator called without holding the GIL") at ../Python/pylifecycle.c:2821
#8  0x000000000051883d in _PyMem_DebugCheckGIL (func=0x78b030 <__func__.13.lto_priv.1> "_PyMem_DebugFree") at ../Objects/obmalloc.c:2683
#9  _PyMem_DebugFree (ctx=0xa23290 <_PyMem_Debug+48>, ptr=0xcf3a40) at ../Objects/obmalloc.c:2707
#10 0x000000000050fde3 in PyMem_Free (ptr=<optimized out>) at ../Objects/obmalloc.c:652
#11 0x00000000006d1ce1 in _PyFaulthandler_Fini () at ../Modules/faulthandler.c:1433
#12 0x000000000062c23e in fatal_error (fd=2, header=header@entry=1, prefix=prefix@entry=0x78b030 <__func__.13.lto_priv.1> "_PyMem_DebugFree", msg=msg@entry=0x78a8d0 "Python memory allocator called without holding the GIL", status=status@entry=-1) at ../Python/pylifecycle.c:2793
#13 0x000000000062c2b4 in _Py_FatalErrorFunc (func=0x78b030 <__func__.13.lto_priv.1> "_PyMem_DebugFree", msg=0x78a8d0 "Python memory allocator called without holding the GIL") at ../Python/pylifecycle.c:2821
#14 0x000000000051883d in _PyMem_DebugCheckGIL (func=0x78b030 <__func__.13.lto_priv.1> "_PyMem_DebugFree") at ../Objects/obmalloc.c:2683
#15 _PyMem_DebugFree (ctx=0xa232c0 <_PyMem_Debug+96>, ptr=0x7f11e0694ac0) at ../Objects/obmalloc.c:2707
#16 0x000000000050fe53 in PyObject_Free (ptr=<optimized out>) at ../Objects/obmalloc.c:741
#17 0x0000000000653592 in PyObject_GC_Del (op=<optimized out>) at ../Modules/gcmodule.c:2363
#18 0x00000000004bb2d5 in method_dealloc (im=0x7f11e0694ad0) at ../Objects/classobject.c:242
#19 0x0000000000513103 in _Py_Dealloc (op=<optimized out>) at ../Objects/object.c:2389
#20 0x00007f11e4558436 in Py_DECREF (op=<optimized out>, lineno=399, filename=0x7f11e455b068 "src/greenlet/greenlet_refs.hpp") at /usr/include/python3.11d/object.h:527
#21 greenlet::refs::OwnedReference<_object, &greenlet::refs::NoOpChecker>::~OwnedReference (this=<optimized out>, __in_chrg=<optimized out>) at src/greenlet/greenlet_refs.hpp:399
#22 0x00007f11e4556e13 in greenlet::UserGreenlet::g_initialstub (this=<optimized out>, mark=0x7f11e3fb49c8) at src/greenlet/greenlet.cpp:1296
#23 0x00007f11e4555fb7 in greenlet::UserGreenlet::g_switch (this=0x7f11e4021380) at src/greenlet/greenlet.cpp:1112
#24 0x00007f11e45574f1 in green_switch (self=0x7f11e3ff4aa0, args=<optimized out>, kwargs=<optimized out>) at src/greenlet/greenlet.cpp:2239
#25 0x00000000004c4e41 in method_vectorcall_VARARGS_KEYWORDS (func=<method_descriptor at remote 0x7f11e45b1310>, args=0x7f11e063f088, nargsf=<optimized out>, kwnames=<optimized out>) at ../Objects/descrobject.c:364
#26 0x00000000004b88bc in _PyObject_VectorcallTstate (tstate=0xd5b770, callable=<method_descriptor at remote 0x7f11e45b1310>, args=0x7f11e063f088, nargsf=9223372036854775809, kwnames=0x0) at ../Include/internal/pycore_call.h:92
#27 0x00000000004b89e3 in PyObject_Vectorcall (callable=<optimized out>, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at ../Objects/call.c:299
#28 0x00000000005d5692 in _PyEval_EvalFrameDefault (tstate=0xd5b770, frame=0x7f11e063f020, throwflag=<optimized out>) at ../Python/ceval.c:4772
#29 0x00000000005dd02b in _PyEval_EvalFrame (throwflag=0, frame=0x7f11e063f020, tstate=0xd5b770) at ../Include/internal/pycore_ceval.h:73
#30 _PyEval_Vector (tstate=0xd5b770, func=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at ../Python/ceval.c:6428
#31 0x00000000004b4016 in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at ../Objects/call.c:393
#32 0x00000000004b8d3d in _PyObject_VectorcallTstate (tstate=tstate@entry=0xd5b770, callable=callable@entry=<function at remote 0x7f11e4046780>, args=args@entry=0x7f11e3fb4d18, nargsf=nargsf@entry=1, kwnames=kwnames@entry=0x0) at ../Include/internal/pycore_call.h:92
#33 0x00000000004ba96e in method_vectorcall (method=<optimized out>, args=0xb2a1d0 <_PyRuntime+58928>, nargsf=<optimized out>, kwnames=0x0) at ../Objects/classobject.c:67
#34 0x00000000004ba2d9 in _PyVectorcall_Call (tstate=tstate@entry=0xd5b770, func=0x4ba7fa <method_vectorcall>, callable=callable@entry=<method at remote 0x7f11e404a6f0>, tuple=tuple@entry=(), kwargs=kwargs@entry=0x0) at ../Objects/call.c:245
#35 0x00000000004ba5df in _PyObject_Call (tstate=0xd5b770, callable=<method at remote 0x7f11e404a6f0>, args=(), kwargs=0x0) at ../Objects/call.c:328
#36 0x00000000004ba648 in PyObject_Call (callable=<optimized out>, args=<optimized out>, kwargs=<optimized out>) at ../Objects/call.c:355
#37 0x000000000072e004 in thread_run (boot_raw=boot_raw@entry=0x7f11e4037510) at ../Modules/_threadmodule.c:1082
#38 0x0000000000638cf1 in pythread_wrapper (arg=<optimized out>) at ../Python/thread_pthread.h:241
#39 0x00007f11e4b30b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#40 0x00007f11e4bc1bb4 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

All the other threads are blocked in the poll() system call via a socket function, and so shouldn't be holding the GIL.

The question is, how are we exiting a greenlet while not holding the GIL? By definition, we shouldn't have been able to switch to it without holding the GIL.
