Skip to content

Deadlock in test_opcache with gh-131174 applied #133130

Open
@nascheme

Description

@nascheme

Bug report

This is to document the deadlock that occurs with gh-131174 applied. It seems a bit hard to reproduce and I had to repeatedly run the tests. When I got it to finally deadlock, the stack-trace for the three threads are as follows. Note that update_slots_callback() runs with the world stopped.

Thread 1:

[...]
#6  0x000055e8fa1ab214 in _PySemaphore_PlatformWait (sema=sema@entry=0x7ffefa17dda0, timeout=timeout@entry=-1)
    at Python/parking_lot.c:162
#7  0x000055e8fa1ab55c in _PySemaphore_Wait (sema=sema@entry=0x7ffefa17dda0, timeout=timeout@entry=-1,
    detach=detach@entry=0) at Python/parking_lot.c:232
#8  0x000055e8fa1ab6e1 in _PyParkingLot_Park (addr=addr@entry=0x55e8fa5a186c <_PyRuntime+361772>,
    expected=expected@entry=0x7ffefa17de24, size=size@entry=4, timeout_ns=timeout_ns@entry=-1,
    park_arg=park_arg@entry=0x0, detach=detach@entry=0) at Python/parking_lot.c:335
#9  0x000055e8fa1c1c43 in tstate_wait_attach (tstate=tstate@entry=0x55e8fa5a1840 <_PyRuntime+361728>)
    at Python/pystate.c:2145
#10 0x000055e8fa1c1df0 in _PyThreadState_Attach (tstate=tstate@entry=0x55e8fa5a1840 <_PyRuntime+361728>)
    at Python/pystate.c:2180
#11 0x000055e8fa17048f in PyEval_AcquireThread (tstate=tstate@entry=0x55e8fa5a1840 <_PyRuntime+361728>)
    at Python/ceval_gil.c:605
#12 0x000055e8fa1ab56b in _PySemaphore_Wait (sema=sema@entry=0x7ffefa17df00, timeout=timeout@entry=-1,
    detach=detach@entry=1) at Python/parking_lot.c:234
#13 0x000055e8fa1ab6e1 in _PyParkingLot_Park (addr=addr@entry=0x55e8fa599160 <_PyRuntime+327200>,
    expected=expected@entry=0x7ffefa17df97, size=size@entry=1, timeout_ns=timeout_ns@entry=-1,
    park_arg=park_arg@entry=0x7ffefa17dfa0, detach=detach@entry=1) at Python/parking_lot.c:335
#14 0x000055e8fa19fb6a in _PyMutex_LockTimed (m=m@entry=0x55e8fa599160 <_PyRuntime+327200>, timeout=timeout@entry=-1,
    flags=flags@entry=_PY_LOCK_DETACH) at Python/lock.c:108
#15 0x000055e8fa1a0970 in PyMutex_Lock (m=m@entry=0x55e8fa599160 <_PyRuntime+327200>) at Python/lock.c:611
#16 0x000055e8fa1435c6 in _PyMutex_Lock (m=m@entry=0x55e8fa599160 <_PyRuntime+327200>) at ./Include/cpython/lock.h:49
#17 0x000055e8fa14378f in _PyCriticalSection_BeginSlow (c=c@entry=0x7ffefa17e110,
    m=m@entry=0x55e8fa599160 <_PyRuntime+327200>) at Python/critical_section.c:37
#18 0x000055e8f9fe8918 in _PyCriticalSection_BeginMutex (c=c@entry=0x7ffefa17e110,
    m=0x55e8fa599160 <_PyRuntime+327200>) at ./Include/internal/pycore_critical_section.h:119
#19 0x000055e8fa000264 in _PyType_LookupStackRefAndVersion (type=0x7f06d8afc810,
    name=name@entry=0x55e8fa558f28 <_PyRuntime+64488>, out=out@entry=0x7f071b60d190) at Objects/typeobject.c:5891
[...]

Thread 2:

[...]
#6  0x000055e8fa1ab214 in _PySemaphore_PlatformWait (sema=sema@entry=0x7f06cc4fd500, timeout=timeout@entry=-1)
    at Python/parking_lot.c:162
#7  0x000055e8fa1ab55c in _PySemaphore_Wait (sema=sema@entry=0x7f06cc4fd500, timeout=timeout@entry=-1,
    detach=detach@entry=0) at Python/parking_lot.c:232
#8  0x000055e8fa1ab6e1 in _PyParkingLot_Park (addr=addr@entry=0x72a40000f03c, expected=expected@entry=0x7f06cc4fd584,
    size=size@entry=4, timeout_ns=timeout_ns@entry=-1, park_arg=park_arg@entry=0x0, detach=detach@entry=0)
    at Python/parking_lot.c:335
#9  0x000055e8fa1c1c43 in tstate_wait_attach (tstate=tstate@entry=0x72a40000f010) at Python/pystate.c:2145
#10 0x000055e8fa1c1df0 in _PyThreadState_Attach (tstate=tstate@entry=0x72a40000f010) at Python/pystate.c:2180
#11 0x000055e8fa17048f in PyEval_AcquireThread (tstate=tstate@entry=0x72a40000f010) at Python/ceval_gil.c:605
#12 0x000055e8fa290c0f in thread_run (boot_raw=boot_raw@entry=0x7214000012d0) at ./Modules/_threadmodule.c:350
#13 0x000055e8fa1e682d in pythread_wrapper (arg=0x720c00000b80) at Python/thread_pthread.h:242
#14 0x00007f071c601def in __tsan_thread_start_func (arg=0x7ffefa17dd60)
    at ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1012
#15 0x00007f071c354aa4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#16 0x00007f071c3e1c3c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
[...]

Thread 3

[...]
#6  0x000055e8fa1ab214 in _PySemaphore_PlatformWait (sema=sema@entry=0x7f06cbcfb4a0, timeout=timeout@entry=-1)
    at Python/parking_lot.c:162
#7  0x000055e8fa1ab55c in _PySemaphore_Wait (sema=sema@entry=0x7f06cbcfb4a0, timeout=timeout@entry=-1,
    detach=detach@entry=1) at Python/parking_lot.c:232
#8  0x000055e8fa1ab6e1 in _PyParkingLot_Park (addr=addr@entry=0x55e8fa599160 <_PyRuntime+327200>,
    expected=expected@entry=0x7f06cbcfb537, size=size@entry=1, timeout_ns=timeout_ns@entry=-1,
    park_arg=park_arg@entry=0x7f06cbcfb540, detach=detach@entry=1) at Python/parking_lot.c:335
#9  0x000055e8fa19fb6a in _PyMutex_LockTimed (m=m@entry=0x55e8fa599160 <_PyRuntime+327200>, timeout=timeout@entry=-1,
    flags=flags@entry=_PY_LOCK_DETACH) at Python/lock.c:108
#10 0x000055e8fa1a0970 in PyMutex_Lock (m=m@entry=0x55e8fa599160 <_PyRuntime+327200>) at Python/lock.c:611
#11 0x000055e8fa1435c6 in _PyMutex_Lock (m=m@entry=0x55e8fa599160 <_PyRuntime+327200>) at ./Include/cpython/lock.h:49
#12 0x000055e8fa143bf5 in _PyCriticalSection_Resume (tstate=tstate@entry=0x72a400019010)
    at Python/critical_section.c:109
#13 0x000055e8f9f805e2 in _PyCriticalSection_Pop (c=c@entry=0x7f06cbcfb670)
    at ./Include/internal/pycore_critical_section.h:141
#14 0x000055e8f9f80618 in _PyCriticalSection_End (c=c@entry=0x7f06cbcfb670)
    at ./Include/internal/pycore_critical_section.h:155
#15 0x000055e8f9f806c7 in ensure_shared_on_read (mp=mp@entry=0x7f06d94e1030) at Objects/dictobject.c:1326
#16 0x000055e8f9f860a4 in _Py_dict_lookup_threadsafe (mp=mp@entry=0x7f06d94e1030,
    key=key@entry=0x55e8fa5592b8 <_PyRuntime+65400>, hash=hash@entry=5605474849373301295,
    value_addr=value_addr@entry=0x7f06cbcfb740) at Objects/dictobject.c:1497
#17 0x000055e8f9f86e3f in _PyDict_GetItemRef_KnownHash (op=op@entry=0x7f06d94e1030,
    key=key@entry=0x55e8fa5592b8 <_PyRuntime+65400>, hash=hash@entry=5605474849373301295,
    result=result@entry=0x7f06cbcfb7a0) at Objects/dictobject.c:2370
#18 0x000055e8f9ffff81 in find_name_in_mro (type=type@entry=0x7f06d9691410, name=0x55e8fa5592b8 <_PyRuntime+65400>,
    error=error@entry=0x7f06cbcfb824) at Objects/typeobject.c:5721
#19 0x000055e8fa007d9b in update_one_slot (type=type@entry=0x7f06d9691410, p=0x55e8fa5259a0 <slotdefs>)
    at Objects/typeobject.c:11292
#20 0x000055e8fa007f9b in update_slots_callback (type=type@entry=0x7f06d9691410, data=data@entry=0x7f06cbcfb8c0)
    at Objects/typeobject.c:11390
[...]

Inside the _PyCriticalSection_Resume of the 3rd thread, it's waiting on m1:

#12 0x000055e8fa143bf5 in _PyCriticalSection_Resume (tstate=tstate@entry=0x72a400019010)
    at Python/critical_section.c:109
109             PyMutex_Lock(m1);
(gdb) l
104             m2 = c2->_cs_mutex2;
105             c2->_cs_mutex2 = NULL;
106         }
107
108         if (m1) {
109             PyMutex_Lock(m1);
110         }
111         if (m2) {
112             PyMutex_Lock(m2);
113         }
(gdb) p *m1
$9 = {_bits = 3 '\003'}
(gdb) p *m2
$10 = {_bits = 0 '\000'}

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions