Data race in intern_common when interning str objects in the free threading build #129701

Closed
@colesbury

Bug report

Running ./python -m test test_exceptions --parallel-threads=10 in a TSAN build reports the following data race:

WARNING: ThreadSanitizer: data race (pid=763025)
  Atomic read of size 4 at 0x7fffbe0718cc by thread T190:
    #0 _Py_atomic_load_uint32_relaxed Include/cpython/pyatomic_gcc.h:367 (python+0x1d386e) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
    #1 Py_INCREF Include/refcount.h:261 (python+0x1d386e)
    #2 _Py_NewRef Include/refcount.h:518 (python+0x1d386e)
    #3 dict_setdefault_ref_lock_held Objects/dictobject.c:4386 (python+0x1e6817) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
    #4 PyDict_SetDefaultRef Objects/dictobject.c:4403 (python+0x1e6a37) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
    #5 intern_common Objects/unicodeobject.c:15820 (python+0x2a7a8d) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
    #6 _PyUnicode_InternImmortal Objects/unicodeobject.c:15874 (python+0x2e92d1) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
    #7 _PyPegen_new_identifier Parser/pegen.c:549 (python+0xad61f) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
....

  Previous write of size 4 at 0x7fffbe0718cc by thread T182:
    #0 Py_SET_REFCNT Include/refcount.h:176 (python+0x2a7c9a) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
    #1 Py_SET_REFCNT Include/refcount.h:145 (python+0x2a7c9a)
    #2 intern_common Objects/unicodeobject.c:15848 (python+0x2a7c9a)
    #3 _PyUnicode_InternImmortal Objects/unicodeobject.c:15874 (python+0x2e92d1) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
    #4 _PyPegen_new_identifier Parser/pegen.c:549 (python+0xad61f) (BuildId: 5612db6eff0f51c7fd99ee4409b2ceafceea484c)
...

There are a few thread-safety issues with intern_common:

  • It can return a string that's not yet marked as interned, because we insert the string into the interned dict before we mark it as interned, so another thread can find it there in between. This can be fixed with some additional locking.
  • The Py_SET_REFCNT(s, Py_REFCNT(s) - 2) modification is a separate read and write, so it is not thread-safe with respect to other reference count modifications in the free threading build.
  • _Py_SetImmortal is not thread-safe in some circumstances (see #113956, "_Py_SetImmortal must be run on allocating thread (no-gil)").
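The ordering problem in the first item can be illustrated with a toy model. This is only a sketch of one possible shape for the "additional locking", not the actual CPython change: toy_str, toy_table, toy_table_lock and toy_intern are hypothetical names, and the fixed-size array stands in for the real interned dict. The point is that the lookup, the marking, and the insertion all happen under one lock, and the string is marked before it becomes visible in the table.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model; these are NOT CPython's real data structures. */
typedef struct {
    const char *text;
    bool interned;   /* stands in for the string's "interned" state */
} toy_str;

#define TOY_TABLE_SIZE 16
static toy_str *toy_table[TOY_TABLE_SIZE];
static size_t toy_table_len = 0;
static pthread_mutex_t toy_table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Return the canonical entry equal to s, inserting s if none exists.
 * Returns NULL only if the toy table is full. */
static toy_str *toy_intern(toy_str *s)
{
    toy_str *result = NULL;

    pthread_mutex_lock(&toy_table_lock);
    for (size_t i = 0; i < toy_table_len; i++) {
        if (strcmp(toy_table[i]->text, s->text) == 0) {
            result = toy_table[i];   /* already marked when it was published */
            break;
        }
    }
    if (result == NULL && toy_table_len < TOY_TABLE_SIZE) {
        s->interned = true;               /* mark first ... */
        toy_table[toy_table_len++] = s;   /* ... then publish */
        result = s;
    }
    pthread_mutex_unlock(&toy_table_lock);
    return result;
}
```

Because every reader takes the same lock, no caller can observe an entry in the table whose interned flag is still unset. Whether a single lock like this is acceptable for the real interned dict is a separate design question.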

The _Py_SetImmortal() issue is tricky, but I think it's unlikely to cause problems in practice, so we can defer dealing with it for now.
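Setting _Py_SetImmortal() aside, the reference count problem comes down to a read followed by a separate write. The sketch below contrasts that shape with a single atomic read-modify-write using C11 atomics. It assumes a simplified object with one atomic counter; toy_obj, toy_incref, toy_drop_refs_racy and toy_drop_refs are hypothetical names, and the free threading build's real reference count layout and helpers are more involved than this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical single-counter model; not the real PyObject layout. */
typedef struct {
    _Atomic uint32_t refcnt;
} toy_obj;

/* What other threads may be doing concurrently. */
static void toy_incref(toy_obj *o)
{
    atomic_fetch_add_explicit(&o->refcnt, 1, memory_order_relaxed);
}

/* Same shape as Py_SET_REFCNT(s, Py_REFCNT(s) - 2): a load, then a store.
 * An increment from another thread that lands between the two steps
 * is overwritten and lost. */
static void toy_drop_refs_racy(toy_obj *o, uint32_t n)
{
    uint32_t seen = atomic_load_explicit(&o->refcnt, memory_order_relaxed);
    atomic_store_explicit(&o->refcnt, seen - n, memory_order_relaxed);
}

/* One atomic read-modify-write: concurrent increments cannot be lost.
 * Returns the value the counter held before the subtraction. */
static uint32_t toy_drop_refs(toy_obj *o, uint32_t n)
{
    return atomic_fetch_sub_explicit(&o->refcnt, n, memory_order_relaxed);
}
```

Relaxed ordering is used here only because the sketch never frees the object; code that can drop the last reference generally needs stronger ordering.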

Labels

3.13 (bugs and security fixes), 3.14 (bugs and security fixes), topic-free-threading, type-bug (an unexpected behavior, bug, or error)
