Active thread list may be inaccurate due to data type mismatch #130115

Closed
@vfazio

Description

Bug report

Bug description:

0e9c364 changed thread_get_ident to convert an unsigned long long instead of the previous unsigned long.

static PyObject *
thread_get_ident(PyObject *self, PyObject *Py_UNUSED(ignored))
{
    PyThread_ident_t ident = PyThread_get_thread_ident_ex();  // <-- ULL
    if (ident == PYTHREAD_INVALID_THREAD_ID) {
        PyErr_SetString(ThreadError, "no current thread ident");
        return NULL;
    }
    return PyLong_FromUnsignedLongLong(ident);
}

However, after #114839 (commit 76bde03), MainThread is now a special case because it doesn't use self._set_ident():

class _MainThread(Thread):

    def __init__(self):
        Thread.__init__(self, name="MainThread", daemon=False)
        self._started.set()
        self._ident = _get_main_thread_ident()
        self._handle = _make_thread_handle(self._ident)
        if _HAVE_THREAD_NATIVE_ID:
            self._set_native_id()
        with _active_limbo_lock:
            _active[self._ident] = self

It inserts into the active thread list an identifier obtained from a special function, _get_main_thread_ident(), which always returns the clipped unsigned long stored in the runtime struct.

static PyObject *
thread__get_main_thread_ident(PyObject *module, PyObject *Py_UNUSED(ignored))
{
    return PyLong_FromUnsignedLongLong(_PyRuntime.main_thread);
}

    /* Platform-specific identifier and PyThreadState, respectively, for the
       main thread in the main interpreter. */
    unsigned long main_thread;

    // Set it to the ID of the main thread of the main interpreter.
    runtime->main_thread = PyThread_get_thread_ident();

As a result, on some platforms/libc implementations, looking up the current thread fails because the clipped UL value stored in _active does not match the ULL value that get_ident() returns:

>>> import threading
>>> ct = threading.current_thread()
>>> ct
<_DummyThread(Dummy-1, started daemon 18446744072483979068)>
>>> hex(ct.ident)
'0xffffffffb6f33f3c'
>>> main = threading.main_thread()
>>> hex(main.ident)
'0xb6f33f3c'
>>> main._set_ident()
>>> hex(main.ident)
'0xffffffffb6f33f3c'
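
To make the relationship between the two values in the session above concrete, here is a small arithmetic sketch. It assumes a 32-bit unsigned long and a 64-bit PyThread_ident_t, and it assumes the wide value arises from sign-extending a 32-bit value (the all-ones upper half is consistent with that, but the issue does not confirm the mechanism). The helper names are hypothetical and not part of CPython.

```python
# Assumed widths: 32-bit `unsigned long`, 64-bit `PyThread_ident_t`.
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def truncate_to_32(value):
    """Keep only the low 32 bits, as storing into a 32-bit unsigned long would."""
    return value & MASK32


def sign_extend_32_to_64(value):
    """Treat a 32-bit pattern as signed, then view it as an unsigned 64-bit value."""
    value &= MASK32
    if value & 0x80000000:  # top bit set: negative when read as signed 32-bit
        value -= 1 << 32
    return value & MASK64


wide_ident = 0xFFFFFFFFB6F33F3C  # value reported by ct.ident in the session
clipped_ident = truncate_to_32(wide_ident)

print(hex(clipped_ident))                        # 0xb6f33f3c
print(hex(sign_extend_32_to_64(clipped_ident)))  # 0xffffffffb6f33f3c
print(wide_ident)                                # 18446744072483979068
```

The two integers share the same low 32 bits, but as Python ints they are different dictionary keys.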

def current_thread():
    """Return the current Thread object, corresponding to the caller's thread of control.

    If the caller's thread of control was not created through the threading
    module, a dummy thread object with limited functionality is returned.

    """
    try:
        return _active[get_ident()]
    except KeyError:
        return _DummyThread()
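
The lookup miss can be reproduced without an affected build by modelling _active as a plain dict. This is only a simulation of the logic in current_thread() above, using the two values from the session. The names `active` and `lookup` are stand-ins, and strings take the place of the real thread objects.

```python
WIDE_IDENT = 0xFFFFFFFFB6F33F3C          # what get_ident() returns in the session
CLIPPED_IDENT = WIDE_IDENT & 0xFFFFFFFF  # what main.ident held initially

# Stand-in for threading._active: the main thread is registered under the clipped key.
active = {CLIPPED_IDENT: "MainThread"}


def lookup(ident):
    """Mimic current_thread(): fall back to a dummy thread on a missing key."""
    try:
        return active[ident]
    except KeyError:
        return "DummyThread"


print(lookup(WIDE_IDENT))     # DummyThread: the wide key is not in the dict
print(lookup(CLIPPED_IDENT))  # MainThread

# Registering under the same wide value that the lookup uses removes the mismatch.
active_fixed = {WIDE_IDENT: "MainThread"}
print(active_fixed.get(WIDE_IDENT, "DummyThread"))  # MainThread
```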

Should main_thread be a PyThread_ident_t, or should MainThread continue to call _set_ident()?
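
Whichever fix is chosen, a quick way to check whether a given build shows the mismatch is to compare the two identifiers from the main thread. The sketch below uses only public threading APIs. The function name is hypothetical, and the result is only meaningful when it is called from the main thread.

```python
import threading


def main_thread_ident_matches():
    """Return True if main_thread().ident equals get_ident().

    Only meaningful when called from the main thread; from any other
    thread the two identifiers are expected to differ.
    """
    return threading.main_thread().ident == threading.get_ident()


if __name__ == "__main__":
    # On a build with the mismatch described in this issue, this is expected to print False.
    print(main_thread_ident_matches())
```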

CPython versions tested on:

3.13

Operating systems tested on:

Linux
