Occasional errors with free-threading #5674

Closed
@henryiii

So far these are always on 3.14t.

ubuntu 3.14t:
ubuntu 3.13t:

The following tests have failed:

test_run_in_process_multiple_threads_parallel[test_cross_module_gil_nested_pybind11_acquired]
test_run_in_process_multiple_threads_parallel[test_cross_module_gil_inner_pybind11_released]
test_run_in_process_multiple_threads_parallel[test_cross_module_gil_nested_pybind11_released]

=================================== FAILURES ===================================
_ test_run_in_process_multiple_threads_parallel[test_cross_module_gil_nested_pybind11_released] _

test_fn = <function test_cross_module_gil_nested_pybind11_released at 0x2ebaf8a7500>

    @pytest.mark.skipif(sys.platform.startswith("emscripten"), reason="Requires threads")
    @pytest.mark.parametrize("test_fn", ALL_BASIC_TESTS_PLUS_INTENTIONAL_DEADLOCK)
    @pytest.mark.skipif(
        "env.GRAALPY",
        reason="GraalPy transiently complains about unfinished threads at process exit",
    )
    def test_run_in_process_multiple_threads_parallel(test_fn):
        """Makes sure there is no GIL deadlock when running in a thread multiple times in parallel.
    
        It runs in a separate process to be able to stop and assert if it deadlocks.
        """
>       assert _run_in_process(_run_in_threads, test_fn, num_threads=8, parallel=True) == 0
E       assert -11 == 0
E        +  where -11 = _run_in_process(_run_in_threads, <function test_cross_module_gil_nested_pybind11_released at 0x2ebaf8a7500>, num_threads=8, parallel=True)

test_fn    = <function test_cross_module_gil_nested_pybind11_released at 0x2ebaf8a7500>

../../tests/test_gil_scoped.py:241: AssertionError
=============================== warnings summary ===============================
<frozen importlib._bootstrap>:491
  <frozen importlib._bootstrap>:491: RuntimeWarning: The global interpreter lock (GIL) has been enabled to load module 'exo_planet_c_api', which has not declared that it can run safely without the GIL. To override this behavior and keep the GIL disabled (at your own risk), run with PYTHON_GIL=0 or -Xgil=0.
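
For reading the `assert -11 == 0` above: `multiprocessing.Process.exitcode` is negative when the child was terminated by a signal, so `-11` corresponds to signal 11, which is SIGSEGV on Linux. A minimal sketch for decoding such values follows; the helper name `describe_exitcode` is hypothetical and not part of the test suite.

```python
import signal


def describe_exitcode(exitcode):
    """Describe a multiprocessing.Process.exitcode value in words.

    Hypothetical helper for illustration only; it is not part of
    tests/test_gil_scoped.py.
    """
    if exitcode is None:
        # The process has not terminated yet (for example, join() timed out).
        return "still running"
    if exitcode < 0:
        # A negative value -N means the child was terminated by signal N.
        signum = -exitcode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with status {exitcode}"


print(describe_exitcode(-11))
print(describe_exitcode(0))
print(describe_exitcode(None))
```

Read this way, the ubuntu failure is a crash in the child process rather than a deadlock, while the macOS failure below is the opposite case: the child exited with status 0 where the test expected it to still be running.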

macOS 3.14t:

______________ test_run_in_process_direct[_intentional_deadlock] _______________

test_fn = <function _intentional_deadlock at 0x2898f7a4740>

    @pytest.mark.skipif(sys.platform.startswith("emscripten"), reason="Requires threads")
    @pytest.mark.parametrize("test_fn", ALL_BASIC_TESTS_PLUS_INTENTIONAL_DEADLOCK)
    @pytest.mark.skipif(
        "env.GRAALPY",
        reason="GraalPy transiently complains about unfinished threads at process exit",
    )
    def test_run_in_process_direct(test_fn):
        """Makes sure there is no GIL deadlock when using processes.
    
        This test is for completion, but it was never an issue.
        """
>       assert _run_in_process(test_fn) == 0

test_fn    = <function _intentional_deadlock at 0x2898f7a4740>

test_gil_scoped.py:269: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

target = <function _intentional_deadlock at 0x2898f7a4740>, args = ()
kwargs = {}, test_fn = <function _intentional_deadlock at 0x2898f7a4740>
timeout = 0.1
process = <Process name='Process-72' pid=5355 parent=5115 stopped exitcode=0 daemon>
t_start = 1747718249.89369, t_delta = 0.10827970504760742, @py_assert1 = 0
@py_assert4 = None, @py_assert3 = False
@py_format6 = "0\n{0 = <Process name='Process-72' pid=5355 parent=5115 stopped exitcode=0 daemon>.exitcode\n} is None"
@py_format8 = "assert 0\n{0 = <Process name='Process-72' pid=5355 parent=5115 stopped exitcode=0 daemon>.exitcode\n} is None"

    def _run_in_process(target, *args, **kwargs):
        test_fn = target if len(args) == 0 else args[0]
        # Do not need to wait much, 10s should be more than enough.
        timeout = 0.1 if test_fn is _intentional_deadlock else 10
        process = multiprocessing.Process(target=target, args=args, kwargs=kwargs)
        process.daemon = True
        try:
            t_start = time.time()
            process.start()
            if timeout >= 100:  # For debugging.
                print(
                    "\nprocess.pid STARTED", process.pid, (sys.argv, target, args, kwargs)
                )
                print(f"COPY-PASTE-THIS: gdb {sys.argv[0]} -p {process.pid}", flush=True)
            process.join(timeout=timeout)
            if timeout >= 100:
                print("\nprocess.pid JOINED", process.pid, flush=True)
            t_delta = time.time() - t_start
            if process.exitcode == 66 and m.defined_THREAD_SANITIZER:  # Issue #2754
                # WOULD-BE-NICE-TO-HAVE: Check that the message below is actually in the output.
                # Maybe this could work:
                # https://gist.github.com/alexeygrigorev/01ce847f2e721b513b42ea4a6c96905e
                pytest.skip(
                    "ThreadSanitizer: starting new threads after multi-threaded fork is not supported."
                )
            elif test_fn is _intentional_deadlock:
>               assert process.exitcode is None
E               AssertionError: assert 0 is None
E                +  where 0 = <Process name='Process-72' pid=5355 parent=5115 stopped exitcode=0 daemon>.exitcode

args       = ()
kwargs     = {}
process    = <Process name='Process-72' pid=5355 parent=5115 stopped exitcode=0 daemon>
t_delta    = 0.10827970504760742
t_start    = 1747718249.89369
target     = <function _intentional_deadlock at 0x2898f7a4740>
test_fn    = <function _intentional_deadlock at 0x2898f7a4740>
timeout    = 0.1

test_gil_scoped.py:187: AssertionError
=============================== warnings summary ===============================
<frozen importlib._bootstrap>:491
  <frozen importlib._bootstrap>:491: RuntimeWarning: The global interpreter lock (GIL) has been enabled to load module 'exo_planet_c_api', which has not declared that it can run safely without the GIL. To override this behavior and keep the GIL disabled (at your own risk), run with PYTHON_GIL=0 or -Xgil=0.

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

I haven't been able to reproduce the flakes locally, including with pytest-run-parallel, pytest-repeat, and both reducing and increasing sys.setswitchinterval().
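
For anyone who wants to try the switch-interval approach, here is a rough sketch of such a sweep. It is not the script used for the attempts above: the function name `stress_with_switch_intervals`, the interval values, and the repeat counts are all assumptions chosen for illustration, and `fn` stands in for whichever test function is under suspicion.

```python
import sys
import threading


def stress_with_switch_intervals(
    fn, intervals=(1e-6, 1e-4, 5e-3), repeats=20, num_threads=8
):
    """Run fn in parallel threads under several switch intervals.

    Hypothetical harness for illustration; the interval values and the
    counts are arbitrary assumptions. Returns the exceptions raised by fn.
    """
    original = sys.getswitchinterval()
    errors = []

    def worker():
        try:
            fn()
        except BaseException as exc:  # record the failure, keep going
            errors.append(exc)

    try:
        for interval in intervals:
            sys.setswitchinterval(interval)
            for _ in range(repeats):
                threads = [
                    threading.Thread(target=worker) for _ in range(num_threads)
                ]
                for thread in threads:
                    thread.start()
                for thread in threads:
                    thread.join()
    finally:
        # Always restore the interpreter's original setting.
        sys.setswitchinterval(original)
    return errors
```

This only catches Python-level exceptions. A segfault like the `-11` above would take down the whole interpreter, so a harness like this would need to run in a child process, as `_run_in_process` does, to observe it.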
