BUG: Fix data races in block internals #63783
I edited the description to make it more descriptive - sorry for forgetting to do that before marking it ready for review!
Well, that's annoying: the Windows Python 3.12 tests failed inside the new multithreaded test with a spurious warning: https://github.com/pandas-dev/pandas/actions/runs/21448817886/job/61771847840?pr=63783#step:5:335. Warnings are known not to be thread-safe before Python 3.14, so I'm tempted to just skip the new multithreaded tests on Python versions where warnings aren't thread-safe. @mroeschke does that seem OK to you?
@td.skip_if_warnings_arent_thread_safe
def test_multithreaded_reading():
Do you think this test is reliable (non-flaky) when run with pytest-xdist on the GHA runners? Decorating this test with @pytest.mark.single_cpu will run it without xdist.
Good call, I added the mark and also added some explanatory comments.
Sure that's fine with me. The |
I've tested this PR against the dask/dask test suite, where we're currently observing CI flakiness in all GIL-enabled Python versions using pandas 3.0. Result of local tests:
I consider this PR a blocker for 3.14t support in dask.dataframe. More detailed results in dask/dask#12225 (comment)
pandas/util/_test_decorators.py
    WASM,
    reason="does not support wasm",
)
skip_if_warnings_arent_thread_safe = pytest.mark.skipif(
- skip_if_warnings_arent_thread_safe = pytest.mark.skipif(
+ skip_if_thread_unsafe_warnings = pytest.mark.skipif(
naming nit
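For readers unfamiliar with this kind of marker, here is a rough sketch of the condition it could be built on. The helper name `warnings_are_thread_safe` and the 3.14 cutoff are assumptions taken from the comment earlier in this thread ("not thread-safe before Python 3.14"); the actual condition used in `pandas/util/_test_decorators.py` is not shown here and may differ.

```python
import sys


def warnings_are_thread_safe(version_info=sys.version_info):
    """Hypothetical helper: treat Python 3.14+ as the cutoff (an assumption)."""
    return tuple(version_info[:2]) >= (3, 14)


# With pytest installed, a skip marker could then be built along these lines:
#
#   skip_marker = pytest.mark.skipif(
#       not warnings_are_thread_safe(),
#       reason="warnings are not thread-safe on this Python version",
#   )
```

Keeping the condition in a small function makes it easy to check against explicit version tuples without needing several interpreters.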
@mroeschke This is ready to be merged; we will follow up to add a TSAN CI job once this gets merged.
Thanks all!
…rnals) (#64078) Co-authored-by: Nathan Goldbaum <nathan.goldbaum@gmail.com>
Bump to pandas-dev/pandas#63783 Fix test_repartition_partition_size Use pandas-nightly; use pyarrow from conda-forge
This PR fixes several thread safety issues in Pandas internals on the free-threaded build. We found all of them using thread sanitizer testing.
I also added a multithreaded test based on the original test in statsmodels we used to find these issues. There's probably lots more room to add more multithreaded tests, but a real-world example that triggered real issues seems as good a place to start as any to me.
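As a rough illustration of the shape such a multithreaded read test can take, here is a minimal standard-library sketch. It is not the test added in this PR: `run_in_threads` and the `shared` list are hypothetical stand-ins, and a real test would read from a shared pandas object instead.

```python
import threading


def run_in_threads(func, n_threads=8):
    """Call func() once per thread, releasing all threads at the same time."""
    barrier = threading.Barrier(n_threads)
    results = [None] * n_threads
    errors = []

    def worker(i):
        try:
            barrier.wait()  # line up all threads to maximize overlap
            results[i] = func()
        except Exception as exc:  # collect failures for the main thread
            errors.append(exc)

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    if errors:
        raise errors[0]
    return results


shared = list(range(1000))  # stand-in for a shared, read-only object
results = run_in_threads(lambda: sum(shared))
```

The barrier is the important part: without it the threads tend to run one after another, and races in lazily initialized state are much less likely to show up.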
You can read more about critical sections in the CPython docs, the cython docs, and the free-threading guide.
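Critical sections are a C API and Cython feature, so they cannot be shown directly in pure Python. As a loose analogy only, the sketch below uses `threading.Lock` to guard a lazily filled cache, which is the general kind of "check, compute, store" pattern that needs protection when several threads read the same object. `LazySummary` is a hypothetical class, not part of pandas.

```python
import threading


class LazySummary:
    """Hypothetical object that computes a value once and then caches it."""

    def __init__(self, values):
        self._values = list(values)
        self._lock = threading.Lock()
        self._total = None
        self.compute_calls = 0  # instrumentation for this demo only

    @property
    def total(self):
        # Holding the lock makes "check, compute, store" a single step,
        # so only one thread ever fills the cache.
        with self._lock:
            if self._total is None:
                self.compute_calls += 1
                self._total = sum(self._values)
            return self._total


summary = LazySummary([1, 2, 3])
seen = []  # list.append is safe to call from several threads in CPython


def reader():
    seen.append(summary.total)


threads = [threading.Thread(target=reader) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After the threads finish, every reader has observed the same value and the cache has been computed exactly once.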