Skip to content

gh-129107: make bytearray free-thread safe #129108

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Feb 15, 2025

Conversation

tom-pytel
Copy link
Contributor

@tom-pytel tom-pytel commented Jan 20, 2025

Most of the work has been wrapping functions that need it in critical sections (one at a time, testing each one). There were some functions which were pointed directly at stringlib methods and needed critical sections, but since stringlib is shared with str and bytes and those immutable types don't need locks, the stringlib functions were not touched. Instead, they were wrapped in functions which take object locks.

There is work left to do, testing and doing the iterator. But starting as it is, this PR passes the test suite (at least on my system and build) as well as fixing the error cases presented in the script in the associated issue.

Functions which were looked at ("-" means no critical section was needed, otherwise function was fixed in one way or another):

* bytearray_alloc               = {"__alloc__", bytearray_alloc, METH_NOARGS, alloc_doc},
* bytearray_reduce              = BYTEARRAY_REDUCE_METHODDEF
* bytearray_reduce_ex           = BYTEARRAY_REDUCE_EX_METHODDEF
* bytearray_sizeof              = BYTEARRAY_SIZEOF_METHODDEF
* bytearray_append              = BYTEARRAY_APPEND_METHODDEF
- bytearray_clear               = BYTEARRAY_CLEAR_METHODDEF
* bytearray_copy                = BYTEARRAY_COPY_METHODDEF
b bytearray_count               = BYTEARRAY_COUNT_METHODDEF
* bytearray_decode              = BYTEARRAY_DECODE_METHODDEF
b bytearray_endswith            = BYTEARRAY_ENDSWITH_METHODDEF
* bytearray_extend              = BYTEARRAY_EXTEND_METHODDEF
b bytearray_find                = BYTEARRAY_FIND_METHODDEF
- bytearray_fromhex             = BYTEARRAY_FROMHEX_METHODDEF
* bytearray_hex                 = BYTEARRAY_HEX_METHODDEF
b bytearray_index               = BYTEARRAY_INDEX_METHODDEF
* bytearray_insert              = BYTEARRAY_INSERT_METHODDEF
s bytearray_join                = BYTEARRAY_JOIN_METHODDEF
* bytearray_lstrip              = BYTEARRAY_LSTRIP_METHODDEF
- bytearray_maketrans           = BYTEARRAY_MAKETRANS_METHODDEF
s bytearray_partition           = BYTEARRAY_PARTITION_METHODDEF
* bytearray_pop                 = BYTEARRAY_POP_METHODDEF
* bytearray_remove              = BYTEARRAY_REMOVE_METHODDEF
s bytearray_replace             = BYTEARRAY_REPLACE_METHODDEF
* bytearray_removeprefix        = BYTEARRAY_REMOVEPREFIX_METHODDEF
* bytearray_removesuffix        = BYTEARRAY_REMOVESUFFIX_METHODDEF
* bytearray_reverse             = BYTEARRAY_REVERSE_METHODDEF
b bytearray_rfind               = BYTEARRAY_RFIND_METHODDEF
b bytearray_rindex              = BYTEARRAY_RINDEX_METHODDEF
* bytearray_rpartition          = BYTEARRAY_RPARTITION_METHODDEF
s bytearray_rsplit              = BYTEARRAY_RSPLIT_METHODDEF
* bytearray_rstrip              = BYTEARRAY_RSTRIP_METHODDEF
s bytearray_split               = BYTEARRAY_SPLIT_METHODDEF
s bytearray_splitlines          = BYTEARRAY_SPLITLINES_METHODDEF
b bytearray_startswith          = BYTEARRAY_STARTSWITH_METHODDEF
* bytearray_strip               = BYTEARRAY_STRIP_METHODDEF
* bytearray_translate           = BYTEARRAY_TRANSLATE_METHODDEF

* bytearray_length              = sq_length
- PyByteArray_Concat            = sq_concat
* bytearray_repeat              = sq_repeat
* bytearray_getitem             = sq_item
* bytearray_setitem             = sq_ass_item
* bytearray_contains            = sq_contains
* bytearray_iconcat             = sq_inplace_concat
* bytearray_irepeat             = sq_inplace_repeat

* bytearray_length              =
* bytearray_subscript           =
* bytearray_ass_subscript       =

* bytearray_getbuffer           =
* bytearray_releasebuffer       =

* bytearray_mod                 =

- bytearray_dealloc             = tp_dealloc
* bytearray_repr                = tp_repr
- bytearray_str                 = tp_str
- bytearray_richcompare         = tp_richcompare
- bytearray_iter                = tp_iter
- bytearray___init__            = tp_init


* bytearrayiter_length_hint     = "__length_hint__"
* bytearrayiter_reduce          = "__reduce__"
* bytearrayiter_setstate        = "__setstate__"
- bytearrayiter_dealloc         =
- bytearrayiter_traverse        = tp_traverse
* bytearrayiter_next            = tp_iternext


- PyByteArray_FromObject        =
- PyByteArray_Concat            =
- PyByteArray_FromStringAndSize =
* PyByteArray_Size              =
- PyByteArray_AsString          =
* PyByteArray_Resize            =


w stringlib_capitalize          = {"capitalize", stringlib_capitalize, METH_NOARGS, _Py_capitalize__doc__},
w stringlib_center              = STRINGLIB_CENTER_METHODDEF
w stringlib_expandtabs          = STRINGLIB_EXPANDTABS_METHODDEF
w stringlib_isalnum             = {"isalnum", stringlib_isalnum, METH_NOARGS, _Py_isalnum__doc__},
w stringlib_isalpha             = {"isalpha", stringlib_isalpha, METH_NOARGS, _Py_isalpha__doc__},
w stringlib_isascii             = {"isascii", stringlib_isascii, METH_NOARGS, _Py_isascii__doc__},
w stringlib_isdigit             = {"isdigit", stringlib_isdigit, METH_NOARGS, _Py_isdigit__doc__},
w stringlib_islower             = {"islower", stringlib_islower, METH_NOARGS, _Py_islower__doc__},
w stringlib_isspace             = {"isspace", stringlib_isspace, METH_NOARGS, _Py_isspace__doc__},
w stringlib_istitle             = {"istitle", stringlib_istitle, METH_NOARGS, _Py_istitle__doc__},
w stringlib_isupper             = {"isupper", stringlib_isupper, METH_NOARGS, _Py_isupper__doc__},
w stringlib_ljust               = STRINGLIB_LJUST_METHODDEF
w stringlib_lower               = {"lower", stringlib_lower, METH_NOARGS, _Py_lower__doc__},
w stringlib_rjust               = STRINGLIB_RJUST_METHODDEF
w stringlib_swapcase            = {"swapcase", stringlib_swapcase, METH_NOARGS, _Py_swapcase__doc__},
w stringlib_title               = {"title", stringlib_title, METH_NOARGS, _Py_title__doc__},
w stringlib_upper               = {"upper", stringlib_upper, METH_NOARGS, _Py_upper__doc__},
w stringlib_zfill               = STRINGLIB_ZFILL_METHODDEF

Comments and guidance requested.

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll yield to someone else, but this is one of the things that we probably shouldn't do with critical sections. bytearray is a hot object--we should keep performance in mind here. A lot (but not all!) of bytearray isn't prone to deadlocks because it doesn't typically re-enter.

I'm still undecided, but it might be worth borrowing from the FT list implementation for doing it locklessly.

@tom-pytel
Copy link
Contributor Author

tom-pytel commented Jan 21, 2025

this is one of the things that we probably shouldn't do with critical sections. bytearray is a hot object-- ...

In this I was just going from what I saw in list itself, even there list_item_impl() has a critical section. Though they do have that one with an #ifdef Py_GIL_DISABLED around it so that the critical macros aren't even there if not compiling with GIL disabled.

Py_BEGIN_CRITICAL_SECTION(self);

Though optimizations are of course on the table, first pass is just to get it correct and fix the crashes in the script in the issue for this PR.

EDIT: Though list does store whole objects and this stores bytes, so yeah, would be good to hear from someone with domain knowledge here.

EDIT: You made me curious so here are some very rough totally unscientific timings to compare (note is VM).

TL;DR:

GIL      |  main |    PR |
---------+-------+-------+
enabled  |  10.5 |  10.5 |
disabled |  11.7 |  14.9 |

Main branch gil disabled:

$ ./python
Python 3.14.0a4+ experimental free-threading build (heads/main:f3980af38b, Jan 20 2025, 22:11:16) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sysconfig; sysconfig.get_config_var('CONFIG_ARGS')
"'--enable-optimizations' '--with-lto' '--disable-gil'"
>>> from timeit import timeit
>>> b = bytearray(b'0')
>>> timeit('b[0]', number=1000000000, globals=globals())
11.69659708800009
>>> timeit('b[0]', number=1000000000, globals=globals())
11.63345341299987
>>> timeit('b[0]', number=1000000000, globals=globals())
11.640422253999986

This PR gil disabled:

$ ./python 
Python 3.14.0a4+ experimental free-threading build (heads/fix-issue-129107:832cc05259, Jan 20 2025, 22:18:24) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sysconfig; sysconfig.get_config_var('CONFIG_ARGS')
"'--enable-optimizations' '--with-lto' '--disable-gil'"
>>> from timeit import timeit
>>> b = bytearray(b'0')
>>> timeit('b[0]', number=1000000000, globals=globals())
14.824142644999938
>>> timeit('b[0]', number=1000000000, globals=globals())
14.943688319000103
>>> timeit('b[0]', number=1000000000, globals=globals())
14.88797294799997

Main branch gil enabled:

$ ./python
Python 3.14.0a4+ (heads/main:f3980af38b, Jan 21 2025, 06:14:12) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sysconfig; sysconfig.get_config_var('CONFIG_ARGS')
"'--enable-optimizations' '--with-lto'"
>>> from timeit import timeit
>>> b = bytearray(b'0')
>>> timeit('b[0]', number=1000000000, globals=globals())
10.562217106999924
>>> timeit('b[0]', number=1000000000, globals=globals())
10.573908387000074
>>> timeit('b[0]', number=1000000000, globals=globals())
10.467421004000016

This PR gil enabled:

$ ./python
Python 3.14.0a4+ (heads/fix-issue-129107:832cc05259, Jan 21 2025, 06:35:31) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sysconfig; sysconfig.get_config_var('CONFIG_ARGS')
"'--enable-optimizations' '--with-lto'"
>>> from timeit import timeit
>>> b = bytearray(b'0')
>>> timeit('b[0]', number=1000000000, globals=globals())
10.491293710999798
>>> timeit('b[0]', number=1000000000, globals=globals())
10.446074409000175
>>> timeit('b[0]', number=1000000000, globals=globals())
10.451200100999813

And just for a sanity check, my system vanilla py 3.10:

$ python
Python 3.10.12 (main, Nov  6 2024, 20:22:13) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sysconfig; sysconfig.get_config_var('CONFIG_ARGS')
"'--enable-shared' '--prefix=/usr' '--libdir=/usr/lib/x86_64-linux-gnu' '--enable-ipv6' '--enable-loadable-sqlite-extensions' '--with-dbmliborder=bdb:gdbm' '--with-computed-gotos' '--without-ensurepip' '--with-system-expat' '--with-dtrace' '--with-system-libmpdec' '--with-wheel-pkg-dir=/usr/share/python-wheels/' 'MKDIR_P=/bin/mkdir -p' '--with-system-ffi' 'CC=x86_64-linux-gnu-gcc' 'CFLAGS=-g       -fstack-protector-strong -Wformat -Werror=format-security ' 'LDFLAGS=-Wl,-Bsymbolic-functions      -g -fwrapv -O2   ' 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2'"
>>> from timeit import timeit
>>> b = bytearray(b'0')
>>> timeit('b[0]', number=1000000000, globals=globals())
18.547491659999878
>>> timeit('b[0]', number=1000000000, globals=globals())
18.55064152900013
>>> timeit('b[0]', number=1000000000, globals=globals())
18.568128662999925

@kumaraditya303
Copy link
Contributor

I suggest to split this PR into 2:

  • one to fix the bytearray object
  • one to fix the iterator of bytearray

@tom-pytel
Copy link
Contributor Author

I suggest to split this PR into 2:

Makes sense, iterator is smaller (and easier to backport?) so will break that out and send it separate up later today and leave this one as the main bytearray?

@kumaraditya303
Copy link
Contributor

Yup, SGTM, ping me for reviews.

@kumaraditya303
Copy link
Contributor

@colesbury This PR fixes thread safety of bytearray with critical section; do we want to have lock free reads for this too?

@colesbury
Copy link
Contributor

do we want to have lock free reads for this too?

No, probably not. At least not for now.

@kumaraditya303 kumaraditya303 enabled auto-merge (squash) February 15, 2025 07:13
@kumaraditya303 kumaraditya303 merged commit a05433f into python:main Feb 15, 2025
46 checks passed
@robsdedude
Copy link

Thank you very much for the continued efforts 🙇 Are there plans to backport this (and the follow-up PRs) to 3.13?

@tom-pytel
Copy link
Contributor Author

Thank you very much for the continued efforts 🙇 Are there plans to backport this (and the follow-up PRs) to 3.13?

As of now there are no plans to backport these.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants