
bpo-47012: speed up iteration of bytes and bytearray #31867

Merged (7 commits, Mar 23, 2022)

Conversation

kumaraditya303
Contributor

@kumaraditya303 kumaraditya303 commented Mar 14, 2022

Benchmark:

from pyperf import Runner, perf_counter

def bench_bytes(loops, length):
    src = b'helloworld' * length
    t0 = perf_counter()
    for _ in range(loops):
        for i in src:
            pass
    return perf_counter() - t0

def bench_bytearray(loops, length):
    src = bytearray(b'hello' * length)
    t0 = perf_counter()
    for _ in range(loops):
        for i in src:
            pass
    return perf_counter() - t0

runner = Runner()
for n in [10_000, 100_000]:
    runner.bench_time_func(f"bytes {n}", bench_bytes, n)
    runner.bench_time_func(f"bytearray {n}", bench_bytearray, n)

Results:

bytes 10000: Mean +- std dev: [base] 829 us +- 38 us -> [patch] 677 us +- 44 us: 1.23x faster
bytearray 10000: Mean +- std dev: [base] 523 us +- 34 us -> [patch] 360 us +- 19 us: 1.45x faster
bytes 100000: Mean +- std dev: [base] 8.33 ms +- 0.38 ms -> [patch] 6.89 ms +- 0.75 ms: 1.21x faster
bytearray 100000: Mean +- std dev: [base] 5.19 ms +- 0.23 ms -> [patch] 3.61 ms +- 0.23 ms: 1.44x faster

Geometric mean: 1.33x faster
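The win comes from handing out the cached small-int objects rather than allocating a fresh int for every byte, which the later discussion of `_PY_NSMALLPOSINTS` confirms. That cache is observable from pure Python (a sketch relying on a CPython implementation detail, not guaranteed by the language):

```python
# Iterating bytes yields ints in range(256). In CPython these are cached
# singletons, so repeated passes hand back the very same objects.
data = b"hello\x00\xff"
first = list(data)
second = list(data)

# CPython-specific: identity (not just equality) holds across passes.
print(all(a is b for a, b in zip(first, second)))
```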

https://bugs.python.org/issue47012

@kumaraditya303 kumaraditya303 changed the title speed up iteration of bytes and bytearray bpo-47012: speed up iteration of bytes and bytearray Mar 14, 2022
@kumaraditya303 kumaraditya303 marked this pull request as ready for review March 14, 2022 11:14
@sweeneyde
Member

I agree with @animalize that it would be safest to include a preprocessor directive, whether that's with separate #if _PY_NSMALLPOSINTS > 255/#else code, or with a #error directive.

I'm not sure why anyone would compile with fewer than 256 cached small ints. Here, @vstinner added the code

// _PyLong_GetZero() and _PyLong_GetOne() must always be available
#if _PY_NSMALLPOSINTS < 2
#  error "_PY_NSMALLPOSINTS must be greater than 1"
#endif

@vstinner, would there be any downside to requiring all of (0, 1, ..., 255) be in the small int cache?

@vstinner
Member

I don't think that anyone has ever tuned _PY_NSMALLPOSINTS; the value should stay hardcoded. But just for sanity, you can add a static_assert() in code that makes assumptions about its value, in case someone changes _PY_NSMALLPOSINTS in the future. For example, I added assertions to ensure that the 0 and 1 singletons always exist.
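The property such a guard protects, that every value a byte can take is an interned singleton, can be illustrated from pure Python (a sketch; `fresh` is a hypothetical helper that builds each int through a runtime path so the compiler cannot reuse a folded constant):

```python
def fresh(i: int) -> int:
    # Construct the int at runtime via a string round-trip so constant
    # folding cannot hand back a compile-time object.
    return int(str(i))

# In CPython, two independent constructions of any value in 0..255 return
# the same cached object, which is what the patch's fast path relies on.
print(all(fresh(i) is fresh(i) for i in range(256)))
```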

@kumaraditya303
Contributor Author

@sweeneyde I have added the compilation guard. FYI, if _PY_NSMALLPOSINTS were changed, it would break the deepfreeze and module-freezing infrastructure, so it is not really configurable; it is also declared in an internal header.

6 participants