
iter(range) shared by threads exceeds max range value #131199

Closed as duplicate of #129068

Description

@ptmcg

Bug report

Bug description:

I was experimenting with atomic updates to containers and iterators across threads, and wrote this code. The GIL-enabled version does not have an issue, but the free-threaded version overruns the range iterator.
(Tested using CPython 3.14a05)

import sys

detect_gil = getattr(
    sys, '_is_gil_enabled', lambda: "no GIL detection available"
)
print(f"sys._is_gil_enabled() = {detect_gil()}")

import itertools
from threading import Thread, Barrier
from time import perf_counter


target_total = 800_000
num_workers = 8

# shared data - will access to these be thread-safe?
worker_id = itertools.count(1)
counter = iter(range(target_total))

# free-threaded code is _much_ faster with pre-allocated lists (over list.append or deque.append)
ints = [-1] * target_total
result_buffer = [None] * target_total

barrier = Barrier(num_workers)

def worker():
    this_thread = next(worker_id)

    # wait for all threads to have started before proceeding
    barrier.wait()

    # initialize buffer_index in case all the other threads exhaust counter
    # before this thread gets a chance to run
    buffer_index = 0

    # all threads share a common iterator
    for buffer_index in counter:

        ### THIS IS THE BAD PART - I SHOULDN'T HAVE TO DO THIS CHECK, SINCE counter IS AN 
        ### ITERATOR OVER range(target_total)
        if buffer_index >= target_total:
            # this shouldn't happen if counter is an iterator on a range
            break

        value = buffer_index + 1

        # increment the shared counter and add the result to the shared lists WITHOUT locking
        ints[buffer_index] = value
        # ints.append(value)

        result_buffer[buffer_index] = (this_thread, value)
        # result_buffer.append((this_thread, value))

    if buffer_index >= target_total:
        ### THIS SHOULD NEVER HAPPEN, BUT IN THE OUTPUT YOU'LL SEE THAT IT DOES
        ### IN THE FREE-THREADED CASE
        print(f"iterator exceeded range max: {buffer_index=}, {len(ints)=} (shouldn't happen)")

threads = [Thread(target=worker) for _ in range(num_workers)]

for t in threads[:-1]:
    t.start()
input("Press Enter to start the threads")

start = perf_counter()

# starting the n'th thread releases the barrier
threads[-1].start()

for t in threads:
    t.join()
end = perf_counter()

print(">>>", end-start)

# see if ints are sorted - they don't have to be; if they are, it just shows that
# the threads were able to safely increment the shared counter and write to the
# shared list
ints_are_sorted = all(
    a == b for a, b in zip(ints, range(1, target_total + 1))
)

# verify that no numbers are missing or duplicated (check expected sum of values 1-target_total)
assert sum(ints) == target_total * (target_total + 1) // 2

# Is the shared list sorted? If not, the counter increment
# and list append operations together are not thread-safe
# (as expected).
# (though passing the assert does imply thread-safety
# of each of the operations individually).
print("sorted?", ints_are_sorted)

# see how evenly the work was distributed across the worker threads
from collections import Counter
tally = Counter(i[0] for i in result_buffer)
for thread_id, count in tally.most_common():
    print(f"{thread_id:2d} {count:16,d}")
print(f"{sum(tally.values()):,}")

With GIL enabled:

sys._is_gil_enabled() = True
Press Enter to start the threads
>>> 0.5602053000038723
sorted? True
 1          147,616
 4          147,199
 7          108,055
 5          100,552
 6           99,720
 8           94,804
 2           72,038
 3           30,016
800,000

With free-threaded 3.14:

(pyparsing_3.13) PS C:\Users\ptmcg\dev\pyparsing\gh\pyparsing> py -3.14t "C:\Users\ptmcg\dev\pyparsing\gh\pyparsing\tests\nogil_multithreading_bug_report.py"
sys._is_gil_enabled() = False
Press Enter to start the threads
iterator exceeded range max: buffer_index=800003, len(ints)=800000 (shouldn't happen)
iterator exceeded range max: buffer_index=800002, len(ints)=800000 (shouldn't happen)
iterator exceeded range max: buffer_index=800004, len(ints)=800000 (shouldn't happen)
iterator exceeded range max: buffer_index=800006, len(ints)=800000 (shouldn't happen)
iterator exceeded range max: buffer_index=800007, len(ints)=800000 (shouldn't happen)
iterator exceeded range max: buffer_index=800000, len(ints)=800000 (shouldn't happen)
iterator exceeded range max: buffer_index=800005, len(ints)=800000 (shouldn't happen)
iterator exceeded range max: buffer_index=800001, len(ints)=800000 (shouldn't happen)
>>> 3.4842117999942275
sorted? True
 5          138,725
 8          100,496
 3           95,411
 4           94,917
 6           94,825
 2           92,904
 1           92,339
 7           90,383
800,000
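
The overrun can also be seen without the buffers, barrier, or worker bookkeeping. Here is a minimal sketch (a reduction of the script above, not a separately verified test case) that just drains a shared iter(range(...)) from several threads and collects any yielded value at or above the range's stop value:

import sys
from threading import Thread

N = 800_000
NUM_WORKERS = 8

counter = iter(range(N))   # shared range iterator
overruns = []              # values >= N, which range(N) should never yield

def drain():
    for i in counter:
        if i >= N:
            overruns.append(i)

threads = [Thread(target=drain) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

gil_status = getattr(sys, "_is_gil_enabled", lambda: "n/a")()
print(f"GIL enabled: {gil_status}")
print(f"overrun values: {sorted(overruns)}")

overruns should always end up empty; if the race reproduces, the free-threaded build will print values just past N, matching the buffer_index values in the output above.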

CPython versions tested on:

3.14

Operating systems tested on:

Windows
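
As a stopgap, serializing next() on the shared iterator with a lock avoids the overrun. This is just a sketch against the script above (locked_iter and counter_lock are illustrative names, not part of the original code):

from threading import Lock

counter_lock = Lock()

def locked_iter(it):
    # take the lock around each next() call on the shared iterator,
    # and turn StopIteration into a normal end of iteration
    while True:
        with counter_lock:
            try:
                item = next(it)
            except StopIteration:
                return
        yield item

Each worker then iterates over its own locked_iter(counter) wrapper instead of counter directly, e.g. "for buffer_index in locked_iter(counter):", and the bounds check should no longer be needed.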
