Skip to content

Add delayed reclamation mechanism for free-threaded build (QSBR) #115103

Closed
@colesbury

Description

@colesbury

Feature or enhancement

Many operations in the free-threaded build are protected by locks. However, in some cases, we want to allow reads to happen concurrently with updates 1. For example, we want to avoid locking during most list read accesses. If there is a concurrent list resize operation, when is it safe to free the list's previous array?

"Safe memory reclamation" (SMR) schemes address this usually by delaying the "free" operation until all concurrent read accesses are guaranteed to have completed. Quiescent state-based reclamation (QSBR) is a safe memory reclamation scheme that will be useful in the free-threaded build. QSBR relies on marked "quiescent states", when a thread reports that it is definitely not in the middle of a non-locking read operation. The eval breaker checks serve as a good place to report a quiescent state.

Background

SMR schemes are most common in systems that don't use garbage collection or reference counting. CPython uses reference counting and has a tracing garbage collector, so it may seem like an odd fit. However, some data structures are not reference counted. For example, a list object's array is not separately reference counted, but may have a shorter lifetime than the containing PyListObject. We could delay reclamation until the GC runs, but we want reclamation to happen quickly, and we generally want to run the GC less frequently in the free-threaded build, because it requires pausing all threads and, at least for now, scanning the entire heap.

Use cases

  • Dict keys (PyDictKeysObject) and list arrays (ob_item): If we resize a dict or list that may be shared between threads, we use QSBR to delay freeing the old keys/array until it's safe to do so. Note that most dicts and lists are not accessed by multiple threads; their keys/arrays can be immediately freed on resize.
  • Mimalloc mi_page_t: The non-locking dict and list accesses require cooperation from mimalloc. We want to ensure that even if an item is freed during access and the memory reused for a new object, the new object’s reference count field is placed at the same location in memory. In practice, this means that when an mi_page_t is empty, we don't immediately allow it to be re-used for allocations of a different size class. We use QSBR to determine when it's safe to use the mi_page_t for a different size class (or return the memory to the OS).

Implementation Overview

The implementation will be partially based on FreeBSD's "Global Unbounded Sequences". The FreeBSD implementation provides a good starting point because it's relatively simple, self contained, and the license (2-Clause BSD) is compatible with CPython. The implementation relies on a few sequence counters:

  • Global (per-interpreter) write sequence: When we want to safely free a data structure, we increment the global write sequence counter and tag the data to be freed with the new value.
  • Per-thread state read sequence: When a thread reaches a quiescent state (such as when it checks the eval breaker), it copies the global write sequence to its local read counter.

It's safe to free data once all thread states' read sequences are greater than or equal to the write sequence used to tag the data.

To check that, we scan over all the thread states read sequences to compute the minimum value (excluding detached threads). For efficiency, we also cache that computed value globally (per-interpreter).

Limitations

Determining the current minimum read sequence requires scanning over all thread states. This will likely become a bottleneck if we have a large number of threads (>1,000?). We will likely eventually want to address this, possibly with some combination of a tree-based mechanism 2 and incremental scanning.

For now, I think keeping the implementation simple important until we are able to run and benchmark multithreaded programs with the GIL disabled.

Linked PRs

Footnotes

  1. See "Optimistically Avoiding Locking" in PEP 703.

  2. https://people.kernel.org/joelfernandes/gus-vs-rcu

Metadata

Metadata

Assignees

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions