Skip to content

Deferred reference counts #120024

Open
Open
@markshannon

Description

@markshannon

Feature or enhancement

Proposal:

Approximately 80% of reference count operations occur in the interpreter. stats
The vast majority of these operations are needed only to maintain the correct reference counts for references in local variables and the evaluation stack.
We should not count these references, instead deferring them until we wish to perform incremental collection.

Doing so will give us a reasonable speedup on default builds, but the real value is for free-threading.
Free-threading requires that some references on the frame stack are deferred, but tagging those references is expensive. It is much more efficient to simply deferred all references on the frame stack.

This is not a new idea, in fact it is a very old one.

The implementation is conceptually fairly simple:

  • We don't count references in local variables and the evaluation stack
  • Any object that has a reference count of zero is, instead of being reclaimed, added to a "Zero count table"
  • When we perform collection, we update the reference count of all objects that have references on the stack, collect any objects with a zero reference count, and then reset the reference counts.

Like many "simple" ideas, the devil is in the detail.

There are two main concerns:

  • Reclamation of objects is not as prompt as before. We may use more resources, waiting for them to be reclaimed.
  • The overhead of updating references during collections may be as great or greater than the saving by deferring the reference counting.

We can keep reclamation acceptably prompt, by tracking the size of objects in the Zero Count Table, the size of objects allocated.
Once this number gets large enough, we perform a collection at the next opportunity.

We can keep the overhead of updating the reference counts low, by only deferring the top of the stack. Parts of the stack that are not accessed between collections, can be counted eagerly, reducing the amount of scanning needed to a few frames.

Previous discussion

faster-cpython/ideas#677

Linked PRs

### Tasks
- [ ] https://github.com/python/cpython/issues/123391
- [x] Interpreter code generators need to be able to flush the stack around escaping calls.

Metadata

Metadata

Assignees

No one assigned

    Labels

    interpreter-core(Objects, Python, Grammar, and Parser dirs)performancePerformance or resource usagetype-featureA feature request or enhancement

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions