gzip.GzipFile creates reference cycle that requires a deep garbage collection cycle to cleanup. #129640

Closed
@fowczarek

Bug report

Bug description:

While debugging a memory buildup, I noticed that gzip.GzipFile holds a reference to itself (cycle: GzipFile._buffer -> BufferedWriter._raw -> _WriteBufferStream.gzip_file -> GzipFile). Because of this cycle, reference counting alone never frees the object; its memory is released only when the cyclic garbage collector runs a full collection (generation 2).
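
To illustrate why such a cycle outlives its last external reference, here is a minimal sketch that uses two hypothetical stand-in classes (`CyclicOwner` and `CyclicStream`, not the real gzip internals) with the same shape as the cycle described above. It assumes CPython's reference-counting behaviour.

```python
import gc
import weakref


class CyclicStream:
    """Hypothetical stand-in for _WriteBufferStream: strong reference back to its owner."""

    def __init__(self, owner):
        self.owner = owner


class CyclicOwner:
    """Hypothetical stand-in for GzipFile: owns a stream that points back at it."""

    def __init__(self):
        self.stream = CyclicStream(self)


def survives_refcounting():
    """Return (alive_after_del, alive_after_collect) for a CyclicOwner instance."""
    was_enabled = gc.isenabled()
    gc.disable()  # rule out an automatic collection during the experiment
    try:
        obj = CyclicOwner()
        probe = weakref.ref(obj)  # lets us observe the object without keeping it alive
        del obj  # drop the only external reference
        alive_after_del = probe() is not None
        gc.collect()  # explicit full collection breaks the cycle
        alive_after_collect = probe() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_after_del, alive_after_collect
```

On CPython, `survives_refcounting()` should report that the object is still alive after `del` and gone only after the explicit `gc.collect()`.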

Steps to reproduce

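  1. Temporarily disable automatic garbage collection, so that only our explicit collection can catch the cycle.
  2. Set the garbage collector's debug flags to gc.DEBUG_LEAK.
  3. Open and close a GzipFile.
  4. Force a garbage collection and inspect its output.

import gc
import gzip
import io

# Start from a clean state, then make sure only our explicit collect() runs.
gc.collect()
gc.disable()
gc.set_debug(gc.DEBUG_LEAK)

# Open and close a GzipFile without keeping any reference to it.
with io.BytesIO() as buffer:
    with gzip.GzipFile(mode="wb", fileobj=buffer):
        pass

# With DEBUG_LEAK set, this reports any objects found only through the cycle.
gc.collect()
gc.set_debug(0)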
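
The reproduction above relies on reading the collector's debug output. As a sketch of a programmatic variant, the helper below (the name `gzipfiles_left_for_gc` is made up for illustration) uses `gc.DEBUG_SAVEALL` so that unreachable objects are kept in `gc.garbage`, then counts the `GzipFile` instances among them. The result depends on the Python version in use, so no particular count is assumed here.

```python
import gc
import gzip
import io


def gzipfiles_left_for_gc():
    """Return how many GzipFile objects were reclaimable only by the cyclic collector."""
    was_enabled = gc.isenabled()
    gc.collect()  # start from a clean state
    gc.disable()
    gc.set_debug(gc.DEBUG_SAVEALL)  # keep unreachable objects in gc.garbage
    try:
        with io.BytesIO() as buffer:
            with gzip.GzipFile(mode="wb", fileobj=buffer):
                pass
        gc.collect()
        found = sum(isinstance(obj, gzip.GzipFile) for obj in gc.garbage)
    finally:
        # Restore the collector's normal behaviour and release the saved objects.
        gc.set_debug(0)
        gc.garbage.clear()
        gc.collect()
        if was_enabled:
            gc.enable()
    return found
```

A non-zero return value would indicate that the `GzipFile` was part of a reference cycle; zero would indicate that it was freed by reference counting as soon as the `with` block ended.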

Potential solution

class _WriteBufferStream(io.RawIOBase):
    ...

    def __del__(self):
        del self.gzip_file
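
Note that `__del__` is invoked only once an object is already being finalized. Another common way to avoid this kind of cycle is to make the back-reference weak. The sketch below shows that idea on hypothetical stand-in classes (`WeakOwner` and `WeakStream`); it is an illustration of the technique under that assumption, not a description of how gzip is or should be implemented.

```python
import gc
import weakref


class WeakStream:
    """Hypothetical stand-in for _WriteBufferStream holding a weak back-reference."""

    def __init__(self, owner):
        self._owner_ref = weakref.ref(owner)  # does not keep the owner alive

    def write(self, data):
        owner = self._owner_ref()
        if owner is None:
            raise RuntimeError("owner has already been freed")
        return owner.record(data)


class WeakOwner:
    """Hypothetical stand-in for GzipFile."""

    def __init__(self):
        self.chunks = []
        self.stream = WeakStream(self)

    def record(self, data):
        self.chunks.append(bytes(data))
        return len(data)


def freed_without_gc():
    """Return True if a WeakOwner is freed by reference counting alone."""
    was_enabled = gc.isenabled()
    gc.disable()  # make sure the cyclic collector cannot help
    try:
        obj = WeakOwner()
        obj.stream.write(b"abc")  # the stream still reaches its owner while it is alive
        probe = weakref.ref(obj)
        del obj  # no cycle, so the object is released immediately on CPython
        return probe() is None
    finally:
        if was_enabled:
            gc.enable()
```

The trade-off of this design is that the stream must handle the case where its owner has already gone away, which the sketch does by raising `RuntimeError`.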

CPython versions tested on:

3.12

Operating systems tested on:

macOS

Labels: stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)