pythongh-103477: Write gzip trailer with zlib
RHEL, SLES and Ubuntu for IBM zSystems (aka s390x) ship with a zlib
optimization [1] that significantly improves deflate performance by
using a specialized CPU instruction.

This instruction not only compresses the data, but also computes a
checksum. At the moment Python's gzip support performs compression and
checksum calculation separately, which creates unnecessary overhead.
The reason is that Python needs to write specific values into the gzip
header, so it uses a raw stream instead of a gzip stream, and zlib
does not compute a checksum for raw streams.
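
The separate compression and checksum steps described above can be sketched
roughly as follows. This is a simplified illustration, not the actual
Lib/gzip.py code: the function name gzip_member_manual and the fixed header
bytes (MTIME 0, XFL 0, OS 255) are assumptions made for the example.

```python
import gzip
import struct
import zlib


def gzip_member_manual(data: bytes, level: int = 9) -> bytes:
    """Build one gzip member by hand (hypothetical helper for illustration)."""
    # Hand-written 10-byte header: magic, CM=8 (deflate), no flags,
    # MTIME=0, XFL=0, OS=255 (unknown). These values are assumptions.
    header = b"\x1f\x8b\x08\x00" + struct.pack("<L", 0) + b"\x00\xff"

    # wbits=-15 selects a raw deflate stream: no header, no trailer, and
    # therefore no checksum computed by zlib.
    co = zlib.compressobj(level, zlib.DEFLATED, -15)
    body = co.compress(data) + co.flush()

    # Second pass over the data: the CRC32 is computed separately and
    # packed together with the size modulo 2**32 into the 8-byte trailer.
    trailer = struct.pack("<LL", zlib.crc32(data), len(data) & 0xFFFFFFFF)
    return header + body + trailer
```

The result is a valid gzip member, so gzip.decompress() accepts it. The
second pass over the data in zlib.crc32() is the overhead this commit aims
to avoid.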

The challenge with using gzip streams instead of raw streams is
dealing with the zlib-generated gzip header, which we would rather
generate ourselves. Implement the method proposed by @rhpvorderman: use
Z_BLOCK on the first deflate() call in order to stop before the first
deflate block is emitted. The data emitted up to this point is the
zlib-generated gzip header, which should be discarded.
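
The Z_BLOCK call happens inside the C extension, so the following is only a
pure-Python approximation of its effect, not the method used in this commit.
It assumes zlib's default gzip header (no deflateSetHeader() call) is exactly
10 bytes long and slices it off after the fact. The name swap_gzip_header and
the replacement header bytes are hypothetical.

```python
import gzip
import struct
import zlib

# Size of the header zlib emits for a gzip stream when no optional
# fields (name, comment, extra) are requested.
ZLIB_GZIP_HEADER_LEN = 10


def swap_gzip_header(data: bytes, mtime: int, level: int = 9) -> bytes:
    """Replace zlib's gzip header but keep its body and trailer."""
    # wbits=31 asks zlib for a complete gzip stream:
    # header + deflate data + trailer (CRC32 and size).
    co = zlib.compressobj(level, zlib.DEFLATED, 31)
    stream = co.compress(data) + co.flush()

    # Discard the zlib-generated header; the trailer computed by zlib
    # stays attached to the deflate data.
    body_and_trailer = stream[ZLIB_GZIP_HEADER_LEN:]

    # Custom header carrying the caller's MTIME (XFL=0, OS=255 assumed).
    header = (b"\x1f\x8b\x08\x00"
              + struct.pack("<L", mtime & 0xFFFFFFFF)
              + b"\x00\xff")
    return header + body_and_trailer
```

Here the checksum comes from zlib in the same pass as the compression, and
only the header is supplied by the caller, which is the same division of
labour that the Z_BLOCK approach achieves without post-hoc slicing.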

Expose this new functionality by adding a boolean gzip_trailer argument
to zlib.compress() and zlib.compressobj(). Make use of it in
gzip.compress() and GzipFile. The performance improvement varies
depending on the data being compressed, but it is in the ballpark of 40%.

An alternative approach is to use the deflateSetHeader() function,
introduced in zlib v1.2.2.1 (2004). This also works, but the change
was deemed too intrusive [2].

[1] madler/zlib#410
[2] python#103478
iii-i committed Nov 17, 2023
1 parent 985679f commit 91ef9c9
Showing 3 changed files with 132 additions and 42 deletions.
19 changes: 7 additions & 12 deletions Lib/gzip.py
@@ -221,7 +221,8 @@ def __init__(self, filename=None, mode=None,
                                             zlib.DEFLATED,
                                             -zlib.MAX_WBITS,
                                             zlib.DEF_MEM_LEVEL,
-                                            0)
+                                            0,
+                                            gzip_trailer=True)
             self._write_mtime = mtime
             self._buffer_size = _WRITE_BUFFER_SIZE
             self._buffer = io.BufferedWriter(_WriteBufferStream(self),
@@ -245,8 +246,6 @@ def __repr__(self):

     def _init_write(self, filename):
         self.name = filename
-        self.crc = zlib.crc32(b"")
-        self.size = 0
         self.writebuf = []
         self.bufsize = 0
         self.offset = 0  # Current file offset for seek(), tell(), etc
@@ -310,8 +309,6 @@ def _write_raw(self, data):

         if length > 0:
             self.fileobj.write(self.compress.compress(data))
-            self.size += length
-            self.crc = zlib.crc32(data, self.crc)
             self.offset += length

         return length
@@ -355,9 +352,6 @@ def close(self):
             if self.mode == WRITE:
                 self._buffer.flush()
                 fileobj.write(self.compress.flush())
-                write32u(fileobj, self.crc)
-                # self.size may exceed 2 GiB, or even 4 GiB
-                write32u(fileobj, self.size & 0xffffffff)
             elif self.mode == READ:
                 self._buffer.close()
         finally:
@@ -611,10 +605,11 @@ def compress(data, compresslevel=_COMPRESS_LEVEL_BEST, *, mtime=None):
         # This is faster and with less overhead.
         return zlib.compress(data, level=compresslevel, wbits=31)
     header = _create_simple_gzip_header(compresslevel, mtime)
-    trailer = struct.pack("<LL", zlib.crc32(data), (len(data) & 0xffffffff))
-    # Wbits=-15 creates a raw deflate block.
-    return (header + zlib.compress(data, level=compresslevel, wbits=-15) +
-            trailer)
+    # Wbits=-15 creates a raw deflate block. Gzip_trailer=True computes CRC32
+    # and writes gzip trailer with zlib, which on some platforms is faster
+    # than doing it manually.
+    return (header + zlib.compress(data, level=compresslevel, wbits=-15,
+                                   gzip_trailer=True))


 def decompress(data):
71 changes: 49 additions & 22 deletions Modules/clinic/zlibmodule.c.h