pythongh-103477: Write gzip trailer with zlib

RHEL, SLES and Ubuntu for IBM zSystems (aka s390x) ship with a zlib optimization [1] that significantly improves deflate performance by using a specialized CPU instruction. This instruction not only compresses the data, but also computes a checksum. At the moment Python's gzip support performs compression and checksum calculation separately, which creates unnecessary overhead. The reason is that Python needs to write specific values into the gzip header, so it uses a raw stream instead of a gzip stream, and zlib does not compute a checksum for raw streams.

The challenge with using gzip streams instead of raw streams is dealing with the zlib-generated gzip header, which we need to generate manually instead.

Implement the method proposed by @rhpvorderman: use Z_BLOCK on the first deflate() call in order to stop before the first deflate block is emitted. The data emitted up to this point is the zlib-generated gzip header, which should be discarded.

Expose this new functionality by adding a boolean gzip_trailer argument to zlib.compress() and zlib.compressobj(). Make use of it in gzip.compress() and GzipFile.

The performance improvement varies depending on the data being compressed, but it's in the ballpark of 40%.

An alternative approach is to use the deflateSetHeader() function, introduced in zlib v1.2.2.1 (2011). This also works, but the change was deemed too intrusive [2].

[1] madler/zlib#410
[2] python#103478
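
The idea of letting zlib produce the trailer while keeping a hand-written header can be approximated in pure Python with APIs that already exist. The sketch below is not the implementation in this commit (which uses Z_BLOCK inside the C module and the new gzip_trailer argument). It assumes that zlib emits its default fixed 10-byte gzip header when no header fields are set, and simply slices that header off afterwards. The helper names `custom_header`, `manual_trailer` and `compress_with_zlib_trailer` are hypothetical and exist only for illustration.

```python
import gzip
import struct
import zlib

# Assumption: with no header fields set, zlib writes a fixed 10-byte gzip header.
GZIP_HEADER_LEN = 10


def manual_trailer(data: bytes) -> bytes:
    """Gzip trailer built by hand: CRC32, then length modulo 2**32 (little-endian)."""
    return struct.pack("<LL", zlib.crc32(data), len(data) & 0xFFFFFFFF)


def custom_header(mtime: int) -> bytes:
    """Hypothetical helper: minimal 10-byte gzip header with a chosen mtime."""
    # ID1, ID2, CM=8 (deflate), FLG=0, MTIME, XFL=0, OS=255 (unknown)
    return struct.pack("<BBBBLBB", 0x1F, 0x8B, 8, 0, mtime, 0, 255)


def compress_with_zlib_trailer(data: bytes, level: int = 6, mtime: int = 0) -> bytes:
    """Let zlib compute the CRC32/size trailer, then swap in our own header."""
    # wbits = 16 + MAX_WBITS asks zlib for a gzip-wrapped stream.
    co = zlib.compressobj(level, zlib.DEFLATED, 16 + zlib.MAX_WBITS)
    stream = co.compress(data) + co.flush()
    # Discard the zlib-generated header; keep the deflate data and the trailer.
    return custom_header(mtime) + stream[GZIP_HEADER_LEN:]
```

For example, `gzip.decompress(compress_with_zlib_trailer(b"spam" * 100, mtime=1))` returns the original bytes, and the last 8 bytes of the output match `manual_trailer()` for the same input. The sketch discards the header after the fact, whereas the commit stops deflate() before the first block so the header never reaches the output.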
iii-i · Nov 17, 2023 · 91ef9c9
1 parent 985679f
commit 91ef9c9
Showing 3 changed files with 132 additions and 42 deletions.
diff --git a/Lib/gzip.py b/Lib/gzip.py
@@ -221,7 +221,8 @@ def __init__(self, filename=None, mode=None,
                                              zlib.DEFLATED,
                                              -zlib.MAX_WBITS,
                                              zlib.DEF_MEM_LEVEL,
-                                             0)
+                                             0,
+                                             gzip_trailer=True)
             self._write_mtime = mtime
             self._buffer_size = _WRITE_BUFFER_SIZE
             self._buffer = io.BufferedWriter(_WriteBufferStream(self),
@@ -245,8 +246,6 @@ def __repr__(self):
 
     def _init_write(self, filename):
         self.name = filename
-        self.crc = zlib.crc32(b"")
-        self.size = 0
         self.writebuf = []
         self.bufsize = 0
         self.offset = 0  # Current file offset for seek(), tell(), etc
@@ -310,8 +309,6 @@ def _write_raw(self, data):
 
         if length > 0:
             self.fileobj.write(self.compress.compress(data))
-            self.size += length
-            self.crc = zlib.crc32(data, self.crc)
             self.offset += length
 
         return length
@@ -355,9 +352,6 @@ def close(self):
             if self.mode == WRITE:
                 self._buffer.flush()
                 fileobj.write(self.compress.flush())
-                write32u(fileobj, self.crc)
-                # self.size may exceed 2 GiB, or even 4 GiB
-                write32u(fileobj, self.size & 0xffffffff)
             elif self.mode == READ:
                 self._buffer.close()
         finally:
@@ -611,10 +605,11 @@ def compress(data, compresslevel=_COMPRESS_LEVEL_BEST, *, mtime=None):
         # This is faster and with less overhead.
         return zlib.compress(data, level=compresslevel, wbits=31)
     header = _create_simple_gzip_header(compresslevel, mtime)
-    trailer = struct.pack("<LL", zlib.crc32(data), (len(data) & 0xffffffff))
-    # Wbits=-15 creates a raw deflate block.
-    return (header + zlib.compress(data, level=compresslevel, wbits=-15) +
-            trailer)
+    # Wbits=-15 creates a raw deflate block. Gzip_trailer=True computes CRC32
+    # and writes gzip trailer with zlib, which on some platforms is faster
+    # than doing it manually.
+    return (header + zlib.compress(data, level=compresslevel, wbits=-15,
+                                   gzip_trailer=True))
 
 
 def decompress(data):

diff --git a/Modules/clinic/zlibmodule.c.h b/Modules/clinic/zlibmodule.c.h