
Read and write gzip header and trailer with zlib #103477

Open
iii-i opened this issue Apr 12, 2023 · 2 comments
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@iii-i
Contributor

iii-i commented Apr 12, 2023

Feature or enhancement

Replace manual gzip format handling with zlib's inflateGetHeader() and deflateSetHeader().

Pitch

RHEL, SLES and Ubuntu for IBM zSystems (aka s390x) ship with a zlib
optimization [1] that significantly improves deflate and inflate
performance on this platform by using a specialized CPU instruction.

This instruction not only compresses the data, but also computes a
checksum. At the moment Python's gzip support performs compression and
checksum calculation separately, which creates unnecessary overhead on
s390x.

The reason is that Python needs to write specific values into the gzip
header; when this support was introduced in 1997, there was indeed no
better way to do this.

Since v1.2.2.1 (2011) zlib has provided the inflateGetHeader() and
deflateSetHeader() functions for this, so Python no longer has to deal
with the exact header and trailer formats.
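To make the overhead concrete, here is a minimal sketch (not taken from gzip.py, but mirroring its approach) of the two paths: the manual route, where Python builds the gzip header and trailer itself and runs zlib.crc32() as a separate pass over the data, and zlib's own gzip mode (wbits = 16 + MAX_WBITS), where the header, trailer and checksum all come from zlib in a single pass:

```python
import gzip
import struct
import zlib

data = b"example payload " * 64

# Manual route (simplified from what gzip.py does today): a raw deflate
# stream (negative wbits), plus a hand-built header and trailer, with
# crc32 computed in a separate pass over the input.
co = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
header = struct.pack("<4BIBB", 0x1F, 0x8B, 8, 0, 0, 2, 255)  # magic, CM, FLG, MTIME, XFL, OS
body = co.compress(data) + co.flush()
trailer = struct.pack("<II", zlib.crc32(data), len(data) & 0xFFFFFFFF)
manual = header + body + trailer

# zlib's gzip mode: wbits = 16 + MAX_WBITS makes deflate emit the header
# and trailer itself and fold the checksum into the same pass -- which is
# what would let the s390x hardware path fuse compression and checksumming.
co = zlib.compressobj(9, zlib.DEFLATED, 16 + zlib.MAX_WBITS)
native = co.compress(data) + co.flush()

# Both are valid gzip streams.
assert gzip.decompress(manual) == data
assert gzip.decompress(native) == data
```

The catch, and the point of this issue, is that the second path offers no way to set header fields such as mtime or the filename without the deflateSetHeader() C function.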

Previous discussion

https://discuss.python.org/t/read-and-write-gzip-header-and-trailer-with-zlib/25703/2

[1] madler/zlib#410

Linked PRs

@iii-i iii-i added the type-feature A feature request or enhancement label Apr 12, 2023
iii-i added a commit to iii-i/cpython that referenced this issue Apr 12, 2023
RHEL, SLES and Ubuntu for IBM zSystems (aka s390x) ship with a zlib
optimization [1] that significantly improves deflate and inflate
performance on this platform by using a specialized CPU instruction.

This instruction not only compresses the data, but also computes a
checksum. At the moment Python's gzip support performs compression and
checksum calculation separately, which creates unnecessary overhead on
s390x.

The reason is that Python needs to write specific values into the gzip
header; when this support was introduced in 1997, there was indeed no
better way to do this.

Since v1.2.2.1 (2011) zlib provides inflateGetHeader() and
deflateSetHeader() functions for that, so Python does not have to deal
with the exact header and trailer format anymore.

Add new interfaces to zlibmodule.c that make use of these functions:

* Add mtime argument to zlib.compress().
* Add mtime and fname arguments to zlib.compressobj().
* Add gz_header_mtime and gz_header_done properties to ZlibDecompressor.

In Python modules, replace raw streams with gzip streams, make use of
these new interfaces, and remove all mentions of crc32.

In addition to the new interfaces above, there is a user-visible change
in behavior: for malformed gzip headers and trailers, decompression now
raises zlib.error instead of BadGzipFile. However, this is allowed by
today's spec.

[1] madler/zlib#410
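The proposed gz_header_mtime property on ZlibDecompressor does not exist yet; today, code that wants header fields has to slice them out of the raw bytes itself. A sketch of that manual parsing, using gzip.compress()'s existing mtime parameter to produce a header with a known timestamp (the proposed zlib-side writer equivalent being the mtime argument to zlib.compress()):

```python
import gzip
import struct

# gzip.compress() already lets the caller pick the header mtime.
stamp = 1681257600
blob = gzip.compress(b"hello", mtime=stamp)

# Manual header parsing -- roughly what the proposed gz_header_mtime
# property would replace. The fixed part of the header is: magic (2),
# CM (1), FLG (1), MTIME (4, little-endian), XFL (1), OS (1).
assert blob[:2] == b"\x1f\x8b"
mtime = struct.unpack("<I", blob[4:8])[0]
assert mtime == stamp
```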
@arhadthedev arhadthedev added the stdlib Python modules in the Lib dir label Apr 12, 2023
iii-i added a commit to iii-i/cpython that referenced this issue Apr 12, 2023
iii-i added a commit to iii-i/cpython that referenced this issue May 31, 2023
iii-i added a commit to iii-i/cpython that referenced this issue Jun 29, 2023
iii-i added a commit to iii-i/cpython that referenced this issue Jun 29, 2023
iii-i added a commit to iii-i/cpython that referenced this issue Jul 18, 2023
iii-i added a commit to iii-i/cpython that referenced this issue Oct 9, 2023
iii-i added a commit to iii-i/cpython that referenced this issue Oct 9, 2023
iii-i added a commit to iii-i/cpython that referenced this issue Nov 17, 2023
RHEL, SLES and Ubuntu for IBM zSystems (aka s390x) ship with a zlib
optimization [1] that significantly improves deflate performance by
using a specialized CPU instruction.

This instruction not only compresses the data, but also computes a
checksum. At the moment Python's gzip support performs compression and
checksum calculation separately, which creates unnecessary overhead.
The reason is that Python needs to write specific values into the gzip
header, so it uses a raw stream instead of a gzip stream, and zlib
does not compute a checksum for raw streams.

The challenge with using gzip streams instead of raw streams is
dealing with the zlib-generated gzip header, which we need to generate
manually instead. Implement the method proposed by @rhpvorderman: use
Z_BLOCK on the first deflate() call in order to stop before the first
deflate block is emitted. The data emitted up to this point is the
zlib-generated gzip header, which should be discarded.

Expose this new functionality by adding a boolean gzip_trailer argument
to zlib.compress() and zlib.compressobj(). Make use of it in
gzip.compress() and GzipFile. The performance improvement varies
depending on the data being compressed, but it is in the ballpark of 40%.

An alternative approach is to use the deflateSetHeader() function,
introduced in zlib v1.2.2.1 (2011). This also works, but the change
was deemed too intrusive [2].

[1] madler/zlib#410
[2] python#103478
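The Z_BLOCK trick is not directly expressible from pure Python, but its effect can be approximated: when no optional header fields are set, the zlib-generated gzip header is a fixed 10 bytes, so it can be sliced off and replaced with a hand-built one while keeping zlib's trailer and inline checksum. A sketch under that assumption (the PR itself finds the header boundary with Z_BLOCK rather than hard-coding 10 bytes):

```python
import gzip
import struct
import zlib

data = b"tar archive bytes " * 32
stamp = 1700000000  # arbitrary example timestamp to embed

# Let zlib emit a complete gzip stream (wbits = 16 + MAX_WBITS); the
# trailer and checksum then come from zlib in a single pass.
co = zlib.compressobj(9, zlib.DEFLATED, 16 + zlib.MAX_WBITS)
stream = co.compress(data) + co.flush()

# Without FEXTRA/FNAME/FCOMMENT, the zlib-generated header is exactly
# 10 bytes; discard it and substitute one carrying our own mtime.
header = struct.pack("<4BIBB", 0x1F, 0x8B, 8, 0, stamp, 0, 255)
patched = header + stream[10:]

assert gzip.decompress(patched) == data
assert struct.unpack("<I", patched[4:8])[0] == stamp
```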
iii-i added a commit to iii-i/cpython that referenced this issue Nov 17, 2023
iii-i added a commit to iii-i/cpython that referenced this issue Nov 17, 2023
iii-i added a commit to iii-i/cpython that referenced this issue Nov 17, 2023
iii-i added a commit to iii-i/cpython that referenced this issue Nov 17, 2023
iii-i added a commit to iii-i/cpython that referenced this issue Jan 29, 2024
@serhiy-storchaka
Member

On one hand, using the C code makes the code faster, and it may help to solve #89672.

On the other hand, see #112346: the zlib implementation can produce different results. We should solve that issue first, and ensure that similar errors will not happen here.

iii-i added a commit to iii-i/cpython that referenced this issue Feb 28, 2024
@iii-i
Contributor Author

iii-i commented Feb 28, 2024

@serhiy-storchaka, @rhpvorderman suggested that we should use a simpler approach to achieve the same result: instead of using the new C functions, let zlib generate both gzip header and gzip trailer, and strip the gzip header. I have implemented it here: #112199. It passes the #114116 test. Could you please take a look?
