bpo-30576 : Add HTTP compression support to http.server #2078

Closed
wants to merge 67 commits into from
c9b880a
Code and documentation for HTTP compression support in http.server
Jun 9, 2017
481f69e
Merge remote-tracking branch 'upstream2/master'
Jun 9, 2017
5a3693e
Implementation change : if original file is big, store zipped content…
Jun 9, 2017
bed3047
Resolve conflict
Jun 9, 2017
d82b7cd
resolve conflict in NEWS
Jun 10, 2017
e9deb89
For gzipped content on disk, set file pointer at position 0.
Jun 10, 2017
39da21b
Merge branch 'master' into master
PierreQuentel Jun 11, 2017
f9963f0
Merge remote-tracking branch 'upstream2/master'
Jun 12, 2017
ace7999
resolve conflict in NEWS
Jun 12, 2017
128f6a6
Fix conflicts
Jun 12, 2017
586b8c3
Add HTTP compression test for big files
Jun 12, 2017
a61109e
Merge branch 'master' into master
PierreQuentel Jun 15, 2017
6093cae
Merge branch 'master' into master
PierreQuentel Jun 17, 2017
b27144a
Support more Accept-Encoding values, eg "*", "gzip;q=0.5", "GZIP"
Jun 18, 2017
3525f56
Merge remote-tracking branch 'upstream2/master'
Jun 18, 2017
1ad8139
Merge branch 'master' of https://github.com/PierreQuentel/cpython
Jun 18, 2017
8a916fc
More documentation
Jun 18, 2017
6f193f9
Remove trailing whitespace
Jun 18, 2017
655f753
Handle the case when storing the temporary gzip file fails, eg for la…
Jun 19, 2017
2433769
Add comment.
Jun 24, 2017
cfdcf93
Remove changes to mimetypes.py
Jul 1, 2017
9d8e668
Restore tests in test_httpservers, remove test with .json extension
Jul 1, 2017
b700014
Remove whitespaces
Jul 1, 2017
6a31d60
Remove whitespaces
Jul 1, 2017
9830b89
Remove whitespaces
Jul 1, 2017
6466c93
Merge remote-tracking branch 'upstream2/master'
Jul 25, 2017
89de0fe
Disable HTTP compression by default. Add command line option --gzip t…
Jul 25, 2017
211bf66
Remove trailing whitespace
Jul 25, 2017
a7f1f47
Only apply chunk transfer for HTTP/1.1 ; change implementation of com…
Jul 27, 2017
79cec36
Simplify code for HTTP compression
Jul 28, 2017
83f6082
Add tests on presence of Transfer-Encoding header
Jul 28, 2017
329d5f8
Merge remote-tracking branch 'upstream2/master'
Jul 28, 2017
2ca373f
Add entry in NEWS.d
Jul 28, 2017
8fbc454
Merge remote-tracking branch 'upstream2/master'
Jul 28, 2017
e51cb95
Merge remote-tracking branch 'upstream2/master'
Aug 2, 2017
cfb599d
Remove variable has_gzip, set gzip to None in case of ImportError. Us…
Aug 2, 2017
92a47f7
Merge remote-tracking branch 'upstream2/master'
Aug 13, 2017
75b78a1
By default, "deflate" is supported besides "gzip" ; other compression…
Aug 13, 2017
5db66e0
Add test for user-defined compression (bzip2).
Aug 13, 2017
059a745
Remove trailing whitespace.
Aug 13, 2017
c099836
Merge remote-tracking branch 'upstream2/master'
Sep 13, 2017
396866d
Update http.server documentation
Sep 15, 2017
553c920
Merge remote-tracking branch 'upstream2/master'
Sep 15, 2017
9fdecb1
Resolve conflict
Sep 15, 2017
f0293fc
Restore Lib/test/bisect.py
Sep 15, 2017
fc2d9ca
Remove Misc/NEWS
Sep 15, 2017
baf88e4
Remove unused import (tempfile)
Oct 1, 2017
2bce4ef
Replace "as argument" by "as an argument"
Oct 1, 2017
1bdbb3a
Remove unused import (gzip)
Oct 1, 2017
27d02bd
If zlib is not available, set SimpleHTTPRequestHandler.compressions t…
Oct 1, 2017
ca68881
Replace sorted(...)[-1] by max(...)
Oct 1, 2017
d6157d4
For shutil.rmtree, instead of a try/except, set argument ignore_error…
Oct 1, 2017
a456fe6
Replace assertTrue(a in b) by assertIn(a, b) and assertFalse(a in b) …
Oct 1, 2017
50f0a85
Remove default_request_version
Oct 1, 2017
87a2cc3
Merge remote-tracking branch 'upstream2/master'
Oct 1, 2017
b9f9599
Restore Lib/test/bisect.py
Oct 1, 2017
7881e1f
Compressed data generators may send empty bytes ; adapt do_GET() for …
Oct 1, 2017
2332d91
Remove trace
Oct 1, 2017
70f4738
If Accept-Encoding is set to *, use one of the supported compressions…
Oct 1, 2017
d44c8bf
Minor changes in comments and code formatting.
Oct 1, 2017
d844ca6
Adapt test with Accept-Encoding set to "*"
Oct 1, 2017
634f572
Merge remote-tracking branch 'upstream2/master'
Oct 11, 2017
7cfa86e
Handle missing zlib or bz2 modules
Oct 11, 2017
829b992
Replace command line option "--gzip" by "--compressed"
Oct 11, 2017
383ee33
Improve documentation for attribute "compressions" ; replace command …
Oct 11, 2017
9d04caa
Merge branch 'master' into master
PierreQuentel Apr 16, 2018
71e9090
Merge branch 'master' into master
PierreQuentel Jul 30, 2018
41 changes: 38 additions & 3 deletions Doc/library/http.server.rst
Expand Up @@ -275,7 +275,6 @@ provides three different variants:
the message to :meth:`log_message`, so it takes the same arguments
(*format* and additional values).


.. method:: log_message(format, ...)

Logs an arbitrary message to ``sys.stderr``. This is typically overridden
Expand Down Expand Up @@ -338,6 +337,29 @@ provides three different variants:

If not specified, the directory to serve is the current working directory.

.. attribute:: compressed_types

The list of content types for which HTTP compression is applied. It
defaults to the empty list, which disables compression. It can be set
to a list of types, for instance
``["text/plain", "text/html", "text/css"]``. A list of commonly
compressed types is provided as ``commonly_compressed_types`` at the
module level.

.. versionadded:: 3.7

.. attribute:: compressions

A mapping from a content encoding (e.g. "gzip") to a generator function
that takes a file object as an argument, reads the file content in
chunks, compresses each chunk with the specified encoding, and yields
the compressed data as bytes objects.
By default, if the :mod:`zlib` module is available, the "gzip" and
"deflate" encodings are supported. To support other algorithms,
:attr:`compressions` can be extended.

.. versionadded:: 3.7

The :class:`SimpleHTTPRequestHandler` class defines the following methods:

.. method:: do_HEAD()
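As an illustration of the ``compressions`` attribute documented in this hunk, here is a minimal sketch of a user-defined producer. The names `bz2_producer` and `Bz2Handler` and the `"bzip2"` encoding token are hypothetical, and `compressed_types` / `compressions` are attributes proposed by this pull request, so a released `http.server` would simply ignore them.

```python
import bz2
import io
from http.server import SimpleHTTPRequestHandler


def bz2_producer(fileobj, bufsize=64 * 1024):
    """Yield the content of fileobj compressed with bzip2, chunk by chunk."""
    compressor = bz2.BZ2Compressor()
    with fileobj:
        while True:
            buf = fileobj.read(bufsize)
            if not buf:  # end of file: emit whatever is still buffered
                yield compressor.flush()
                return
            # May yield b"" while the compressor is still buffering input
            yield compressor.compress(buf)


class Bz2Handler(SimpleHTTPRequestHandler):
    # Assumption: both attributes only exist with this pull request applied.
    compressed_types = ["text/plain", "text/html", "text/css"]
    compressions = {"bzip2": bz2_producer}


# Round trip outside of any server, to check the generator on its own.
payload = b"some text to compress\n" * 1000
compressed = b"".join(bz2_producer(io.BytesIO(payload)))
assert bz2.decompress(compressed) == payload
```

The producer follows the same shape as the zlib-based ones in this pull request: it owns the file object, yields possibly empty byte strings, and flushes at end of file.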
Expand Down Expand Up @@ -366,6 +388,12 @@ provides three different variants:
type is guessed by calling the :meth:`guess_type` method, which in turn
uses the *extensions_map* variable, and the file contents are returned.

If the content type is in the list ``compressed_types``, and if the
user agent has sent an ``'Accept-Encoding'`` header that includes one of
the encodings in :attr:`compressions`, a ``'Content-Encoding'`` header
set to the selected encoding is sent and the file content is compressed
accordingly. Files of 512 KiB or more are compressed on the fly and,
with HTTP/1.1, sent using chunked transfer encoding.

A ``'Content-type:'`` header with the guessed content type is output,
followed by a ``'Content-Length:'`` header with the file's size and a
``'Last-Modified:'`` header with the file's modification time.
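From the client side, the negotiation described here can be observed with a short script. The sketch below is self-contained and does not depend on this pull request: `DemoGzipHandler` and `fetch` are hypothetical names, and the handler compresses the body itself as a stand-in for the proposed behaviour.

```python
import gzip
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"<p>hello</p>" * 500


class DemoGzipHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in: gzip the body when the client accepts it."""

    def do_GET(self):
        accepted = self.headers.get("Accept-Encoding", "")
        body = BODY
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        if "gzip" in accepted.lower():
            body = gzip.compress(BODY)
            self.send_header("Content-Encoding", "gzip")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(port, headers):
    """Return (Content-Encoding header, raw body) for GET /."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/", headers=headers)
        resp = conn.getresponse()
        return resp.getheader("Content-Encoding"), resp.read()
    finally:
        conn.close()


server = HTTPServer(("127.0.0.1", 0), DemoGzipHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    port = server.server_address[1]
    compressed_result = fetch(port, {"Accept-Encoding": "gzip"})
    plain_result = fetch(port, {})  # http.client then sends "identity"
finally:
    server.shutdown()
    server.server_close()

assert compressed_result[0] == "gzip"
assert gzip.decompress(compressed_result[1]) == BODY
assert plain_result == (None, BODY)
```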
Expand All @@ -377,8 +405,9 @@ provides three different variants:
For example usage, see the implementation of the :func:`test` function
invocation in the :mod:`http.server` module.

.. versionchanged:: 3.7
Support of the ``'If-Modified-Since'`` header.
.. versionchanged:: 3.7
Added support for the ``'If-Modified-Since'`` header and for HTTP
compression.

The :class:`SimpleHTTPRequestHandler` class can be used in the following
manner in order to create a very basic webserver serving files relative to
Expand Down Expand Up @@ -421,6 +450,12 @@ the following command uses a specific directory::
.. versionadded:: 3.7
``--directory`` specify alternate directory

By default, HTTP compression is disabled. Passing the ``--compressed``
option enables compression for the content types listed in
``commonly_compressed_types``.

.. versionadded:: 3.7

.. class:: CGIHTTPRequestHandler(request, client_address, server)

This class is used to serve either files or output of CGI scripts from the
Expand Down
14 changes: 14 additions & 0 deletions Doc/whatsnew/3.7.rst
Expand Up @@ -962,6 +962,16 @@ With this parameter, the server serves the specified directory, by default it
uses the current working directory.
(Contributed by Stéphane Wirtel and Julien Palard in :issue:`28707`.)

Added support for HTTP compression: if the user agent sends an
``Accept-Encoding`` header that includes a supported encoding such as gzip,
and if the content type guessed from the file extension is listed in the
``compressed_types`` attribute of
:class:`~http.server.SimpleHTTPRequestHandler`, the response is sent
compressed, with the ``Content-Encoding`` header set to the selected encoding.
Compression is disabled by default.
(Contributed by Pierre Quentel in :issue:`30576`.)

The new :class:`ThreadingHTTPServer <http.server.ThreadingHTTPServer>` class
uses threads to handle requests using :class:`~socketserver.ThreadingMixin`.
It is used when ``http.server`` is run with ``-m``.
Expand Down Expand Up @@ -1085,6 +1095,7 @@ configuration passed to :func:`logging.config.fileConfig`.


math
----

The new :func:`math.remainder` function implements the IEEE 754-style remainder
Expand All @@ -1099,7 +1110,10 @@ The MIME type of .bmp has been changed from ``'image/x-ms-bmp'`` to
(Contributed by Nitish Chandra in :issue:`22589`.)


msilib
------

The new :meth:`Database.Close() <msilib.Database.Close>` method can be used
Expand Down
173 changes: 171 additions & 2 deletions Lib/http/server.py
Expand Up @@ -92,6 +92,7 @@
import email.utils
import html
import http.client
import http.cookiejar
import io
import mimetypes
import os
Expand All @@ -107,6 +108,11 @@

from http import HTTPStatus

# Python might be built without zlib
try:
    import zlib
except ImportError:
    zlib = None

# Default error message template
DEFAULT_ERROR_MESSAGE = """\
Expand All @@ -128,6 +134,71 @@

DEFAULT_ERROR_CONTENT_TYPE = "text/html;charset=utf-8"

# List of commonly compressed content types, copied from
# https://github.com/h5bp/server-configs-apache.
# SimpleHTTPRequestHandler.compressed_types is set to this list when the
# server is started with command line option --compressed.
commonly_compressed_types = [
"application/atom+xml",
"application/javascript",
"application/json",
"application/ld+json",
"application/manifest+json",
"application/rdf+xml",
"application/rss+xml",
"application/schema+json",
"application/vnd.geo+json",
"application/vnd.ms-fontobject",
"application/x-font-ttf",
"application/x-javascript",
"application/x-web-app-manifest+json",
"application/xhtml+xml",
"application/xml",
"font/eot",
"font/opentype",
"image/bmp",
"image/svg+xml",
"image/vnd.microsoft.icon",
"image/x-icon",
"text/cache-manifest",
"text/css",
"text/html",
"text/javascript",
"text/plain",
"text/vcard",
"text/vnd.rim.location.xloc",
"text/vtt",
"text/x-component",
"text/x-cross-domain-policy",
"text/xml"
]

# Generators for HTTP compression

def _zlib_producer(fileobj, wbits):
    """Generator that yields data read from the file object fileobj,
    compressed with the zlib library.
    wbits is the same argument as for zlib.compressobj.
    """
    bufsize = 2 << 17
    producer = zlib.compressobj(wbits=wbits)
    with fileobj:
        while True:
            buf = fileobj.read(bufsize)
            if not buf: # end of file
                yield producer.flush()
                return
            yield producer.compress(buf)

def _gzip_producer(fileobj):
    """Generator for gzip compression."""
    # wbits 25 = 16 + 9: gzip header and trailer, 2**9 byte window
    return _zlib_producer(fileobj, 25)

def _deflate_producer(fileobj):
    """Generator for deflate compression."""
    # wbits 15: zlib header and trailer, 2**15 byte window
    return _zlib_producer(fileobj, 15)


class HTTPServer(socketserver.TCPServer):

allow_reuse_address = 1 # Seems to make sense in testing environment
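To make the role of `wbits` in the producers concrete, here is a small standalone sketch. `compress_stream` is a hypothetical name that mirrors the streaming pattern of `_zlib_producer`; the values 25 and 15 are the ones used in this pull request, and the round trip relies only on the standard `zlib` and `gzip` modules.

```python
import gzip
import io
import zlib


def compress_stream(fileobj, wbits, bufsize=64 * 1024):
    """Yield fileobj's content compressed incrementally with zlib."""
    compressor = zlib.compressobj(wbits=wbits)
    while True:
        buf = fileobj.read(bufsize)
        if not buf:  # end of file: flush the compressor's internal buffer
            yield compressor.flush()
            return
        yield compressor.compress(buf)


payload = b"hello http.server " * 1000

# wbits in 25..31 produces a gzip container (16 + window size logarithm).
gzip_body = b"".join(compress_stream(io.BytesIO(payload), 25))
# wbits in 9..15 produces a zlib container, which HTTP calls "deflate".
deflate_body = b"".join(compress_stream(io.BytesIO(payload), 15))

assert gzip_body[:2] == b"\x1f\x8b"          # gzip magic number
assert gzip.decompress(gzip_body) == payload
assert zlib.decompress(deflate_body) == payload
```

A larger window (for example `wbits=31` for gzip) would usually compress better at the cost of more memory; the sketch keeps the values from the diff.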
Expand Down Expand Up @@ -639,6 +710,22 @@ class SimpleHTTPRequestHandler(BaseHTTPRequestHandler):

    server_version = "SimpleHTTP/" + __version__

    # List of Content Types that are returned with HTTP compression.
    # Set to the empty list by default (no compression).
    compressed_types = []

    # Dictionary mapping an encoding (in an Accept-Encoding header) to a
    # generator of compressed data. By default, provided zlib is available,
    # the supported encodings are gzip and deflate.
    # Override if a subclass wants to use other compression algorithms.
    compressions = {}
    if zlib:
        compressions = {
            'deflate': _deflate_producer,
            'gzip': _gzip_producer,
            'x-gzip': _gzip_producer # alias for gzip
        }

    def __init__(self, *args, directory=None, **kwargs):
        if directory is None:
            directory = os.getcwd()
Expand All @@ -650,7 +737,19 @@ def do_GET(self):
        f = self.send_head()
        if f:
            try:
                self.copyfile(f, self.wfile)
                if hasattr(f, "read"):
                    self.copyfile(f, self.wfile)
                else:
                    # Generator for compressed data
                    if self.protocol_version >= "HTTP/1.1":
                        # Chunked Transfer
                        for data in f:
                            if data:
                                self.wfile.write(self._make_chunk(data))
                        self.wfile.write(self._make_chunk(b''))
                    else:
                        for data in f:
                            self.wfile.write(data)
            finally:
                f.close()
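The chunked branch of `do_GET` frames each piece of compressed data as a chunk and ends the body with a zero-length chunk. The following standalone sketch shows that framing and a matching decoder; `make_chunk`, `write_chunked` and `read_chunked` are hypothetical helper names, and the decoder ignores trailers and chunk extensions.

```python
import io


def make_chunk(data):
    """Frame data as one chunk: hex size, CRLF, payload, CRLF."""
    return f"{len(data):X}".encode("ascii") + b"\r\n" + data + b"\r\n"


def write_chunked(pieces, out):
    """Write an iterable of bytes objects to out as a chunked body."""
    for piece in pieces:
        if piece:  # an empty chunk would terminate the body too early
            out.write(make_chunk(piece))
    out.write(make_chunk(b""))  # the zero-length chunk ends the body


def read_chunked(stream):
    """Decode a chunked body (no trailers, no chunk extensions)."""
    body = bytearray()
    while True:
        size = int(stream.readline().strip(), 16)
        if size == 0:
            stream.readline()  # consume the final CRLF
            return bytes(body)
        body += stream.read(size)
        stream.readline()  # consume the CRLF that follows the payload


buffer = io.BytesIO()
write_chunked([b"hello ", b"", b"world"], buffer)
wire = buffer.getvalue()
assert wire == b"6\r\nhello \r\n5\r\nworld\r\n0\r\n\r\n"
assert read_chunked(io.BytesIO(wire)) == b"hello world"
```

Skipping empty pieces matters because a compressor may return `b""` while it is still buffering input.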

Expand All @@ -660,6 +759,10 @@ def do_HEAD(self):
        if f:
            f.close()

    def _make_chunk(self, data):
        """Make a data chunk in Chunked Transfer Encoding format."""
        return f"{len(data):X}".encode("ascii") + b"\r\n" + data + b"\r\n"

    def send_head(self):
        """Common code for GET and HEAD commands.

Expand Down Expand Up @@ -700,6 +803,7 @@ def send_head(self):

try:
fs = os.fstat(f.fileno())
content_length = fs[6]
# Use browser cache if possible
if ("If-Modified-Since" in self.headers
and "If-None-Match" not in self.headers):
Expand Down Expand Up @@ -730,9 +834,67 @@ def send_head(self):

            self.send_response(HTTPStatus.OK)
            self.send_header("Content-type", ctype)
            self.send_header("Content-Length", str(fs[6]))
            self.send_header("Last-Modified",
                self.date_time_string(fs.st_mtime))

            if ctype not in self.compressed_types:
                self.send_header("Content-Length", str(content_length))
                self.end_headers()
                return f

            # Use HTTP compression if possible

            # Get accepted encodings ; "encodings" is a dictionary mapping
            # encodings to their quality ; eg for header "gzip; q=0.8",
            # encodings["gzip"] is set to 0.8
            accept_encoding = self.headers.get_all("Accept-Encoding", ())
            encodings = {}
            for accept in http.cookiejar.split_header_words(accept_encoding):
                params = iter(accept)
                encoding = next(params, ("", ""))[0]
                quality, value = next(params, ("", ""))
                if quality == "q" and value:
                    try:
                        q = float(value)
                    except ValueError:
                        # Invalid quality : ignore encoding
                        q = 0
                else:
                    q = 1 # quality defaults to 1
                if q:
                    encodings[encoding] = max(encodings.get(encoding, 0), q)

            compressions = set(encodings).intersection(self.compressions)
            compression = None
            if compressions:
                # Take the encoding with highest quality
                compression = max((encodings[enc], enc)
                    for enc in compressions)[1]
            elif '*' in encodings and self.compressions:
                # If no specified encoding is supported but "*" is accepted,
                # take one of the available compressions.
                compression = list(self.compressions)[0]
            if compression:
                # If at least one encoding is accepted, send data compressed
                # with the selected compression algorithm.
                producer = self.compressions[compression]
                self.send_header("Content-Encoding", compression)
                if content_length < 2 << 18:
                    # For small files, load content in memory
                    with f:
                        content = b''.join(producer(f))
                    content_length = len(content)
                    f = io.BytesIO(content)
                else:
                    chunked = self.protocol_version >= "HTTP/1.1"
                    if chunked:
                        # Use Chunked Transfer Encoding (RFC 7230 section 4.1)
                        self.send_header("Transfer-Encoding", "chunked")
                    self.end_headers()
                    # Return a generator of pieces of compressed data
                    return producer(f)

            self.send_header("Content-Length", str(content_length))
            self.end_headers()
            return f
except:
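The `Accept-Encoding` negotiation in `send_head` can be tried in isolation. The sketch below is not the code from this pull request: `parse_accept_encoding` and `choose_encoding` are hypothetical helpers, and lower-casing the tokens is an assumption of the sketch. It uses the same `http.cookiejar.split_header_words` call as the diff.

```python
from http.cookiejar import split_header_words


def parse_accept_encoding(header_values):
    """Map each acceptable encoding to its quality value (q > 0 only)."""
    qualities = {}
    for words in split_header_words(header_values):
        encoding = words[0][0].lower()
        q = 1.0  # quality defaults to 1 when no q parameter is given
        for name, value in words[1:]:
            if name.lower() == "q" and value:
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # invalid quality: ignore this encoding
        if q > 0:
            qualities[encoding] = max(qualities.get(encoding, 0.0), q)
    return qualities


def choose_encoding(header_values, supported):
    """Pick the supported encoding with the highest quality, or None."""
    qualities = parse_accept_encoding(header_values)
    candidates = [enc for enc in supported if enc in qualities]
    if candidates:
        return max(candidates, key=lambda enc: qualities[enc])
    if "*" in qualities and supported:
        return supported[0]  # "*" accepts any encoding the server offers
    return None


supported = ["gzip", "deflate"]
assert parse_accept_encoding(["GZIP; q=0.8"]) == {"gzip": 0.8}
assert choose_encoding(["gzip;q=0.5, deflate"], supported) == "deflate"
assert choose_encoding(["*"], supported) == "gzip"
assert choose_encoding(["gzip;q=0"], supported) is None
```

`header_values` is a list of raw header strings, which matches what `self.headers.get_all("Accept-Encoding", ())` returns in the handler.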
Expand Down Expand Up @@ -1249,14 +1411,21 @@ def test(HandlerClass=BaseHTTPRequestHandler,
    parser.add_argument('--directory', '-d', default=os.getcwd(),
                        help='Specify alternative directory '
                        '[default:current directory]')
    parser.add_argument('--compressed', action="store_true",
                        help="Enable HTTP compression")
    parser.add_argument('port', action='store',
                        default=8000, type=int,
                        nargs='?',
                        help='Specify alternate port [default: 8000]')
    args = parser.parse_args()
    if args.cgi:
        handler_class = CGIHTTPRequestHandler
    elif args.compressed and zlib:
        class GzipHandler(SimpleHTTPRequestHandler):
            compressed_types = commonly_compressed_types
        # Pass --directory through so it still applies with --compressed
        handler_class = partial(GzipHandler, directory=args.directory)
    else:
        handler_class = partial(SimpleHTTPRequestHandler,
                                directory=args.directory)

    test(HandlerClass=handler_class, port=args.port, bind=args.bind)
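One way to combine `--compressed` with `--directory` is to pick the handler class first and bind the directory afterwards. This is a sketch under assumptions: `build_handler` and `CompressedHandler` are hypothetical names, and `compressed_types` is the attribute proposed by this pull request, which a released `http.server` ignores.

```python
import argparse
import os
from functools import partial
from http.server import SimpleHTTPRequestHandler


class CompressedHandler(SimpleHTTPRequestHandler):
    # Assumption: only meaningful with this pull request applied.
    compressed_types = ["text/html", "text/css", "text/plain"]


def build_handler(argv):
    """Return a handler factory for the given command line arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--directory", "-d", default=os.getcwd())
    parser.add_argument("--compressed", action="store_true")
    args = parser.parse_args(argv)
    base = CompressedHandler if args.compressed else SimpleHTTPRequestHandler
    # Binding the directory here keeps both options independent.
    return partial(base, directory=args.directory)


handler = build_handler(["--compressed", "--directory", "/tmp"])
assert handler.func is CompressedHandler
assert handler.keywords == {"directory": "/tmp"}
```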