diff --git a/http-web-services.html b/http-web-services.html
index f07ac409..9652561b 100755
--- a/http-web-services.html
+++ b/http-web-services.html
@@ -54,7 +54,7 @@
-HTTP is designed with caching in mind. There is an entire class of devices (called “caching proxies”) whose only job is to sit between you and the rest of the world and minimize network access. Your company or ISP almost certainly maintains caching proxies, even if you’re unaware of them. They work because caching built into the HTTP protocol.
+HTTP is designed with caching in mind. There is an entire class of devices (called “caching proxies”) whose only job is to sit between you and the rest of the world and minimize network access. Your company or ISP almost certainly maintains caching proxies, even if you’re unaware of them. They work because caching is built into the HTTP protocol.
Here’s a concrete example of how caching works. You visit diveintomark.org in your browser. That page includes a background image, wearehugh.com/m.jpg. When your browser downloads that image, the server includes the following HTTP headers:
@@ -264,10 +264,10 @@
ETag header.
Content-encoding header. Your request stated that you only accept uncompressed data (Accept-encoding: identity), and sure enough, this response contains uncompressed data.
-response.read(). As you can tell from the len() function, this downloads all 3070 bytes at once.
+response.read(). As you can tell from the len() function, this fetched a total of 3070 bytes.
-As you can see, this code is already inefficient: it asked for (and received) uncompressed data. I know for a fact that this server supports gzip compression, but HTTP compression is opt-in. We didn’t ask for it, so we didn’t get it. That means we’re downloading 3070 bytes when we could have just downloaded 941. Bad dog, no biscuit.
+As you can see, this code is already inefficient: it asked for (and received) uncompressed data. I know for a fact that this server supports gzip compression, but HTTP compression is opt-in. We didn’t ask for it, so we didn’t get it. That means we’re fetching 3070 bytes when we could have fetched 941. Bad dog, no biscuit.
But wait, it gets worse! To see just how inefficient this code is, let’s request the same feed a second time.
@@ -307,8 +307,8 @@
Cache-Control and Expires to allow caching, Last-Modified and ETag to enable “not-modified” tracking. Even the Vary: Accept-Encoding header hints that the server would support compression, if only you would ask for it. But you didn’t.
-HTTP is designed to work better than this. urllib
speaks HTTP like I speak Spanish — enough to get by in a jam, but not enough to hold a conversation. HTTP is a conversation. It’s time to upgrade to a library that speaks HTTP fluently.
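
For reviewers who want to see what "opt-in" means in practice, here is a minimal sketch (not part of this patch) of asking for gzip with the standard library alone. The helper names `build_request`, `decode_body` and `fetch` are hypothetical, as is any URL you pass in; the sketch also assumes the server answers with either `Content-Encoding: gzip` or no encoding at all.

```python
import gzip
import urllib.request


def build_request(url):
    """Build a request that opts in to gzip compression (hypothetical helper)."""
    request = urllib.request.Request(url)
    # Without this header the server is free to send uncompressed data.
    request.add_header("Accept-Encoding", "gzip")
    return request


def decode_body(raw, content_encoding):
    """Return the uncompressed body, given the Content-Encoding header value."""
    if content_encoding and content_encoding.strip().lower() == "gzip":
        return gzip.decompress(raw)
    # No encoding (or "identity"): the bytes are already the body.
    return raw


def fetch(url):
    """Fetch url, asking for gzip; return (bytes received, decoded body)."""
    with urllib.request.urlopen(build_request(url)) as response:
        raw = response.read()
        body = decode_body(raw, response.headers.get("Content-Encoding"))
    return len(raw), body
```

Calling `fetch()` on a feed URL would return the number of bytes that actually crossed the wire alongside the decoded body, which makes the saving from compression easy to compare against the uncompressed length. It does nothing about caching, `Last-Modified` or `ETag`, which is the larger point of the surrounding text.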