Description
We can use a CDN to improve worldwide response times for serving documentation. Even if we improved server-side response times (for example by caching S3 requests locally on the webserver), we would still be left with the global latency between EU/US and elsewhere (at least 100ms). If we still need to optimize server-side response times after adding CDN caching, we can do that then.
Documentation is mostly static and can only change with a build, so it's a nearly perfect candidate for CDN caching. Only nearly, because we have rebuilds and we also add a header & footer.
I think we can leverage CDN caching for most parts of the site. If we actively invalidate caches and use a good CDN, we would still always serve up-to-date content.
page-types and invalidation events
cached forever, no invalidation needed:
- static assets with hashed filenames
can only change after a release of one specific crate:
- rustdoc pages (the header contains all versions of the crate)
- latest-version redirects
- release-internal redirects
can only change when we release new code:
- documentation pages etc
- styles
not really cacheable:
- search results
- release lists (only cacheable if we accept them being outdated for a certain amount of time)
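The grouping above can be written down as a small lookup from page type to invalidation event. This is only a sketch: `PageKind`, `InvalidateOn`, and the `crate-<name>` / `site` tag names are made up for illustration and are not existing docs.rs types or an agreed tag scheme.

```rust
/// Page types from the list above (illustrative names, not docs.rs types).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PageKind {
    HashedStaticAsset,
    RustdocPage { krate: String },
    LatestVersionRedirect { krate: String },
    ReleaseInternalRedirect { krate: String },
    SitePage,
    Style,
    SearchResults,
    ReleaseList,
}

/// The event after which a cached copy has to be purged.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum InvalidateOn {
    /// Cached forever, the URL changes when the content changes.
    Never,
    /// Purge after any release of the named crate.
    CrateRelease(String),
    /// Purge when new docs.rs code is released.
    Deploy,
    /// Not cached on the CDN at all.
    NotCached,
}

pub fn invalidate_on(page: &PageKind) -> InvalidateOn {
    match page {
        PageKind::HashedStaticAsset => InvalidateOn::Never,
        PageKind::RustdocPage { krate }
        | PageKind::LatestVersionRedirect { krate }
        | PageKind::ReleaseInternalRedirect { krate } => {
            InvalidateOn::CrateRelease(krate.clone())
        }
        PageKind::SitePage | PageKind::Style => InvalidateOn::Deploy,
        PageKind::SearchResults | PageKind::ReleaseList => InvalidateOn::NotCached,
    }
}

/// Hypothetical tag scheme for tag-based invalidation:
/// one tag per crate, one tag for everything tied to a deploy.
pub fn cache_tag(page: &PageKind) -> Option<String> {
    match invalidate_on(page) {
        InvalidateOn::CrateRelease(krate) => Some(format!("crate-{}", krate)),
        InvalidateOn::Deploy => Some("site".to_string()),
        InvalidateOn::Never | InvalidateOn::NotCached => None,
    }
}
```

With a scheme like this, a release of one crate would only need to purge a single tag, which is why tag-based invalidation is attractive here.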
requirements for the CDN
we need
- fast invalidation (tag-based if possible, path/pattern-based works for simple cases too).
- CDN-specific caching headers must be removed at the CDN level, so they never reach clients and we keep control over the cache at all times.
- logic at the edge to add CSP nonces.
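To illustrate the CSP requirement: if pages are served from cache, the nonce can no longer come from the origin and has to be added per response at the edge. The sketch below is not how the linked POC works and is not tied to any CDN SDK; `inject_nonce` and `csp_header` are hypothetical helpers, the string replacement is deliberately naive (a real implementation would use a streaming HTML rewriter), and the nonce is assumed to be freshly generated from a secure random source for every response.

```rust
/// Add a `nonce` attribute to every opening `<script` tag.
/// Naive: plain substring replacement, it does not parse the HTML.
pub fn inject_nonce(html: &str, nonce: &str) -> String {
    html.replace("<script", &format!("<script nonce=\"{}\"", nonce))
}

/// Build a matching `Content-Security-Policy` header value.
/// Only `script-src` is shown; a real policy would carry more directives.
pub fn csp_header(nonce: &str) -> String {
    format!("script-src 'nonce-{}'", nonce)
}
```

The cached body stays identical for all visitors; only the nonce in the attribute and in the header differs per response.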
Nice to have would be:
- serving stale content while updating the cache in the background.
- soft purge: serves stale content while updating the cache. This prevents a thundering herd problem when clearing the cache.
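As a rough sketch of what the origin could send to a CDN that meets these requirements: the header names `Surrogate-Control` and `Surrogate-Key` are the ones Fastly documents and strips before responding to clients; other CDNs use different names, so treat them as an assumption. `cdn_headers` is a hypothetical helper and the TTL values in the test are placeholders.

```rust
/// Build the CDN-facing response headers for a cacheable page.
///
/// `ttl_secs`   - how long the CDN may cache the response
/// `stale_secs` - how long it may serve stale content while refetching
/// `tags`       - cache tags used for tag-based invalidation
pub fn cdn_headers(
    ttl_secs: u64,
    stale_secs: u64,
    tags: &[&str],
) -> Vec<(&'static str, String)> {
    let mut headers = vec![(
        "Surrogate-Control",
        format!(
            "max-age={}, stale-while-revalidate={}",
            ttl_secs, stale_secs
        ),
    )];
    // Multiple tags are sent as one space-separated header value.
    if !tags.is_empty() {
        headers.push(("Surrogate-Key", tags.join(" ")));
    }
    headers
}
```

Because these headers are meant for the CDN only, the browser-facing `Cache-Control` header can be set independently.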
CloudFront
- invalidations are probably too expensive to execute on every release
- path/pattern invalidations are possible, tags not
- invalidations take minutes, sometimes up to 15.
- secret headers: I would need to research this, I don't have a definitive answer yet. Perhaps solvable with Lambda@Edge or CloudFront configuration.
- but we already have it
- Lambda@Edge could probably solve the CSP issue; I didn't dig deeper into its programming language support
Fastly
- invalidations are free and take around 100ms worldwide
- tag-based invalidations work
- secret headers work too
- soft purge and serving stale content are natively supported
- Fastly Compute@Edge has Rust support that we could use for CSP (POC: https://github.com/syphar/docs-rs-fastly-csp/blob/main/src/main.rs )
- PyPI gets Fastly for free, I guess we could get it too.
- but we don't have it yet; SSL needs work, and so do contracts.
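For reference, a tag-based purge on Fastly is a single API call per tag: `POST /service/{service_id}/purge/{surrogate_key}` with the API token in the `Fastly-Key` header, and `Fastly-Soft-Purge: 1` to mark content stale instead of evicting it. The sketch below only builds the request and does not send it; `PurgeRequest` and `purge_by_tag` are hypothetical names, and the service ID, token, and tag in any call are placeholders.

```rust
/// A purge request, described as data so any HTTP client can send it.
#[derive(Debug, PartialEq, Eq)]
pub struct PurgeRequest {
    pub method: &'static str,
    pub url: String,
    pub headers: Vec<(&'static str, String)>,
}

/// Build a Fastly purge-by-surrogate-key request for one tag.
pub fn purge_by_tag(service_id: &str, api_token: &str, tag: &str, soft: bool) -> PurgeRequest {
    let mut headers = vec![("Fastly-Key", api_token.to_string())];
    if soft {
        // Soft purge: keep serving the stale object while it is refreshed.
        headers.push(("Fastly-Soft-Purge", "1".to_string()));
    }
    PurgeRequest {
        method: "POST",
        url: format!(
            "https://api.fastly.com/service/{}/purge/{}",
            service_id, tag
        ),
        headers,
    }
}
```

After a crate release, the build process would issue one such request for that crate's tag.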
Cloudflare
I haven't dug deeper into the feature set here yet.
browser caches
Since we want to actively invalidate certain caches, and we cannot purge a browser cache, we won't cache these pages in the browser. Browser caching stays limited to static assets with hashed filenames, as it is currently.
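A minimal sketch of that split, as seen by the browser. The exact header values are an assumption, not what docs.rs currently sends, and `browser_cache_control` is a hypothetical helper.

```rust
/// Pick the browser-facing `Cache-Control` value.
pub fn browser_cache_control(is_hashed_static_asset: bool) -> &'static str {
    if is_hashed_static_asset {
        // The filename changes with the content, so the browser can keep it for a year.
        "public, max-age=31536000, immutable"
    } else {
        // Pages may be purged on the CDN at any time, so the browser must revalidate.
        "max-age=0"
    }
}
```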