full page caching for rustdoc pages in the CDN #1552

Closed
@syphar


We can use a CDN to improve worldwide speed for serving documentation. Even if we improved server-side response times (for example by caching S3 requests locally on the webserver), we would still be looking at the global latency between EU/US and elsewhere (at least 100ms). If we still need to optimize server-side response times after CDN caching, we can do it then.

Documentation is mostly static and can only change with a build, so it's a nearly perfect candidate for CDN caching. It's only nearly perfect because we have rebuilds and because we add a header & footer.

I think we can leverage CDN caching for most parts of the site. If we actively invalidate caches and use a good CDN, we would still always serve up-to-date content.

page-types and invalidation events

cached forever, no invalidation needed:

  • static assets with hashed filenames

can only change after a release of one specific crate:

  • rustdoc pages (the header contains all versions of the crate)
  • latest-version redirects
  • release-internal redirects

can only change when we release new code:

  • documentation pages etc
  • styles

not really cacheable:

  • search-results
  • release-lists (only cacheable if we accept them being outdated for a certain amount of time)
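The grouping above could be written down as a small lookup from page type to invalidation event. This is only a sketch to make the categories concrete; the type and function names (`PageKind`, `Invalidation`, `invalidation_for`) are made up for illustration and are not existing docs.rs code.

```rust
/// Page types from the list above (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PageKind {
    HashedStaticAsset,
    RustdocPage,
    LatestVersionRedirect,
    ReleaseInternalRedirect,
    SitePage, // "documentation pages etc"
    Styles,
    SearchResults,
    ReleaseList,
}

/// The event after which a cached copy may be outdated.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Invalidation {
    Never,
    OnCrateRelease,
    OnDeploy,
    NotCached,
}

fn invalidation_for(kind: PageKind) -> Invalidation {
    match kind {
        PageKind::HashedStaticAsset => Invalidation::Never,
        PageKind::RustdocPage
        | PageKind::LatestVersionRedirect
        | PageKind::ReleaseInternalRedirect => Invalidation::OnCrateRelease,
        PageKind::SitePage | PageKind::Styles => Invalidation::OnDeploy,
        PageKind::SearchResults | PageKind::ReleaseList => Invalidation::NotCached,
    }
}

fn main() {
    let kinds = [
        PageKind::HashedStaticAsset,
        PageKind::RustdocPage,
        PageKind::LatestVersionRedirect,
        PageKind::ReleaseInternalRedirect,
        PageKind::SitePage,
        PageKind::Styles,
        PageKind::SearchResults,
        PageKind::ReleaseList,
    ];
    for kind in kinds {
        println!("{:?} -> {:?}", kind, invalidation_for(kind));
    }
}
```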

requirements for the CDN

we need:

  • fast invalidation (tag-based if possible; path/pattern-based works for simple cases too).
  • CDN-specific caching headers that are removed at the CDN level before the response reaches the client (so we keep control over the cache at all times)
  • logic at the edge to add CSP nonces
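To illustrate the CSP requirement: a fully cached page cannot carry a fixed nonce, so one possible approach is for the origin to emit a placeholder that the edge replaces with a fresh nonce on every response. The sketch below assumes that approach; the placeholder string, the `Response` struct and `inject_nonce` are hypothetical, the nonce itself is expected to come from a random source at the edge, and this is not necessarily how the linked POC works.

```rust
/// Hypothetical placeholder the origin would write instead of a real nonce.
const NONCE_PLACEHOLDER: &str = "__CSP_NONCE__";

/// Minimal stand-in for a cached response (only the parts that carry the nonce).
struct Response {
    csp_header: String,
    body: String,
}

/// Builds the response for one request by substituting a per-request nonce
/// into both the CSP header and the HTML body. The cached copy is untouched.
fn inject_nonce(cached: &Response, nonce: &str) -> Response {
    Response {
        csp_header: cached.csp_header.replace(NONCE_PLACEHOLDER, nonce),
        body: cached.body.replace(NONCE_PLACEHOLDER, nonce),
    }
}

fn main() {
    let cached = Response {
        csp_header: format!("script-src 'nonce-{}'", NONCE_PLACEHOLDER),
        body: format!("<script nonce=\"{}\"></script>", NONCE_PLACEHOLDER),
    };
    // In a real edge handler the nonce would be freshly generated per request.
    let served = inject_nonce(&cached, "abc123");
    println!("{}", served.csp_header);
    println!("{}", served.body);
}
```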

Nice to have would be:

  • serving stale content while updating the cache in the background.
  • soft purge: serves stale content while updating the cache. This prevents the thundering herd problem when clearing the cache.

CloudFront

  • invalidations are probably too expensive to run on every release
  • path/pattern invalidations are possible, tag-based ones are not
  • invalidations take minutes, sometimes 15 minutes.
  • secret (CDN-only) headers: I would need to research this, I don't have a definitive answer yet. Perhaps solvable with Lambda@Edge or CF configuration.
  • but we already have it
  • Lambda@Edge could probably solve the CSP issue; I didn't dig deeper into its programming language support
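Since CloudFront only offers path/pattern invalidations, a release of one crate would have to be translated into a set of path patterns. A rough sketch of that translation follows; it assumes rustdoc pages live under `/{crate}/...` and crate pages under `/crate/{crate}/...`, and `invalidation_paths` is a hypothetical helper. CloudFront invalidation paths start with `/` and may end in a `*` wildcard.

```rust
/// Returns the path patterns that would need to be invalidated after a
/// release of `krate`. The URL layout used here is an assumption.
fn invalidation_paths(krate: &str) -> Vec<String> {
    vec![
        // latest-version redirect and rustdoc pages
        format!("/{krate}"),
        format!("/{krate}/*"),
        // crate detail pages
        format!("/crate/{krate}"),
        format!("/crate/{krate}/*"),
    ]
}

fn main() {
    for path in invalidation_paths("serde") {
        println!("{path}");
    }
}
```

Each release would need all of these patterns, which is part of why the cost and the invalidation delay matter here.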

Fastly

  • invalidations are free, and take 100ms worldwide
  • tag-based invalidations work
  • secret headers work too
  • soft purge and serving stale content are natively supported
  • Fastly Compute@Edge has Rust support we could use for CSP (POC: https://github.com/syphar/docs-rs-fastly-csp/blob/main/src/main.rs )
  • PyPI gets Fastly for free, I guess we could get it too.
  • but we don't have it yet; SSL needs work, contracts too.
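For tag-based invalidation on Fastly, the origin would tag each response with a `Surrogate-Key` header and later purge by that key through the purge API (`POST /service/{service_id}/purge/{key}`, authenticated with a `Fastly-Key` header, with `Fastly-Soft-Purge: 1` for a soft purge). The sketch below only builds a description of such a request and sends nothing; the `crate-{name}` key scheme, the `PurgeRequest` struct and both functions are assumptions for illustration.

```rust
/// Plain description of an HTTP request; nothing is sent in this sketch.
struct PurgeRequest {
    method: &'static str,
    url: String,
    headers: Vec<(&'static str, String)>,
}

/// Hypothetical tagging scheme: one surrogate key per crate, sent by the
/// origin in the `Surrogate-Key` response header of every page of that crate.
fn crate_surrogate_key(name: &str) -> String {
    format!("crate-{name}")
}

/// Builds the purge-by-key request for one surrogate key.
fn purge_request(service_id: &str, api_token: &str, key: &str, soft: bool) -> PurgeRequest {
    let mut headers = vec![("Fastly-Key", api_token.to_string())];
    if soft {
        // Soft purge marks content as stale instead of evicting it.
        headers.push(("Fastly-Soft-Purge", "1".to_string()));
    }
    PurgeRequest {
        method: "POST",
        url: format!("https://api.fastly.com/service/{service_id}/purge/{key}"),
        headers,
    }
}

fn main() {
    let key = crate_surrogate_key("serde");
    let req = purge_request("SERVICE_ID", "API_TOKEN", &key, true);
    println!("{} {}", req.method, req.url);
    for (name, value) in &req.headers {
        println!("{name}: {value}");
    }
}
```

With this, a new release of a crate would trigger a single purge request for that crate's key, independent of how many pages are cached.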

Cloudflare

I haven't dug deeper into the feature set here yet.

browser caches

Since we want to actively invalidate certain caches, we won't cache these pages in the browser; browser caching stays limited to static assets with hashed filenames, as it is currently.
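The browser/CDN split could be expressed in response headers roughly as follows. This is a sketch under the assumption that the CDN honors a separate `Surrogate-Control` header for its own cache lifetime (Fastly documents this header) while browsers only see `Cache-Control`; the function name and the concrete durations are placeholders, not decided values.

```rust
/// Returns the caching headers the origin would send.
/// Durations are illustrative placeholders.
fn cache_headers(hashed_static_asset: bool) -> Vec<(&'static str, &'static str)> {
    if hashed_static_asset {
        // Content never changes under the same filename, so browsers and
        // the CDN can both keep it.
        vec![("Cache-Control", "public, max-age=31536000, immutable")]
    } else {
        vec![
            // Browsers do not reuse the page without asking again.
            ("Cache-Control", "max-age=0"),
            // The CDN keeps the page until it expires or we purge it.
            ("Surrogate-Control", "max-age=604800"),
        ]
    }
}

fn main() {
    for is_asset in [true, false] {
        println!("hashed static asset: {is_asset}");
        for (name, value) in cache_headers(is_asset) {
            println!("  {name}: {value}");
        }
    }
}
```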

Metadata

    Labels

    A-backend (Area: Webserver backend)
    C-enhancement (Category: This is a new feature)
    E-medium (Effort: This requires a fair amount of work)
