Output CloudFront-friendly headers for diffs #1098

Merged
merged 1 commit into from
Mar 30, 2023

@Mr0grog Mr0grog commented Mar 30, 2023

This is part of web-monitoring#168. The goal here is to make our diff cache less important so we can shrink it or even remove it (in favor of just having CloudFront do the job).

CloudFront needs a `Date` header to go with the `Last-Modified` and `ETag` headers from the `stale?` method, and works better still if we give it an actual cache age. This adds the `expires_in` helper, which sets all the headers. I've also taken the strategy of setting a *relatively* short cache time but a much longer revalidation window when a stale response can be used, to balance concerns about updating the diff algorithm and effective long-term caching. I've also set a forever timeframe (100 years) if the request included a `?diff_version` query parameter (conceptually, you can say "I want a diff from any algorithm newer than `diff_version=2023-01-01`" and accept a cached copy from an older algorithm rather than regenerating a new one).
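To illustrate the strategy described above, here is a minimal sketch in plain Ruby (not the actual controller code from this PR). The `diff_cache_headers` function, the one-day cache time, and the 30-day window are all hypothetical; only the 100-year "forever" value and the `diff_version` parameter come from the description. It also assumes the "revalidation window" maps to the standard `stale-while-revalidate` directive, which may not match what the PR actually emits.

```ruby
require 'time'

# Assumed durations, in seconds. Only FOREVER (100 years) is from the PR text.
SHORT_MAX_AGE = 60 * 60 * 24             # 1 day (hypothetical)
STALE_WINDOW  = 60 * 60 * 24 * 30        # 30 days (hypothetical)
FOREVER       = 60 * 60 * 24 * 365 * 100 # 100 years

# Hypothetical helper: builds the caching headers a diff response might carry.
# `params` is a Hash of query parameters with String keys.
def diff_cache_headers(params, now: Time.now)
  cache_control =
    if params.key?('diff_version')
      # The caller pinned a minimum algorithm version, so a cached copy
      # stays acceptable effectively forever.
      "max-age=#{FOREVER}, public"
    else
      # Short freshness lifetime, but a long window in which a stale copy
      # may still be served while the cache revalidates.
      "max-age=#{SHORT_MAX_AGE}, public, stale-while-revalidate=#{STALE_WINDOW}"
    end

  # CloudFront wants a Date header alongside Last-Modified and ETag.
  { 'Date' => now.httpdate, 'Cache-Control' => cache_control }
end
```

In a Rails controller, the equivalent would presumably be a call to `expires_in` next to the existing `stale?` check; the exact options used in this PR are not shown here.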
@Mr0grog Mr0grog merged commit e85d7f6 into main Mar 30, 2023
@Mr0grog Mr0grog deleted the cloudfront-should-cache-diffs-but-it-does-not branch March 30, 2023 02:06
Mr0grog added a commit that referenced this pull request Mar 30, 2023
Mr0grog added a commit to edgi-govdata-archiving/web-monitoring-ops that referenced this pull request Mar 30, 2023
Mr0grog added a commit to edgi-govdata-archiving/web-monitoring-ops that referenced this pull request Mar 30, 2023
As part of putting things to rest and saving costs, we are removing the Redis cache and relying on CloudFront to do enough caching for us to solve the same use case. See also edgi-govdata-archiving/web-monitoring-db#1098.