Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

caching: Validate stale cache entries and update cached headers #12232

Merged
merged 26 commits into from
Aug 7, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
5e93c44
Requests with conditional headers bypass the CacheFilter
yosrym93 Jun 30, 2020
73bba3a
Implemented injecting conditional headers in the request to validate …
yosrym93 Jun 30, 2020
659b0e7
Responses to cache validation requests are now checked and handled co…
yosrym93 Jul 6, 2020
0f062bc
Added integration tests for cache entry validation, fixed a few bugs …
yosrym93 Jul 9, 2020
a224c6e
Updating cached responses is now RFC-compliant, also added ASSERTS fo…
yosrym93 Jul 20, 2020
7adb280
Merge remote-tracking branch 'upstream/master' into validation
yosrym93 Jul 22, 2020
d228ada
Improved code readability and style
yosrym93 Jul 22, 2020
706e43b
Changed conditionalHeaders from a vector of const pointers to a vecto…
yosrym93 Jul 22, 2020
08d0622
Added fall through marker comment for GCC
yosrym93 Jul 22, 2020
ffb5d96
Improved readability and style
yosrym93 Jul 23, 2020
ea36c44
If-Modified-Since validation header falls back to Date if Last-Modifi…
yosrym93 Jul 23, 2020
c46bdb5
Strayed trace statements
yosrym93 Jul 24, 2020
bdd8e4f
Code readability improvements
yosrym93 Jul 25, 2020
f46b108
Redesigned state management in the CacheFilter -- now it is simpler a…
yosrym93 Jul 25, 2020
5316ead
Merge remote-tracking branch 'upstream/master' into validation
yosrym93 Jul 27, 2020
dcd2ada
Code clean up
yosrym93 Jul 28, 2020
6c8edce
Merge remote-tracking branch 'upstream/master' into validation
yosrym93 Jul 29, 2020
a592448
Nit: moved an include from a header file to a cc file
yosrym93 Aug 3, 2020
6f90f8e
Merge remote-tracking branch 'upstream/master' into validation
yosrym93 Aug 3, 2020
634f21e
Merge remote-tracking branch 'upstream/master' into validation
yosrym93 Aug 4, 2020
99f3aef
Merge remote-tracking branch 'upstream/master' into validation
yosrym93 Aug 4, 2020
ec27fb0
Fixed a minor logical error
yosrym93 Aug 4, 2020
a1330a4
cherry-pick from (#12481): tidy: use libstdc++
lizan Aug 4, 2020
daaf757
Merge remote-tracking branch 'upstream/master' into validation
yosrym93 Aug 5, 2020
eb7bc75
Revised the punctuation of the whole cache tree
yosrym93 Aug 5, 2020
fd2d65b
Merge remote-tracking branch 'upstream/master' into validation
yosrym93 Aug 5, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions source/common/http/headers.h
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,12 @@ class CustomHeaderValues {
const LowerCaseString ContentEncoding{"content-encoding"};
const LowerCaseString Etag{"etag"};
const LowerCaseString GrpcAcceptEncoding{"grpc-accept-encoding"};
const LowerCaseString IfMatch{"if-match"};
const LowerCaseString IfNoneMatch{"if-none-match"};
const LowerCaseString IfModifiedSince{"if-modified-since"};
const LowerCaseString IfUnmodifiedSince{"if-unmodified-since"};
const LowerCaseString IfRange{"if-range"};
const LowerCaseString LastModified{"last-modified"};
Comment on lines +67 to +72
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we register them as static inline headers instead of using get through out the implementation?

Copy link
Contributor Author

@yosrym93 yosrym93 Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could add a inline_headers_handles.h file to the cache tree that registers all the used headers in this module (using Http::RegisterCustomInlineHeader), to avoid registering them in every file that uses them. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I guess this won't work without making the handles static which is against the style guide.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also no setReferenceKeyInline to replace setReferenceKey(Http::CustomHeaders::get().IfModifiedSince, date).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, can you do it as a follow up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate further on what you are suggesting as a follow up?

const LowerCaseString Origin{"origin"};
const LowerCaseString OtSpanContext{"x-ot-span-context"};
const LowerCaseString Pragma{"pragma"};
Expand Down
2 changes: 2 additions & 0 deletions source/extensions/filters/http/cache/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -20,10 +20,12 @@ envoy_cc_library(
":cache_headers_utils_lib",
":cacheability_utils_lib",
":http_cache_lib",
"//source/common/common:enum_to_int",
"//source/common/common:logger_lib",
"//source/common/common:macros",
"//source/common/http:header_map_lib",
"//source/common/http:headers_lib",
"//source/common/http:utility_lib",
"//source/extensions/filters/http/common:pass_through_filter_lib",
"@envoy_api//envoy/extensions/filters/http/cache/v3alpha:pkg_cc_proto",
],
Expand Down
296 changes: 241 additions & 55 deletions source/extensions/filters/http/cache/cache_filter.cc

Large diffs are not rendered by default.

69 changes: 62 additions & 7 deletions source/extensions/filters/http/cache/cache_filter.h
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
#include <vector>

#include "envoy/extensions/filters/http/cache/v3alpha/cache.pb.h"
#include "envoy/http/header_map.h"

#include "common/common/logger.h"

Expand Down Expand Up @@ -38,19 +39,50 @@ class CacheFilter : public Http::PassThroughFilter,
Http::FilterDataStatus encodeData(Buffer::Instance& buffer, bool end_stream) override;

private:
// Utility functions: make any necessary checks and call the corresponding lookup_ functions.
void getBody();
void onHeaders(LookupResult&& result);
void getTrailers();

// Callbacks for HttpCache to call when headers/body/trailers are ready.
void onHeaders(LookupResult&& result, Http::RequestHeaderMap& request_headers);
void onBody(Buffer::InstancePtr&& body);
void onTrailers(Http::ResponseTrailerMapPtr&& trailers);

// Precondition: lookup_result_ points to a cache lookup result that requires validation.
// filter_state_ is ValidatingCachedResponse.
// Serves a validated cached response after updating it with a 304 response.
void processSuccessfulValidation(Http::ResponseHeaderMap& response_headers);

// Precondition: lookup_result_ points to a cache lookup result that requires validation.
// filter_state_ is ValidatingCachedResponse.
// Checks if a cached entry should be updated with a 304 response.
bool shouldUpdateCachedEntry(const Http::ResponseHeaderMap& response_headers) const;

// Precondition: lookup_result_ points to a cache lookup result that requires validation.
// Should only be called during onHeaders as it modifies RequestHeaderMap.
// Adds required conditional headers for cache validation to the request headers
// according to the present cache lookup result headers.
void injectValidationHeaders(Http::RequestHeaderMap& request_headers);

// Precondition: lookup_result_ points to a fresh or validated cache look up result.
// filter_state_ is ValidatingCachedResponse.
// Adds a cache lookup result to the response encoding stream.
// Can be called during decoding if a valid cache hit is found,
// or during encoding if a cache entry was validated successfully.
void encodeCachedResponse();

// Precondition: finished adding a response from cache to the response encoding stream.
// Updates filter_state_ and continues the encoding stream if necessary.
void finalizeEncodingCachedResponse();

TimeSource& time_source_;
HttpCache& cache_;
LookupContextPtr lookup_;
InsertContextPtr insert_;
LookupResultPtr lookup_result_;
yosrym93 marked this conversation as resolved.
Show resolved Hide resolved

// Tracks what body bytes still need to be read from the cache. This is
// currently only one Range, but will expand when full range support is added. Initialized by
// onOkHeaders.
// Tracks what body bytes still need to be read from the cache. This is currently only one Range,
// but will expand when full range support is added. Initialized by encodeCachedResponse.
std::vector<AdjustedByteRange> remaining_body_;

// True if the response has trailers.
Expand All @@ -61,9 +93,32 @@ class CacheFilter : public Http::PassThroughFilter,
// https://httpwg.org/specs/rfc7234.html#response.cacheability
bool request_allows_inserts_ = false;

// Used for coordinating between decodeHeaders and onHeaders.
enum class GetHeadersState { Initial, FinishedGetHeadersCall, GetHeadersResultUnusable };
GetHeadersState state_ = GetHeadersState::Initial;
enum class FilterState {
Initial,

// CacheFilter::decodeHeaders called lookup->getHeaders() but onHeaders was not called yet
// (lookup result not ready) -- the decoding stream should be stopped until the cache lookup
// result is ready.
WaitingForCacheLookup,

// CacheFilter::encodeHeaders called encodeCachedResponse() but encoding the cached response is
// not finished yet -- the encoding stream should be stopped until it is finished.
WaitingForCacheBody,

// Cache lookup did not find a cached response for this request.
NoCachedResponseFound,

// Cache lookup found a cached response that requires validation.
ValidatingCachedResponse,

// Cache lookup found a fresh cached response and it is being added to the encoding stream.
DecodeServingFromCache,

// The cached response was successfully added to the encoding stream (either during decoding or
// encoding).
ResponseServedFromCache
};
FilterState filter_state_ = FilterState::Initial;
};

} // namespace Cache
Expand Down
22 changes: 11 additions & 11 deletions source/extensions/filters/http/cache/cache_headers_utils.cc
Original file line number Diff line number Diff line change
Expand Up @@ -15,19 +15,19 @@ namespace Extensions {
namespace HttpFilters {
namespace Cache {

// Utility functions used in RequestCacheControl & ResponseCacheControl
// Utility functions used in RequestCacheControl & ResponseCacheControl.
namespace {
// A directive with an invalid duration is ignored, the RFC does not specify a behavior:
// https://httpwg.org/specs/rfc7234.html#delta-seconds
OptionalDuration parseDuration(absl::string_view s) {
OptionalDuration duration;
// Strip quotation marks if any
// Strip quotation marks if any.
if (s.size() > 1 && s.front() == '"' && s.back() == '"') {
s = s.substr(1, s.size() - 2);
}
long num;
if (absl::SimpleAtoi(s, &num) && num >= 0) {
// s is a valid string of digits representing a positive number
// s is a valid string of digits representing a positive number.
duration = std::chrono::seconds(num);
}
return duration;
Expand Down Expand Up @@ -87,12 +87,12 @@ ResponseCacheControl::ResponseCacheControl(absl::string_view cache_control_heade
std::tie(directive, argument) = separateDirectiveAndArgument(full_directive);

if (directive == "no-cache") {
// If no-cache directive has arguments they are ignored - not handled
// If no-cache directive has arguments they are ignored - not handled.
must_validate_ = true;
} else if (directive == "must-revalidate" || directive == "proxy-revalidate") {
no_stale_ = true;
} else if (directive == "no-store" || directive == "private") {
// If private directive has arguments they are ignored - not handled
// If private directive has arguments they are ignored - not handled.
no_store_ = true;
} else if (directive == "no-transform") {
no_transform_ = true;
Expand Down Expand Up @@ -151,12 +151,12 @@ SystemTime CacheHeadersUtils::httpTime(const Http::HeaderEntry* header_entry) {
absl::Time time;
const std::string input(header_entry->value().getStringView());

// Acceptable Date/Time Formats per
// Acceptable Date/Time Formats per:
// https://tools.ietf.org/html/rfc7231#section-7.1.1.1
//
// Sun, 06 Nov 1994 08:49:37 GMT ; IMF-fixdate
// Sunday, 06-Nov-94 08:49:37 GMT ; obsolete RFC 850 format
// Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format
// Sun, 06 Nov 1994 08:49:37 GMT ; IMF-fixdate.
// Sunday, 06-Nov-94 08:49:37 GMT ; obsolete RFC 850 format.
// Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format.
static const char* rfc7231_date_formats[] = {"%a, %d %b %Y %H:%M:%S GMT",
"%A, %d-%b-%y %H:%M:%S GMT", "%a %b %e %H:%M:%S %Y"};

Expand All @@ -178,15 +178,15 @@ absl::optional<uint64_t> CacheHeadersUtils::readAndRemoveLeadingDigits(absl::str
}
uint64_t new_val = (val * 10) + (cur - '0');
if (new_val / 8 < val) {
// Overflow occurred
// Overflow occurred.
return absl::nullopt;
}
val = new_val;
++bytes_consumed;
}

if (bytes_consumed) {
// Consume some digits
// Consume some digits.
str.remove_prefix(bytes_consumed);
return val;
}
Expand Down
30 changes: 26 additions & 4 deletions source/extensions/filters/http/cache/cacheability_utils.cc
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
#include "extensions/filters/http/cache/cacheability_utils.h"

#include "envoy/http/header_map.h"

#include "common/common/macros.h"
#include "common/common/utility.h"

Expand All @@ -14,10 +16,18 @@ const absl::flat_hash_set<absl::string_view>& cacheableStatusCodes() {
// https://tools.ietf.org/html/rfc7231#section-6.1,
// https://tools.ietf.org/html/rfc7538#section-3,
// https://tools.ietf.org/html/rfc7725#section-3
// TODO(yosrym93): the list of cacheable status codes should be configurable
// TODO(yosrym93): the list of cacheable status codes should be configurable.
CONSTRUCT_ON_FIRST_USE(absl::flat_hash_set<absl::string_view>, "200", "203", "204", "206", "300",
"301", "308", "404", "405", "410", "414", "451", "501");
}

const std::vector<const Http::LowerCaseString*>& conditionalHeaders() {
// As defined by: https://httpwg.org/specs/rfc7232.html#preconditions.
CONSTRUCT_ON_FIRST_USE(
std::vector<const Http::LowerCaseString*>, &Http::CustomHeaders::get().IfMatch,
&Http::CustomHeaders::get().IfNoneMatch, &Http::CustomHeaders::get().IfModifiedSince,
&Http::CustomHeaders::get().IfUnmodifiedSince, &Http::CustomHeaders::get().IfRange);
}
} // namespace

Http::RegisterCustomInlineHeader<Http::CustomInlineHeaderRegistry::Type::RequestHeaders>
Expand All @@ -29,8 +39,20 @@ bool CacheabilityUtils::isCacheableRequest(const Http::RequestHeaderMap& headers
const absl::string_view method = headers.getMethodValue();
const absl::string_view forwarded_proto = headers.getForwardedProtoValue();
const Http::HeaderValues& header_values = Http::Headers::get();

// Check if the request contains any conditional headers.
// For now, requests with conditional headers bypass the CacheFilter.
// This behavior does not cause any incorrect results, but may reduce the cache effectiveness.
// If needed to be handled properly refer to:
// https://httpwg.org/specs/rfc7234.html#validation.received
for (auto conditional_header : conditionalHeaders()) {
if (headers.get(*conditional_header)) {
return false;
}
}

// TODO(toddmgreer): Also serve HEAD requests from cache.
// TODO(toddmgreer): Check all the other cache-related headers.
// Cache-related headers are checked in HttpCache::LookupRequest.
return headers.Path() && headers.Host() && !headers.getInline(authorization_handle.handle()) &&
(method == header_values.MethodValues.Get) &&
(forwarded_proto == header_values.SchemeValues.Http ||
Expand All @@ -42,8 +64,8 @@ bool CacheabilityUtils::isCacheableResponse(const Http::ResponseHeaderMap& heade
ResponseCacheControl response_cache_control(cache_control);

// Only cache responses with explicit validation data, either:
// max-age or s-maxage cache-control directives with date header
// expires header
// max-age or s-maxage cache-control directives with date header.
// expires header.
const bool has_validation_data =
(headers.Date() && response_cache_control.max_age_.has_value()) ||
headers.get(Http::Headers::get().Expires);
Expand Down
12 changes: 6 additions & 6 deletions source/extensions/filters/http/cache/cacheability_utils.h
Original file line number Diff line number Diff line change
Expand Up @@ -11,17 +11,17 @@ namespace HttpFilters {
namespace Cache {
class CacheabilityUtils {
public:
// Checks if a request can be served from cache,
// this does not depend on cache-control headers as
// Checks if a request can be served from cache.
// This does not depend on cache-control headers as
// request cache-control headers only decide whether
// validation is required and whether the response can be cached
// validation is required and whether the response can be cached.
static bool isCacheableRequest(const Http::RequestHeaderMap& headers);

// Checks if a response can be stored in cache
// Checks if a response can be stored in cache.
// Note that if a request is not cacheable according to 'isCacheableRequest'
// then its response is also not cacheable
// then its response is also not cacheable.
// Therefore, isCacheableRequest, isCacheableResponse and CacheFilter::request_allows_inserts_
// together should cover https://httpwg.org/specs/rfc7234.html#response.cacheability
// together should cover https://httpwg.org/specs/rfc7234.html#response.cacheability.
static bool isCacheableResponse(const Http::ResponseHeaderMap& headers);
};
} // namespace Cache
Expand Down
28 changes: 14 additions & 14 deletions source/extensions/filters/http/cache/http_cache.cc
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,7 @@ LookupRequest::LookupRequest(const Http::RequestHeaderMap& request_headers, Syst
// TODO(toddmgreer): handle the resultant vector<AdjustedByteRange> in CacheFilter::onOkHeaders.
// Range Requests are only valid for GET requests
if (request_headers.getMethodValue() == Http::Headers::get().MethodValues.Get) {
// TODO(cbdm): using a constant limit of 10 ranges, could make this into a parameter
// TODO(cbdm): using a constant limit of 10 ranges, could make this into a parameter.
const int RangeSpecifierLimit = 10;
request_range_spec_ = RangeRequests::parseRanges(request_headers, RangeSpecifierLimit);
}
Expand All @@ -95,24 +95,24 @@ void LookupRequest::initializeRequestCacheControl(const Http::RequestHeaderMap&
if (!cache_control.empty()) {
request_cache_control_ = RequestCacheControl(cache_control);
} else {
// According to: https://httpwg.org/specs/rfc7234.html#header.pragma
// According to: https://httpwg.org/specs/rfc7234.html#header.pragma,
// when Cache-Control header is missing, "Pragma:no-cache" is equivalent to
// "Cache-Control:no-cache" Any other directives are ignored
// "Cache-Control:no-cache". Any other directives are ignored.
request_cache_control_.must_validate_ = RequestCacheControl(pragma).must_validate_;
}
}

bool LookupRequest::requiresValidation(const Http::ResponseHeaderMap& response_headers) const {
// TODO(yosrym93): Store parsed response cache-control in cache instead of parsing it on every
// lookup
// lookup.
const absl::string_view cache_control =
response_headers.getInlineValue(response_cache_control_handle.handle());
const ResponseCacheControl response_cache_control(cache_control);

const SystemTime response_time = CacheHeadersUtils::httpTime(response_headers.Date());

if (timestamp_ < response_time) {
// Response time is in the future, validate response
// Response time is in the future, validate response.
return true;
}

Expand All @@ -121,14 +121,14 @@ bool LookupRequest::requiresValidation(const Http::ResponseHeaderMap& response_h
request_cache_control_.max_age_.value() < response_age;
if (response_cache_control.must_validate_ || request_cache_control_.must_validate_ ||
request_max_age_exceeded) {
// Either the request or response explicitly require validation or a request max-age requirement
// is not satisfied
// Either the request or response explicitly require validation, or a request max-age
// requirement is not satisfied.
return true;
}

// CacheabilityUtils::isCacheableResponse(..) guarantees that any cached response satisfies this
// CacheabilityUtils::isCacheableResponse(..) guarantees that any cached response satisfies this.
// When date metadata injection for responses with no date
// is implemented, this ASSERT will need to be updated
// is implemented, this ASSERT will need to be updated.
ASSERT((response_headers.Date() && response_cache_control.max_age_.has_value()) ||
response_headers.get(Http::Headers::get().Expires),
"Cache entry does not have valid expiration data.");
Expand All @@ -139,15 +139,15 @@ bool LookupRequest::requiresValidation(const Http::ResponseHeaderMap& response_h
: CacheHeadersUtils::httpTime(response_headers.get(Http::Headers::get().Expires));

if (timestamp_ > expiration_time) {
// Response is stale, requires validation
// if the response does not allow being served stale
// or the request max-stale directive does not allow it
// Response is stale, requires validation if
// the response does not allow being served stale,
// or the request max-stale directive does not allow it.
const bool allowed_by_max_stale =
request_cache_control_.max_stale_.has_value() &&
request_cache_control_.max_stale_.value() > timestamp_ - expiration_time;
return response_cache_control.no_stale_ || !allowed_by_max_stale;
} else {
// Response is fresh, requires validation only if there is an unsatisfied min-fresh requirement
// Response is fresh, requires validation only if there is an unsatisfied min-fresh requirement.
const bool min_fresh_unsatisfied =
request_cache_control_.min_fresh_.has_value() &&
request_cache_control_.min_fresh_.value() > expiration_time - timestamp_;
Expand Down Expand Up @@ -190,7 +190,7 @@ bool adjustByteRangeSet(std::vector<AdjustedByteRange>& response_ranges,

for (const RawByteRange& spec : request_range_spec) {
if (spec.isSuffix()) {
// spec is a suffix-byte-range-spec
// spec is a suffix-byte-range-spec.
if (spec.suffixLength() == 0) {
// This range is unsatisfiable, so skip it.
continue;
Expand Down
5 changes: 3 additions & 2 deletions source/extensions/filters/http/cache/http_cache.h
Original file line number Diff line number Diff line change
Expand Up @@ -151,6 +151,7 @@ struct LookupResult {
// True if the cached response has trailers.
bool has_trailers_ = false;
};
using LookupResultPtr = std::unique_ptr<LookupResult>;

// Produces a hash of key that is consistent across restarts, architectures,
// builds, and configurations. Caches that store persistent entries based on a
Expand Down Expand Up @@ -305,8 +306,8 @@ class HttpCache {
//
// This is called when an expired cache entry is successfully validated, to
// update the cache entry.
virtual void updateHeaders(LookupContextPtr&& lookup_context,
Http::ResponseHeaderMapPtr&& response_headers) PURE;
virtual void updateHeaders(const LookupContext& lookup_context,
const Http::ResponseHeaderMap& response_headers) PURE;

// Returns statically known information about a cache.
virtual CacheInfo cacheInfo() const PURE;
Expand Down
Loading