Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

caching: Validate stale cache entries and update cached headers #12232

Merged
merged 26 commits into from
Aug 7, 2020

Conversation

yosrym93
Copy link
Contributor

Commit Message: CacheFilter now validates cache stale entries and updates the cached headers.

Additional Description:
When a stale cache hit is found for a request, conditional headers are injected to the request for validation. If the a 304 response is received (not modified), a response is constructed from cache, and cached headers are updated. If a request already includes conditional headers it bypasses the CacheFilter, this is valid behavior and satisfies the minimum of #9855 for production -- however, it can lead to missed caching opportunities.
Signed-off-by: Yosry Ahmed yosryahmed@google.com

Risk Level: Low
Testing: Integration tests
Docs Changes: N/A
Release Notes: N/A
Fixes #9976

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
…stale cache entries if needed

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
…rrectly, not tested yet

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
…that were discovered by tests, and minor refactoring to the CacheFilter

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
…r functions preconditions in the CacheFilter

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
@yosrym93 yosrym93 requested a review from jmarantz as a code owner July 22, 2020 20:32
@yosrym93
Copy link
Contributor Author

This PR is a little bit lacking in tests, but right now I am working on a separate PR to improve tests and coverage for the whole cache tree. The CacheFilter is still WIP.

Http::LowerCaseString key(std::string(cached_header.key().getStringView()));
absl::string_view value = cached_header.value().getStringView();
if (!response_headers->get(key)) {
response_headers->setCopy(key, value);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cached header key is copied twice here, once when creating Http::LowerCaseString key and once inside setCopy. I am open to suggestions to avoid this.

The get function accepts a LowerCaseString, so I cannot pass the cached_header.key() directly, and there is no setViaMove to move the key instead of copying it.

Perhaps I am wrong in any of this, looking forward to any suggestions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be as good as it can be, depending on the limitations you have. If you have to use setCopy, then you have to copy into the map. Any optimization would need the single write to be in there. But, you first need to check for the (lowercased) key before you do that, so you're stuck.

However, what's the lifetime of cached_header? Or more broadly lookup_result_->headers_? If it outlives response_headers, then you're free to use setReference().

(Also, although this is a wild tangent and not appropriate to address in this PR, I wonder if LowerCaseString could be changed to accept "I promise this is already lower". Since it's an attempt to never let non-lowercasing slip in, maybe that's not a good idea, but... it seems like you really could be sure that a header you previously cached would be well behaved.)

Copy link
Contributor Author

@yosrym93 yosrym93 Jul 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cached_header is a member of a member of the CacheFilter, so it only outlives the response if the filter does. However, I think it is dangerous to rely on that.
Another thought I had is to use addViaMove and pass to it cached_header.key() and cached_header.value() as the lookup_result_ will not be used any more. Two problems prevent that:

  • The only version of iterate I found is a constant version. So I cannot pass these parameters to cached_header.key().
  • I still have to create a LowerCaseString to check if the header already exists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part was moved to processSuccessfulValidation function.

[this](Http::ResponseTrailerMapPtr&& trailers) { onTrailers(std::move(trailers)); });
}
lookup_result_ = std::make_unique<LookupResult>(std::move(result));
encodeCachedResponse();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code removed from case CacheEntryStatus::Ok: is refactored into encodeCachedResponse() with some modifications so that it works for encoding cached responses in both decoding and encoding scenarios.

decoding scenario -> fresh cache hit found when decoding
encoding scenario -> stale cache hit found when decoding, successfully validated when encoding

} else if (response_has_trailers_) {
getTrailers();
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is refactored out of onHeaders, see the comment I left there.

…r of pointers as containers of consts caused an error on Windows

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
@zuercher
Copy link
Member

cc @toddmgreer (extension co-owner)

Copy link
Contributor

@fredlas fredlas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have almost no experience with this code, so mostly just structural. What I can see/understand looks good, though!

source/extensions/filters/http/cache/cache_filter.cc Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.cc Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.cc Outdated Show resolved Hide resolved
Http::LowerCaseString key(std::string(cached_header.key().getStringView()));
absl::string_view value = cached_header.value().getStringView();
if (!response_headers->get(key)) {
response_headers->setCopy(key, value);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be as good as it can be, depending on the limitations you have. If you have to use setCopy, then you have to copy into the map. Any optimization would need the single write to be in there. But, you first need to check for the (lowercased) key before you do that, so you're stuck.

However, what's the lifetime of cached_header? Or more broadly lookup_result_->headers_? If it outlives response_headers, then you're free to use setReference().

(Also, although this is a wild tangent and not appropriate to address in this PR, I wonder if LowerCaseString could be changed to accept "I promise this is already lower". Since it's an attempt to never let non-lowercasing slip in, maybe that's not a good idea, but... it seems like you really could be sure that a header you previously cached would be well behaved.)

source/extensions/filters/http/cache/cache_filter.cc Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.cc Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.cc Outdated Show resolved Hide resolved
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
…ed is invalid or non-existent

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
@yosrym93
Copy link
Contributor Author

@toddmgreer I think this is ready for you to take a look. Josiah and Fred took passes so hopefully this shouldn't take too long.

source/extensions/filters/http/cache/cacheability_utils.cc Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.h Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.h Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.cc Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.cc Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.cc Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.h Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.cc Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.cc Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.cc Outdated Show resolved Hide resolved
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
…nd cleaner

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
@yosrym93
Copy link
Contributor Author

@toddmgreer I managed to rewrite the state management in the filter in a simpler and cleaner way. Take a look whenever you can!

source/extensions/filters/http/cache/cache_filter.h Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.h Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.h Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.h Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.cc Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.cc Outdated Show resolved Hide resolved
source/common/http/headers.h Outdated Show resolved Hide resolved
source/extensions/filters/http/cache/cache_filter.cc Outdated Show resolved Hide resolved
@yosrym93
Copy link
Contributor Author

yosrym93 commented Aug 4, 2020

@dio I merged master again, hopefully this will fix it

@yosrym93
Copy link
Contributor Author

yosrym93 commented Aug 4, 2020

The clang tidy error still exists. All the checks used to pass before merging master and the nit commit (moving the include). I don't know what might be the problem.

@dio
Copy link
Member

dio commented Aug 4, 2020

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@dio
Copy link
Member

dio commented Aug 4, 2020

@yosrym93, sorry, seems like you need to have #12449 as well. Currently, yours is still: https://github.com/envoyproxy/envoy/blob/634f21e6b3d551a258d669283fa8a94a70681574/ci/run_clang_tidy.sh (#12426).

@yanavlasov
Copy link
Contributor

Hello, sorry for the delay, I was pulled into working on customer case.
I've noticed that integration test only tests responses with 304 and 200 statuses. How are responses with other statuses (i.e. 404) tested?

@yosrym93
Copy link
Contributor Author

yosrym93 commented Aug 4, 2020

Hello, sorry for the delay, I was pulled into working on customer case.
I've noticed that integration test only tests responses with 304 and 200 statuses. How are responses with other statuses (i.e. 404) tested?

According to: https://tools.ietf.org/html/rfc7234#section-4.3.3

304 responses (not modified) require special handling as they mean we should update and serve the cached response that is being validated.
Other responses (whether successful 2xx or errors 4xx) are forwarded to the client.

@yosrym93
Copy link
Contributor Author

yosrym93 commented Aug 4, 2020

@dio I believe #12449 was already merged, but I merged master anyway just to make sure.
Edit: run_clang_tidy.sh was not changed in #12449.

@yosrym93
Copy link
Contributor Author

yosrym93 commented Aug 4, 2020

Still no luck with the clang-tidy problem, @dio thoughts?

yosrym93 and others added 3 commits August 4, 2020 19:58
Copy link
Member

@dio dio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yosrym93 thank you for your patience! Since you need to merge main, I'd like to take this opportunity to suggest some nitpiciking.

Furthermore, I think @yanavlasov's comment is worth to be addressed by adding comments near to the relevant lines?

Thanks!

source/extensions/filters/http/cache/cache_filter.h Outdated Show resolved Hide resolved
@yosrym93
Copy link
Contributor Author

yosrym93 commented Aug 5, 2020

@dio done!

@yosrym93
Copy link
Contributor Author

yosrym93 commented Aug 5, 2020

Do I need to do anything about the cherry-picked fix now?

Copy link
Member

@dio dio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome. Thanks.

@dio
Copy link
Member

dio commented Aug 5, 2020

@lizan do you want to sign this as a senior maintainer? Thanks!

@yosrym93
Copy link
Contributor Author

yosrym93 commented Aug 5, 2020

@lizan is there anything this PR still needs to be merged?

Comment on lines +67 to +72
const LowerCaseString IfMatch{"if-match"};
const LowerCaseString IfNoneMatch{"if-none-match"};
const LowerCaseString IfModifiedSince{"if-modified-since"};
const LowerCaseString IfUnmodifiedSince{"if-unmodified-since"};
const LowerCaseString IfRange{"if-range"};
const LowerCaseString LastModified{"last-modified"};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we register them as static inline headers instead of using get through out the implementation?

Copy link
Contributor Author

@yosrym93 yosrym93 Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could add a inline_headers_handles.h file to the cache tree that registers all the used headers in this module (using Http::RegisterCustomInlineHeader), to avoid registering them in every file that uses them. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I guess this won't work without making the handles static which is against the style guide.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also no setReferenceKeyInline to replace setReferenceKey(Http::CustomHeaders::get().IfModifiedSince, date).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, can you do it as a follow up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate further on what you are suggesting as a follow up?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CacheFilter: Validate stale cache entries and update cached headers