-
Notifications
You must be signed in to change notification settings - Fork 5.1k
common: support parsing cookies into a map #17811
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
common: support parsing cookies into a map #17811
Conversation
Added a method to HTTP Utilities to parse out cookies into a map. Changed oauth2 extension to lookups from a cookie map to reduce redundant scanning of cookies. Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
Hi @theshubhamp, welcome and thank you for your contribution. We will try to review your Pull Request as quickly as possible. In the meantime, please take a look at the contribution guidelines if you have not done so already. |
expires_ = Http::Utility::parseCookieValue(headers, "OauthExpires"); | ||
token_ = Http::Utility::parseCookieValue(headers, "BearerToken"); | ||
hmac_ = Http::Utility::parseCookieValue(headers, "OauthHMAC"); | ||
const auto& cookies = Http::Utility::parseCookies(headers); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation parses all cookies into a map even when we just need a subset.
Should parseCookies(...)
be enhanced to accept a filter
to filter down on keys at the time of parsing ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if possible performance gain can balance increased code complexity. I'd rather keep it as it is for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that instantiates many small strings, I think this should pass a lambda like the following to forEachCookies:
if (k == "OauthExpires") {
expires_ = v;
}
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (k == "OauthExpires") { expires_ = v; } ...
This may require consumers of forEachCookies
to worry about duplicate cookie semantics discussed on this thread #17811 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this variant of parseCookies that accepts a key filter lambda ? It can support both static checks like these ^ OR something more dynamic like checking set / map membership.
absl::flat_hash_map<std::string, std::string>
Utility::parseCookies(const RequestHeaderMap& headers, const std::function<bool(absl::string_view)>& key_filter) {
absl::flat_hash_map<std::string, std::string> cookies;
forEachCookie(headers, Http::Headers::get().Cookie,
[&cookies, &key_filter](absl::string_view k, absl::string_view v) -> bool {
if (key_filter(k)) {
cookies.emplace(k, v);
}
// continue iterating until all cookies are processed.
return true;
});
return cookies;
}
An overload can be added that that defaults key_filter
to a lambda that always returns true to provide a parse all cookies
fallback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the latter sounds good to me if we care about uniform handling of duplicate cookies. Checking for set membership may be even faster than the static string comparisons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added. Its being used like this in the oauth2 filter now:
...
const auto& cookies = Http::Utility::parseCookies(headers, [](absl::string_view key) -> bool {
return key == "OauthExpires" || key == "BearerToken" || key == "OauthHMAC";
});
...
Kept static comparisons inside the predicate thinking that serial =
comparisons should be (a little ?) faster than using a set which requires hash computation before every check (if I am not mistaken)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Added some comments.
/wait
source/common/http/utility.cc
Outdated
// If the key matches, parse the value from the rest of the cookie string. | ||
if (k == key) { | ||
void forEachCookie(const HeaderMap& headers, const LowerCaseString& cookie_header, | ||
const std::function<bool (const absl::string_view&, const absl::string_view&)> cookie_consumer) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
absl::string_view
should be passed by value as the spec suggests. No need to add const
and &
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out! I have dropped const
and &
as recommended in the spec
source/common/http/utility.cc
Outdated
return value; | ||
} | ||
|
||
std::map<std::string, std::string> Utility::parseCookies(const RequestHeaderMap& headers) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better use absl::flat_hash_map
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switched to using absl::flat_hash_map
For my curiosity: Why should it be preferred over std::map
? I tried looking at the abseil docs and other sources but could not get a simple answer!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Algorithmic complexity of inserts and lookups is logarithmic for std::map
. For hashes it's basically O(1).
expires_ = Http::Utility::parseCookieValue(headers, "OauthExpires"); | ||
token_ = Http::Utility::parseCookieValue(headers, "BearerToken"); | ||
hmac_ = Http::Utility::parseCookieValue(headers, "OauthHMAC"); | ||
const auto& cookies = Http::Utility::parseCookies(headers); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if possible performance gain can balance increased code complexity. I'd rather keep it as it is for now.
const auto expires_it = cookies.find("OauthExpires"); | ||
expires_ = expires_it != cookies.end() ? expires_it->second : EMPTY_STRING; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace it with a function defined in the anonymous namespace and call the function also for token_
and hmac_
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, added std::string findValue(...)
to unpack the hash_map as we need it here.
Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
Following best practices mentioned on https://abseil.io/docs/cpp/guides/strings#string_view: string_view objects are very lightweight, so you should always pass them by value within your methods and functions; don’t pass a const absl::string_view &. (Passing absl::string_view instead of const absl::string_view & has the same algorithmic complexity, but because of register allocation and parameter passing rules, it is generally faster to pass by value in this case.) Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
source/common/http/utility.cc
Outdated
return std::string{result}; | ||
// Iterate over each cookie & return if its value is not empty. | ||
forEachCookie(headers, cookie, [&key, &value](absl::string_view k, absl::string_view v) -> bool { | ||
if (key == k && !v.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this new code implements different behavior and theoretically may break certain setups. Consider the case of
Cookie: a=; b=1; a=2
Cookie: a=3
The old parseCookie(headers, "a", "cookie")
would return 3
. The new code returns 2
if I'm not mistaken.
I think it makes sense to add a test for this case and make sure it passes for both versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, this new version is not equivalent to the old one. I'll try to make both the variants (single-cookie and map) compatible with the old behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a related note it appears that even the old behaviours haven't been consistent over time.
For example if we use this new test as a reference:
TEST(HttpUtility, TestParseCookieDuplicates) {
TestRequestHeaderMapImpl headers{
{"someheader", "10.0.0.1"},
{"cookie", "a=; b=1; a=2"},
{"cookie", "a=3; b=2"}};
EXPECT_EQ(Utility::parseCookieValue(headers, "a"), "3");
EXPECT_EQ(Utility::parseCookieValue(headers, "b"), "1");
}
It fails before #17560 landed (changed header iter. from reverse to natural order)
# git status
HEAD detached at b57827c20
# bazel test //test/common/http:utility_test
...
NFO: 12 processes: 1 internal, 11 darwin-sandbox.
INFO: Build completed, 1 test FAILED, 12 total actions
//test/common/http:utility_test FAILED in 2.5s
# less /private/var/tmp/_bazel_shubhamp/306aac4fa57fcfade8ca38e576399a40/execroot/envoy/bazel-out/darwin-fastbuild/testlogs/test/common/http/utility_test/test.log
...
[ RUN ] HttpUtility.TestParseCookieDuplicates
test/common/http/utility_test.cc:559: Failure
Expected equality of these values:
Utility::parseCookieValue(headers, "b")
Which is: "2"
"1"
Stack trace:
0x10b50f3aa: Envoy::Http::HttpUtility_TestParseCookieDuplicates_Test::TestBody()
0x1148c1fc4: testing::internal::HandleSehExceptionsInMethodIfSupported<>()
0x114897a2b: testing::internal::HandleExceptionsInMethodIfSupported<>()
0x114897963: testing::Test::Run()
0x114898977: testing::TestInfo::Run()
... Google Test internal frames ...
But passes on main:
# less less /private/var/tmp/_bazel_shubhamp/306aac4fa57fcfade8ca38e576399a40/execroot/envoy/bazel-out/darwin-fastbuild/testlogs/test/common/http/utility_test/test.log
...
[ RUN ] HttpUtility.TestParseCookieDuplicates
[ OK ] HttpUtility.TestParseCookieDuplicates (0 ms)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Behaviour on main (and the older version before it got changed) is a little inconsistent because it:
- picks the first matching cookie from a single
cookie
header (i.e. resolvesa
's value to an empty string if the header value looks like:"a=; b=1; a=2"
) - but exhaustively searches remaining cookie headers until a non empty value is found.
And this ^ makes parsing out duplicate cookies spilt between multiple cookie headers fun.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggestions on how to proceed here ?
- Keep the behaviour from
main
as-is - Pick the
first cookie value
ORfirst non-empty cookie value
- Pick the
last cookie value
ORlast non-empty cookie value
I'm inclined towards approaches 2 or 3 because it keeps the parsing simple in the long run - but this can break some setups.
RFC 6265 - HTTP State Management Mechanism itself is pretty ambiguous about duplicate cookies and ordering:
Although cookies are serialized linearly in the Cookie header,
servers SHOULD NOT rely upon the serialization order. In particular,
if the Cookie header contains two cookies with the same name (e.g.,
that were set with different Path or Domain attributes), servers
SHOULD NOT rely upon the order in which these cookies appear in the
header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm... That's a good catch! If the spec explicitly states that the order is unimportant then just pick the first value irrespective of its emptiness for the sake of perf.
A second reviewer may give a second opinion on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to take the first value for perf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, switched to picking up the first occurrence of a cookie!
Added tests so that this behavioural quirk has coverage.
/wait |
source/common/http/utility.cc
Outdated
return std::string{result}; | ||
// Iterate over each cookie & return if its value is not empty. | ||
forEachCookie(headers, cookie, [&key, &value](absl::string_view k, absl::string_view v) -> bool { | ||
if (key == k && !v.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to take the first value for perf.
source/common/http/utility.cc
Outdated
return v; | ||
|
||
bool continue_iteration = cookie_consumer(k, v); | ||
if (!continue_iteration) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!continue_iteration) { | |
if (!cookie_consumer(k, v)) { |
then you don't need continue_iteration
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, moved the invocation inside if condition itself
expires_ = Http::Utility::parseCookieValue(headers, "OauthExpires"); | ||
token_ = Http::Utility::parseCookieValue(headers, "BearerToken"); | ||
hmac_ = Http::Utility::parseCookieValue(headers, "OauthHMAC"); | ||
const auto& cookies = Http::Utility::parseCookies(headers); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that instantiates many small strings, I think this should pass a lambda like the following to forEachCookies:
if (k == "OauthExpires") {
expires_ = v;
}
...
Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
…in headers + Tests for coverage Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
…2 ext Signed-off-by: Shubham Patil <theshubhamp@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Looks good to me.
Added a method to HTTP Utilities to parse out cookies into a map.
Changed oauth2 extension to lookups from a cookie map to reduce
redundant scanning of cookies.
Requested as part of PR raised for
Risk Level: Low
Testing: Existing tests pass & added a new test for coverage.
Docs Changes: N/A
Release Notes: N/A
Requested as part of the PR #17721
Related to #17424
Signed-off-by: Shubham Patil theshubhamp@gmail.com