-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix parsing the Forwarded header #2173
Conversation
This fixes #2170 by parsing Forwarded more carefully, while staying about as fast, simple, and robust in the face of potential injection attacks. (Speed was measured with IPython's %timeit on my laptop on a few typical and pathological header values.) In particular: - commas and semicolons are allowed inside quoted-strings; - empty forwarded-pairs (as in "for=_1;;by=_2") are allowed; - non-standard parameters are allowed (although this alone could be easily done in the previous parser). This still doesn't parse valid headers containing obs-text, which was an intentional decision in the previous parser (see comments) that I did not change. Also, the previous parser used to bail out of forwarded-elements containing duplicate parameter names. No rationale was given in the code, and I don't think this is important, so the new parser doesn't enforce this.
r'\1', value[1:-1]) | ||
fvparams[param.lower()] = value | ||
elems = [] | ||
for field_value in self._message.headers.getall(hdrs.FORWARDED, ()): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fafhrd91 @asvetlov
It is easily possible to build FSM which will parse this structure using regex
module. Unfortunatelly built-in regexps are not able to capture all repeating matches.
Pros:
- Less code
- Less error-prone
- Faster
- Ability to parse other structures using higly-efficient FSMs
Cons:
- Dependency on regex module
- More difficult to understand (unsure, maybe more simplier)
- Non-pythonic way.
(Anyway, maybe merge as shown here, but re-write later, I can do that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@socketpair please merge the PR first than feel free to create a new one for regexps based approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@asvetlov how can I sure that other members agree with my decision to merge ? (sorry, I'm asking BEFORE making something awful)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@socketpair I’m not sure an FSM can handle my test_single_forwarded_header_injection1
case, which necessarily involves a form of backtracking. And I think this case is important to handle, at least until deployment experience shows that all reverse proxies take precautions against that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@socketpair the trick borrowed from CPython works: declare intention to merge (LGTM or github pull request approval), wait a day and do merge tomorrow if nobody objects.
r'\1', value[1:-1]) | ||
fvparams[param.lower()] = value | ||
elems = [] | ||
for field_value in self._message.headers.getall(hdrs.FORWARDED, ()): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@socketpair please merge the PR first than feel free to create a new one for regexps based approach.
Codecov Report
@@ Coverage Diff @@
## master #2173 +/- ##
==========================================
+ Coverage 97.11% 97.14% +0.02%
==========================================
Files 39 39
Lines 7878 7889 +11
Branches 1366 1367 +1
==========================================
+ Hits 7651 7664 +13
+ Misses 101 100 -1
+ Partials 126 125 -1
Continue to review full report at Codecov.
|
Thanks |
What do these changes do?
Parse the
Forwarded
header more carefully, while staying about as fast, simple, and robust in the face of potential injection attacks. (Speed was measured with IPython’s%timeit
on my laptop on a few typical and pathological header values.)Are there changes in behavior for the user?
Most valid
Forwarded
headers that used to be parsed incorrectly by aiohttp are now parsed correctly. In particular:for=_1;;by=_2
) are allowed;Valid headers containing obs-text are still not parsed correctly, as this was an intentional decision in the previous parser (see comments) that I did not change.
Also, the previous parser used to bail out of (invalid) forwarded-elements containing duplicate parameter names. No rationale was given in the code, and I don’t think this is important, so the new parser doesn't enforce this.
Related issue number
#2170
Checklist
CONTRIBUTORS.txt
changes
folder<issue_id>.<type>
for example (588.bug)issue_id
change it to the pr id after creating the pr.feature
: Signifying a new feature..bugfix
: Signifying a bug fix..doc
: Signifying a documentation improvement..removal
: Signifying a deprecation or removal of public API..misc
: A ticket has been closed, but it is not of interest to users.