-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
http parser non strict mode #2332
Conversation
Do you want to hop on the bus at last second? |
Well, I just don't know where in docs should I put a note on this or a news in changes is enough?.. |
Hmm, strict mode requires compilation from sources, isn't it? |
True |
… to control this mode
9e2b7e7
to
50fa505
Compare
Dropped unneeded control flags. |
@@ -50,7 +50,7 @@ cov-dev: .develop | |||
@echo "Run without extensions" | |||
@AIOHTTP_NO_EXTENSIONS=1 py.test --cov=aiohttp tests | |||
@py.test --cov=aiohttp --cov-report=term --cov-report=html --cov-append tests | |||
@echo "open file://`pwd`/coverage/index.html" | |||
@echo "open file://`pwd`/coverage/index.html" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
I was wondering why I don't see the line in console output :)
@@ -0,0 +1 @@ | |||
Build http-parser extension in non-strict mode by default. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we want non-strict mode by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, does this switch implies that we have to ship two wheels with strict and non strict parsers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not?
We are strict on sending HTTP data but relaxed on receiving the data from foreign servers.
Basically we do the same in very many places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According the sources, this check does a bit more than just allow non-ascii characters in URL. If I read correctly, users may send "GET / OLOLO/1.1" requests without worry about, servers may return malformed status codes, etc. I don't think we want to support this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GET / OLOLO/1.1
does not work — have just checked this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, then I read it wrong way. What exact it controls then? Can't found any documentation about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After brief overview I've found the following behavior changes for non-strict mode.
It allows
- spaces, tabs and linefeeds in headers
- utf-8 in URLs
- underscore in hostname
\9
(ht) and\12
(np) symbols.
Also looks like it has relaxed checks for \r\n
(\r
and \n
are allowed).
"GET / OLOLO/1.1" if forbidden but "GET / HLOL/1.1" is allowed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@asvetlov Thanks!
Are we ok with 1, 3 and 4?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think yes. It doesn't raise security issues IMHO
What's the behavior of pure Python parser btw? |
IMHO it works in relaxed mode. |
pure-python parser works in relaxed mode |
50fa505
to
337bc01
Compare
Update the test to use all parsers |
Codecov Report
@@ Coverage Diff @@
## master #2332 +/- ##
=======================================
Coverage 97.23% 97.23%
=======================================
Files 39 39
Lines 8224 8224
Branches 1442 1442
=======================================
Hits 7997 7997
Misses 98 98
Partials 129 129 Continue to review full report at Codecov.
|
@fafhrd91 what is your opinion? Is it ok to ship C parser in relaxed mode? |
I think it is fine |
Thanks @popravich |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a [new issue] for related bugs. |
What do these changes do?
Build http-parser in non-strict mode.
By default http-parser was built in strict mode.
This was causing requests having non-ascii URLs in request line to be
rejected with
InvalidURLErrors
.Although it is considered to be invalid HTTP request most servers handle such requests.
So it is decided to build http-parser in non-strict mode by default.
(more discussion here on gitter)
This PR allows to control http-parse mode by forwarding
HTTP_PARSER_STRICT
environmentvariable to extension compiler.
Checklist
changes
folder