Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http parser non strict mode #2332

Merged
merged 7 commits into from
Oct 17, 2017
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ cov-dev: .develop
@echo "Run without extensions"
@AIOHTTP_NO_EXTENSIONS=1 py.test --cov=aiohttp tests
@py.test --cov=aiohttp --cov-report=term --cov-report=html --cov-append tests
@echo "open file://`pwd`/coverage/index.html"
@echo "open file://`pwd`/coverage/index.html"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!
I was wondering why I don't see the line in console output :)


cov-ci-no-ext: .develop
@echo "Run without extensions"
Expand Down
1 change: 1 addition & 0 deletions changes/2332.feature
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Build http-parser extension in non-strict mode by default.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we want non-strict mode by default.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, does this switch implies that we have to ship two wheels with strict and non strict parsers?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not?
We are strict on sending HTTP data but relaxed on receiving the data from foreign servers.
Basically we do the same in very many places.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According the sources, this check does a bit more than just allow non-ascii characters in URL. If I read correctly, users may send "GET / OLOLO/1.1" requests without worry about, servers may return malformed status codes, etc. I don't think we want to support this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GET / OLOLO/1.1 does not work — have just checked this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, then I read it wrong way. What exact it controls then? Can't found any documentation about.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After brief overview I've found the following behavior changes for non-strict mode.

It allows

  1. spaces, tabs and linefeeds in headers
  2. utf-8 in URLs
  3. underscore in hostname
  4. \9 (ht) and \12 (np) symbols.

Also looks like it has relaxed checks for \r\n (\r and \n are allowed).
"GET / OLOLO/1.1" if forbidden but "GET / HLOL/1.1" is allowed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asvetlov Thanks!

Are we ok with 1, 3 and 4?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think yes. It doesn't raise security issues IMHO

5 changes: 4 additions & 1 deletion setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,10 +17,13 @@

ext = '.pyx' if USE_CYTHON else '.c'


extensions = [Extension('aiohttp._websocket', ['aiohttp/_websocket' + ext]),
Extension('aiohttp._http_parser',
['aiohttp/_http_parser' + ext,
'vendor/http-parser/http_parser.c'],),
'vendor/http-parser/http_parser.c'],
define_macros=[('HTTP_PARSER_STRICT', 0)],
),
Extension('aiohttp._frozenlist',
['aiohttp/_frozenlist' + ext])]

Expand Down
13 changes: 13 additions & 0 deletions tests/test_http_parser.py
Original file line number Diff line number Diff line change
Expand Up @@ -631,6 +631,19 @@ def test_partial_url(parser):
assert payload.is_eof()


def test_url_parse_non_strict_mode(parser):
payload = 'GET /test/тест HTTP/1.1\r\n\r\n'.encode('utf-8')
messages, upgrade, tail = parser.feed_data(payload)
assert len(messages) == 1

msg, payload = messages[0]

assert msg.method == 'GET'
assert msg.path == '/test/тест'
assert msg.version == (1, 1)
assert payload.is_eof()


class TestParsePayload(unittest.TestCase):

def setUp(self):
Expand Down