-
-
Notifications
You must be signed in to change notification settings - Fork 31.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-30458: Disallow control chars in http URLs. #12755
Conversation
Example possible fix for those issues.
diff --git a/Lib/test/test_urllib.py b/Lib/test/test_urllib.py
index 2ac73b58d8..99bd934765 100644
--- a/Lib/test/test_urllib.py
+++ b/Lib/test/test_urllib.py
@@ -329,6 +329,14 @@ class urlopen_HttpTests(unittest.TestCase, FakeHTTPMixin, FakeFTPMixin):
finally:
self.unfakehttp()
+ def test_url_newline(self):
+ self.fakehttp(b"HTTP/1.1 200 OK\r\n\r\nHello!")
+ try:
+ with self.assertRaises(ValueError):
+ resp = urlopen("http://127.0.0.1:1234/?q=HTTP/1.1\r\nHeader: Value")
+ finally:
+ self.unfakehttp()
+
def test_read_0_9(self):
# "0.9" response accepted (but not "simple responses" without
# a status line)
@@ -1512,6 +1520,11 @@ class RequestTests(unittest.TestCase):
request.method = 'HEAD'
self.assertEqual(request.get_method(), 'HEAD')
+ def test_url_newline(self):
+ Request = urllib.request.Request
+ with self.assertRaises(ValueError):
+ resp = Request("http://127.0.0.1:1234/?q=HTTP/1.1\r\nHeader: Value")
+
class URL2PathNameTests(unittest.TestCase):
diff --git a/Lib/urllib/parse.py b/Lib/urllib/parse.py
index 8b6c9b1060..20242eee2d 100644
--- a/Lib/urllib/parse.py
+++ b/Lib/urllib/parse.py
@@ -997,6 +997,11 @@ def _splittype(url):
if _typeprog is None:
_typeprog = re.compile('([^/:]+):(.*)', re.DOTALL)
+ unquoted_url = unquote(url)
+ if (unquoted_url.startswith('http') and
+ re.search("[\x00-\x19\x7f-\x9f]", unquoted_url)):
+ raise ValueError("URL can't contain control characters. %r" % (unquoted_url,))
+
match = _typeprog.match(url)
if match:
scheme, data = match.groups() |
The initial reverted fix in golang was reverted as noted in golang/go#27302 (comment) because it took into account unicode characters in URL also as problematic CTL/invalid characters : golang/go@1040626#diff-6c2d018290e298803c0c9419d8739885R1136 and resorted to only using ASCII CTL characters. I assume this could be the internal test failures at Google where unicode values were present in URL. It's not a problem with Python because urllib already tries to decode using ASCII and therefore throws a
Line 971 in 750d74f
|
Sorry, this has been sending me circles from morning. Checking out the latest golang source this was further relaxed golang/go@f1d662f#diff-130fa30904f41d0d113c732265e91542R63 to use check only for space and 0x7f instead of anything between First fix golang/go@829c5df#diff-130fa30904f41d0d113c732265e91542 Latest fix reverting the above golang/go@f1d662f#diff-130fa30904f41d0d113c732265e91542R63 func stringContainsCTLByte(s string) bool {
for i := 0; i < len(s); i++ {
b := s[i]
if b < ' ' || b == 0x7f {
return true
}
}
return false
} Thanks, this seems to have been changed with 448a541 in this PR |
See also an older thread from Victor on fixing similar issues : https://mail.python.org/pipermail/python-dev/2017-July/148699.html |
Yep. I'm trying to make this fix as surgical as possible to only prevent the actual problem of allowing invalid paths with whitespace to pass through to the protocol level without changing public APIs of other functions such as parse.quote or (deprecated) parse.splittype and such. |
This seems to cause issue with
|
Actual test :
So
|
I would love to get some feedback on this. |
Disallow control chars in http URLs in urllib.urlopen. This addresses a potential security problem for applications that do not sanity check their URLs where http request headers could be injected. Disable https related urllib tests on a build without ssl (pythonGH-13032) These tests require an SSL enabled build. Skip these tests when python is built without SSL to fix test failures. Use http.client.InvalidURL instead of ValueError as the new error case's exception. (pythonGH-13044) Co-Authored-By: Miro Hrončok <miro@hroncok.cz>
GH-13154 is a backport of this pull request to the 3.7 branch. |
GH-13155 is a backport of this pull request to the 3.6 branch. |
Disallow control chars in http URLs in urllib.urlopen. This addresses a potential security problem for applications that do not sanity check their URLs where http request headers could be injected. Disable https related urllib tests on a build without ssl (pythonGH-13032) These tests require an SSL enabled build. Skip these tests when python is built without SSL to fix test failures. Use http.client.InvalidURL instead of ValueError as the new error case's exception. (pythonGH-13044) Co-Authored-By: Miro Hrončok <miro@hroncok.cz>
Disallow control chars in http URLs in urllib.urlopen. This addresses a potential security problem for applications that do not sanity check their URLs where http request headers could be injected. Disable https related urllib tests on a build without ssl (GH-13032) These tests require an SSL enabled build. Skip these tests when python is built without SSL to fix test failures. Use http.client.InvalidURL instead of ValueError as the new error case's exception. (GH-13044) Backport Co-Authored-By: Miro Hrončok <miro@hroncok.cz>
Disallow control chars in http URLs in urllib.urlopen. This addresses a potential security problem for applications that do not sanity check their URLs where http request headers could be injected. Disable https related urllib tests on a build without ssl (GH-13032) These tests require an SSL enabled build. Skip these tests when python is built without SSL to fix test failures. Use http.client.InvalidURL instead of ValueError as the new error case's exception. (GH-13044) Co-Authored-By: Miro Hrončok <miro@hroncok.cz>
|
|
|
|
|
|
|
|
|
|
|
|
Disallow control chars in http URLs in urllib.urlopen. This addresses a potential security problem for applications that do not sanity check their URLs where http request headers could be injected. Disable https related urllib tests on a build without ssl (pythonGH-13032) These tests require an SSL enabled build. Skip these tests when python is built without SSL to fix test failures. Use http.client.InvalidURL instead of ValueError as the new error case's exception. (pythonGH-13044) Co-Authored-By: Miro Hrončok <miro@hroncok.cz>
…H-13315) Disallow control chars in http URLs in urllib2.urlopen. This addresses a potential security problem for applications that do not sanity check their URLs where http request headers could be injected. Disable https related urllib tests on a build without ssl (GH-13032) These tests require an SSL enabled build. Skip these tests when python is built without SSL to fix test failures. Use httplib.InvalidURL instead of ValueError as the new error case's exception. (GH-13044) Backport Co-Authored-By: Miro Hrončok <miro@hroncok.cz> (cherry picked from commit 7e200e0) Notes on backport to Python 2.7: * test_urllib tests urllib.urlopen() which quotes the URL and so is not vulerable to HTTP Header Injection. * Add tests to test_urllib2 on urllib2.urlopen(). * Reject non-ASCII characters: range 0x80-0xff.
Disallow control chars in http URLs in urllib.urlopen. This addresses a potential security problem for applications that do not sanity check their URLs where http request headers could be injected. Disable https related urllib tests on a build without ssl (GH-13032) These tests require an SSL enabled build. Skip these tests when python is built without SSL to fix test failures. Use http.client.InvalidURL instead of ValueError as the new error case's exception. (GH-13044) Co-Authored-By: Miro Hrončok <miro@hroncok.cz>
…honGH-13154) Disallow control chars in http URLs in urllib.urlopen. This addresses a potential security problem for applications that do not sanity check their URLs where http request headers could be injected. Disable https related urllib tests on a build without ssl (pythonGH-13032) These tests require an SSL enabled build. Skip these tests when python is built without SSL to fix test failures. Use http.client.InvalidURL instead of ValueError as the new error case's exception. (pythonGH-13044) Backport Co-Authored-By: Miro Hrončok <miro@hroncok.cz>
They triggered CVE-2019-9740 checks added in python here [0]. The problematic test should fail because of invalid source address but it failed earlier because of invalid request URL. Request URLs contained string representation of the tested source address which can contain whitespaces. E.g. "/source_address?(\'192.0.2.255\', 0)" The source addresses seem to be there only for information and were added as part of [1]. Removing them from the request URL makes the tests pass again. [0] python/cpython#12755 [1] urllib3#703
bpo-36276 and bpo-30458:
disallow control characters such as whitespace in urls put into the raw http protocol within http.client.
https://bugs.python.org/issue36276
https://bugs.python.org/issue30458