Description
Bug report
Bug description:
First, let's start a Flask app:
from flask import Flask, make_response
app = Flask(__name__)
@app.route('/')
def set_cookie():
    response = make_response("Cookie has been set!")
    response.set_cookie(
        'foo',
        value='bar',
    )
    return response
if __name__ == '__main__':
    app.run()
This web app sets a cookie foo=bar. Then we use http.cookiejar to process it:
import urllib.request
from http.cookiejar import CookieJar, DefaultCookiePolicy
policy = DefaultCookiePolicy(blocked_domains=['']) # no blockers
cj = CookieJar(policy)
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
r = opener.open("http://127.0.0.1:5000")
print(r.read().decode()) # print the response body
for item in cj:
    print('Name = %s' % item.name)
    print('Value = %s' % item.value)
# this should print:
'''
Cookie has been set!
Name = foo
Value = bar
'''
blocked_policy = DefaultCookiePolicy(blocked_domains=["127.0.0.1"]) # block cookies
cj = CookieJar(blocked_policy)
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
r = opener.open("http://127.0.0.1:5000")
print(r.read().decode()) # print the response body
for item in cj:
    print('Name = %s' % item.name)
    print('Value = %s' % item.value)
# this should print (the cookie is blocked):
'''
Cookie has been set!
'''
Everything works as expected so far. But if we serve the Flask app on an IPv6 host:
from flask import Flask, make_response
app = Flask(__name__)
@app.route('/')
def set_cookie():
    response = make_response("Cookie has been set!")
    response.set_cookie(
        'foo',
        value='bar',
    )
    return response
if __name__ == '__main__':
    app.run(host='::1')
Then we use http.cookiejar to process it:
import urllib.request
from http.cookiejar import CookieJar, DefaultCookiePolicy
policy = DefaultCookiePolicy(blocked_domains=['']) # no blockers
cj = CookieJar(policy)
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
r = opener.open("http://[::1]:5000")
print(r.read().decode()) # print the response body
for item in cj:
    print('Name = %s' % item.name)
    print('Value = %s' % item.value)
# this should print:
'''
Cookie has been set!
Name = foo
Value = bar
'''
blocked_policy = DefaultCookiePolicy(blocked_domains=["[::1]"]) # block cookies
cj = CookieJar(blocked_policy)
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
r = opener.open("http://[::1]:5000")
print(r.read().decode()) # print the response body
for item in cj:
    print('Name = %s' % item.name)
    print('Value = %s' % item.value)
# expected: only the response body, but the actual output is:
'''
Cookie has been set!
Name = foo
Value = bar
'''
blocked_policy = DefaultCookiePolicy(blocked_domains=["::1"]) # block cookies
cj = CookieJar(blocked_policy)
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
r = opener.open("http://[::1]:5000")
print(r.read().decode()) # print the response body
for item in cj:
    print('Name = %s' % item.name)
    print('Value = %s' % item.value)
# expected: only the response body, but the actual output is:
'''
Cookie has been set!
Name = foo
Value = bar
'''
NO COOKIES ARE BLOCKED.
I've found the problem in http.cookiejar.DefaultCookiePolicy.is_blocked:
def is_blocked(self, domain):
    for blocked_domain in self._blocked_domains:
        if user_domain_match(domain, blocked_domain):
            return True
    return False
It uses user_domain_match, shown below:
def user_domain_match(A, B):
    """For blocking/accepting domains.
    A and B may be host domain names or IP addresses.
    """
    A = A.lower()
    B = B.lower()
    if not (liberal_is_HDN(A) and liberal_is_HDN(B)):
        if A == B:
            # equal IP addresses
            return True
        return False
    initial_dot = B.startswith(".")
    if initial_dot and A.endswith(B):
        return True
    if not initial_dot and A == B:
        return True
    return False
It seems we use liberal_is_HDN to check whether A and B are HDNs or IP addresses. The function is shown below:
def liberal_is_HDN(text):
    """Return True if text is a sort-of-like a host domain name.
    For accepting/blocking domains.
    """
    if IPV4_RE.search(text):
        return False
    return True
And the IPV4_RE regex is:
IPV4_RE = re.compile(r"\.\d+$", re.ASCII)
Since the code only checks for IPv4, an IPv6 address is always treated as an HDN, which is wrong. user_domain_match therefore goes down the HDN path, and because the blocked entry has no initial dot and is never equal to A, it always returns False.
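The misclassification can be checked in isolation. The sketch below copies IPV4_RE and liberal_is_HDN from the excerpts above into a standalone script, rather than importing the private helpers from http.cookiejar, whose behaviour may differ between versions. The sample host strings are my own choice for illustration.

```python
import re

# Copied from the http.cookiejar excerpts quoted in this report.
IPV4_RE = re.compile(r"\.\d+$", re.ASCII)


def liberal_is_HDN(text):
    """Return True if text is treated as a host domain name."""
    if IPV4_RE.search(text):
        return False
    return True


# Illustrative inputs (assumed for this sketch, not taken from the stdlib tests).
samples = ["127.0.0.1", "example.com", "[::1]", "::1", "[::1].local"]
results = {host: liberal_is_HDN(host) for host in samples}

for host, is_hdn in results.items():
    print(f"{host!r:16} treated as HDN: {is_hdn}")
```

Only the IPv4 address is recognised as an IP address; every IPv6 spelling falls through to the HDN case.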
Besides blocked_domains there is also allowed_domains, which uses the same logic, so user_domain_match always returns False for IPv6 hosts there too.
Why does it return False? Because the IPv6 address is treated as an abnormal HDN, .local is appended to it. So when it reaches user_domain_match, A is [::1].local and B is [::1].
That is, every IPv6 address is allowed by DefaultCookiePolicy no matter what blocked_domains is set to.
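Here is a minimal sketch of that mismatch, again using local copies of the quoted functions. The helper effective_host is hypothetical: it is a simplified stand-in for the step that appends .local to a host without a dot, and is an assumption of this sketch rather than the stdlib function itself.

```python
import re

IPV4_RE = re.compile(r"\.\d+$", re.ASCII)


def liberal_is_HDN(text):
    return not IPV4_RE.search(text)


def user_domain_match(A, B):
    # Same logic as the function quoted earlier in this report.
    A = A.lower()
    B = B.lower()
    if not (liberal_is_HDN(A) and liberal_is_HDN(B)):
        return A == B  # IP addresses must be equal
    initial_dot = B.startswith(".")
    if initial_dot and A.endswith(B):
        return True
    if not initial_dot and A == B:
        return True
    return False


def effective_host(host):
    # Hypothetical helper: append ".local" to a host that has no dot and
    # does not look like an IPv4 address.
    if "." not in host and not IPV4_RE.search(host):
        return host + ".local"
    return host


ipv4 = effective_host("127.0.0.1")  # left unchanged
ipv6 = effective_host("[::1]")      # gets ".local" appended

print(ipv4, user_domain_match(ipv4, "127.0.0.1"))
print(ipv6, user_domain_match(ipv6, "[::1]"))
print(ipv6, user_domain_match(ipv6, "::1"))
```

The IPv4 host matches its blocked_domains entry, while the IPv6 host matches neither spelling of the entry.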
Since DefaultCookiePolicy is mainly about privacy, this lets a block list be bypassed, so I think this is quite serious.
This issue was previously discussed in:
#135500
https://discuss.python.org/t/support-ipv6-in-http-cookiejar-when-deciding-whether-a-string-is-a-hdn-or-a-ip-addr/95439
My previous solution is in #135502, which uses ipaddress.ip_address() to identify IP addresses. That is NOT good enough, because the IPv6 address is wrapped in []. I am writing the test script and completing the PR now.
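To illustrate the bracket problem, here is a rough sketch of one way to recognise a bracketed IPv6 literal: strip the surrounding [] before handing the text to ipaddress.ip_address(). The helper name is_ipv6_literal is hypothetical, and this is not the code from either PR, only an illustration of the idea.

```python
import ipaddress


def is_ipv6_literal(text):
    """Return True if text is an IPv6 address, with or without [] around it.

    Hypothetical helper for illustration; not part of http.cookiejar.
    """
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]
    try:
        return ipaddress.ip_address(text).version == 6
    except ValueError:
        return False


# ip_address() rejects the bracketed form used in URLs ...
try:
    ipaddress.ip_address("[::1]")
    bracketed_accepted = True
except ValueError:
    bracketed_accepted = False

print("ip_address('[::1]') accepted:", bracketed_accepted)

# ... while the helper handles both spellings.
for candidate in ["[::1]", "::1", "127.0.0.1", "example.com", "[::1].local"]:
    print(candidate, is_ipv6_literal(candidate))
```

Note that a check like this still would not recognise [::1].local, so the .local suffix would also have to be avoided for IPv6 hosts.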
@ericvsmith Thanks!
CPython versions tested on:
3.14
Operating systems tested on:
Windows