Description
I have reason to believe that `basic_auth_header` is wrong in using `urlsafe_b64encode` (which replaces `+/` with `-_`) instead of `b64encode`.
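To make the difference concrete, here is a small sketch using only Python's standard `base64` module. The input bytes are arbitrary, chosen only because they map to the two symbols where the alphabets differ.

```python
import base64

# Two arbitrary bytes whose 6-bit groups land on values 62 and 63,
# the only two positions where the alphabets differ.
raw = b"\xfb\xff"

standard = base64.b64encode(raw)         # regular alphabet (RFC 4648, section 4)
urlsafe = base64.urlsafe_b64encode(raw)  # URL-safe alphabet (RFC 4648, section 5)

print(standard)  # b'+/8='
print(urlsafe)   # b'-_8='
```

A server that decodes the header with the regular alphabet would not recover the original bytes from the second form.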
According to Wikipedia, the first specification of HTTP basic auth is HTTP 1.0, which does not mention any special flavor of base64 and points to RFC-1521 for the definition of base64; RFC-1521 describes regular base64. The latest HTTP basic auth specification, again according to Wikipedia, is RFC-7617, which similarly does not specify any special flavor of base64 and points to section 4 of RFC-4648, which also describes regular base64.
I traced the origin of this bug, and it has been there at least since the first Git commit of Scrapy.
>>> from w3lib.http import basic_auth_header
Actual:
>>> basic_auth_header('aa~aa¿', '')
b'Basic YWF-YWG_Og=='
Expected:
>>> basic_auth_header('aa~aa¿', '')
b'Basic YWF+YWG/Og=='
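For illustration, a minimal sketch of what a corrected helper might look like. `basic_auth_header_fixed` is a hypothetical name, not part of w3lib, and the ISO-8859-1 default is an assumption chosen because it reproduces the expected output above.

```python
from base64 import b64encode


def basic_auth_header_fixed(username: str, password: str,
                            encoding: str = "ISO-8859-1") -> bytes:
    """Hypothetical replacement that uses the regular base64 alphabet.

    The default encoding is an assumption made for this sketch.
    """
    credentials = f"{username}:{password}".encode(encoding)
    # b64encode keeps '+' and '/', unlike urlsafe_b64encode.
    return b"Basic " + b64encode(credentials)


print(basic_auth_header_fixed("aa~aa\xbf", ""))  # b'Basic YWF+YWG/Og=='
```

For credentials that never produce `+` or `/`, the output is the same as with `urlsafe_b64encode`, so a change like this would only alter the headers that are currently affected.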
I believe this bug only affects ASCII credentials that include the `>`, `?` or `~` characters in certain positions.
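The claim about ASCII can be checked by brute force. The sketch below is my own verification, not code from w3lib. It encodes every 3-byte group of printable ASCII and records where `+` or `/` shows up. It only covers full 3-byte groups; a trailing partial group cannot reach the values 62 or 63 with ASCII input.

```python
from base64 import b64encode
from itertools import product

PRINTABLE = range(0x20, 0x7F)  # printable ASCII, space through '~'

hit_indices = set()  # output positions (0-3) where '+' or '/' appears
third_chars = set()  # third input character of every group that is hit

for triple in product(PRINTABLE, repeat=3):
    encoded = b64encode(bytes(triple))
    if b"+" not in encoded and b"/" not in encoded:
        continue
    for index, symbol in enumerate(encoded):
        if symbol in b"+/":
            hit_indices.add(index)
    third_chars.add(chr(triple[2]))

# Only the last symbol of a group is ever affected, and that symbol
# encodes the low six bits of the third input byte alone.
print(hit_indices)          # {3}
print(sorted(third_chars))  # ['>', '?', '~']
```

So "certain positions" means every third byte of the encoded `username:password` string, counting from the start.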
For richer encodings like ISO-8859-1, which the output above shows `basic_auth_header` using (`¿` is encoded as the single byte 0xBF), many more characters can be affected (e.g. `¿` in the example above). That encoding makes sense as a default, though it should rightly be configurable.