basic_auth_header uses the wrong flavor of base64 #181

Closed
@Gallaecio

I have reason to believe that basic_auth_header is wrong in using urlsafe_b64encode (which replaces +/ with -_) instead of b64encode.

According to Wikipedia, HTTP basic auth was first specified in HTTP 1.0, which does not mention any special flavor of base64 and refers to RFC 1521 for the definition of base64; RFC 1521 describes regular base64. The latest HTTP basic auth specification, again according to Wikipedia, is RFC 7617, which likewise does not specify any special flavor of base64 and points to section 4 of RFC 4648, which also describes regular base64.

I traced the origin of this bug, and it has been there at least since the first Git commit of Scrapy.

>>> from w3lib.http import basic_auth_header

Actual:

>>> basic_auth_header('aa~aa¿', '')
b'Basic YWF-YWG_Og=='

Expected:

>>> basic_auth_header('aa~aa¿', '')
b'Basic YWF+YWG/Og=='
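
To make the difference concrete, here is a minimal sketch that reproduces both outputs with only the standard library. The helper names `standard_basic_auth_header` and `urlsafe_basic_auth_header` are hypothetical and are not part of w3lib; the ISO-8859-1 default is an assumption, chosen because it is the encoding that matches the bytes in the outputs above.

```python
from base64 import b64encode, urlsafe_b64encode


def _credentials(username, password, encoding):
    # Basic auth joins user and password with a colon before encoding.
    return f"{username}:{password}".encode(encoding)


def standard_basic_auth_header(username, password, encoding="ISO-8859-1"):
    # Hypothetical helper: regular base64 alphabet (RFC 4648, section 4).
    return b"Basic " + b64encode(_credentials(username, password, encoding))


def urlsafe_basic_auth_header(username, password, encoding="ISO-8859-1"):
    # Hypothetical helper: URL-safe alphabet, where + and / become - and _.
    return b"Basic " + urlsafe_b64encode(_credentials(username, password, encoding))


if __name__ == "__main__":
    print(urlsafe_basic_auth_header("aa~aa\xbf", ""))   # matches "Actual"
    print(standard_basic_auth_header("aa~aa\xbf", ""))  # matches "Expected"
```

The two helpers differ only in the base64 function they call, so any credentials whose encoding contains neither + nor / produce identical headers.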

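I believe that, among printable ASCII credentials, this bug only affects those where >, ? or ~ is the 3rd, 6th, 9th, ... character of the user:password string. Base64 encodes each group of 3 bytes as 4 characters, and for ASCII input only the last of those 4 characters, which holds the low 6 bits of the third byte, can be + or /.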
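
The ASCII claim can be checked by brute force. The sketch below (the function name `find_affected` is made up for illustration) encodes every 3-byte group of printable ASCII characters and records where + or / appears in the 4-character output, along with the third input character of each affected group.

```python
from base64 import b64encode
from itertools import product

PRINTABLE_ASCII = range(0x20, 0x7F)  # space through ~, excluding DEL


def find_affected():
    output_positions = set()   # index (0-3) of + or / in the encoded group
    third_characters = set()   # third input character of each affected group
    for triple in product(PRINTABLE_ASCII, repeat=3):
        encoded = b64encode(bytes(triple))
        for index, symbol in enumerate(encoded):
            if symbol in b"+/":
                output_positions.add(index)
                third_characters.add(chr(triple[2]))
    return output_positions, third_characters


if __name__ == "__main__":
    print(find_affected())
```

Only output index 3 is ever affected, and that character depends solely on the third input byte, so the first two characters of a group do not matter.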

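For richer encodings, many more characters can be affected, e.g. ¿ in the example above. The output shown there corresponds to ¿ encoded as the single ISO-8859-1 byte 0xBF; under UTF-8, ¿ becomes the two bytes 0xC2 0xBF and the output would differ. Whatever encoding basic_auth_header uses by default, it should be configurable.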
