Fix utils.encoding.auto_decode() LookupError with invalid encodings #6311

Merged: 1 commit, Mar 1, 2019

hroncok (Contributor) commented Mar 1, 2019

utils.encoding.auto_decode() was broken when decoding byte-strings with a Big Endian BOM on Little Endian systems, or vice versa.

The TestEncoding.test_auto_decode_utf16_le test was failing on Big Endian systems, such as Fedora's s390x builders. A similar test with a BE BOM, test_auto_decode_utf16_be, was added to reproduce the failure on a Little Endian system (which is much easier to come by).
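For readers unfamiliar with BOM-based decoding, here is a minimal sketch of the general technique. It is not pip's actual implementation: `BOM_TABLE` and `decode_with_bom` are hypothetical names, and the table contents are an assumption. The point it illustrates is that each BOM must be paired with an encoding name that Python's codec registry accepts, so that decoding gives the same result regardless of the host's byte order.

```python
import codecs

# Hypothetical BOM table (an assumption, not pip's real list).
# UTF-32 entries come first because the UTF-32 LE BOM starts with
# the same two bytes as the UTF-16 LE BOM.
BOM_TABLE = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def decode_with_bom(data: bytes, fallback: str = "utf-8") -> str:
    """Decode bytes using the encoding implied by a leading BOM, if any."""
    for bom, encoding in BOM_TABLE:
        if data.startswith(bom):
            # Strip the BOM, then decode with an explicit byte order.
            return data[len(bom):].decode(encoding)
    # No BOM found: fall back to a default encoding.
    return data.decode(fallback)


if __name__ == "__main__":
    be = codecs.BOM_UTF16_BE + "Django==1.4.2".encode("utf-16-be")
    le = codecs.BOM_UTF16_LE + "Django==1.4.2".encode("utf-16-le")
    print(decode_with_bom(be))
    print(decode_with_bom(le))
```

Because each entry names an explicit byte order, the BE and LE inputs decode identically on any host, which is the property the two tests mentioned above are meant to exercise.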

A regression test was added to check that all listed encodings in utils.encoding.BOMS are valid.
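A validity check of this kind could look roughly like the sketch below. It assumes only that encoding names can be validated with the standard library's `codecs.lookup()`, which raises `LookupError` for unknown names. The helper `invalid_encodings` and the sample list are hypothetical and do not reproduce the test added in this PR.

```python
import codecs


def invalid_encodings(names):
    """Return the names that Python's codec registry does not recognize."""
    bad = []
    for name in names:
        try:
            codecs.lookup(name)
        except LookupError:
            bad.append(name)
    return bad


if __name__ == "__main__":
    # Hypothetical sample; a real test would iterate over the project's table.
    sample = ["utf-8", "utf-16-be", "utf-16-le", "definitely-not-a-codec"]
    print(invalid_encodings(sample))
```

Running such a check over every entry catches a misspelled encoding name on any platform, even when the entry is only reached for input with the opposite byte order.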

Fixes #6054

pfmoore (Member) left a comment

A couple of minor typo fixes, but otherwise this LGTM. I've only reviewed over the web, and the CI checks hadn't completed when I did so, so obviously those need to succeed too...

news/6054.bugfix (outdated, resolved)
tests/unit/test_utils.py (outdated, resolved)

hroncok (Contributor, Author) commented Mar 1, 2019

amended

cjerdonek (Member) left a comment

Thanks for the patch! A couple comments.

tests/unit/test_utils.py (outdated, resolved)
tests/unit/test_utils.py (resolved)

utils.encoding.auto_decode() was broken when decoding byte-strings with a
Big Endian BOM on Little Endian systems, or vice versa.

The TestEncoding.test_auto_decode_utf_16_le test was failing on Big Endian
systems, such as Fedora's s390x builders. A similar test with a BE BOM,
test_auto_decode_utf_16_be, was added to reproduce the failure on a Little
Endian system (which is much easier to come by).

A regression test was added to check that all listed encodings in
utils.encoding.BOMS are valid.

Fixes pypa#6054

hroncok (Contributor, Author) commented Mar 1, 2019

amended

cjerdonek merged commit 4589ed4 into pypa:master on Mar 1, 2019

cjerdonek (Member) commented

Thanks, @hroncok!

hroncok deleted the i6054 branch on March 6, 2019 09:59

Linked issue: LookupError: unknown encoding: utf16-le