Fix utils.encoding.auto_decode() LookupError with invalid encodings #6311
A couple of minor typo fixes, but otherwise this LGTM. I've only reviewed over the web, and the CI checks hadn't completed when I did so, so obviously those need to succeed too...
amended
Thanks for the patch! A couple comments.
utils.encoding.auto_decode() was broken when decoding Big Endian BOM byte-strings on Little Endian systems, or vice versa. The TestEncoding.test_auto_decode_utf_16_le test was failing on Big Endian systems, such as Fedora's s390x builders. A similar test with a BE BOM, test_auto_decode_utf_16_be, was added in order to reproduce this on a Little Endian system (which is much easier to come by). A regression test was added to check that all encodings listed in utils.encoding.BOMS are valid.

Fixes pypa#6054
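For readers following along, here is a minimal sketch of the idea, not the actual pip code: a table that maps each BOM to an encoding name, a decoder that strips the matching BOM, and a check that every listed name is known to Python's codec registry. The table contents, `decode_with_bom`, and `check_encodings_valid` are hypothetical names chosen for illustration. Only explicit-endianness entries are used, so the result should not depend on the byte order of the host.

```python
import codecs

# Hypothetical BOM table for illustration (not pip's utils.encoding.BOMS).
# Order matters: the UTF-32 LE BOM starts with the UTF-16 LE BOM bytes,
# so the UTF-32 entries are checked first.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def decode_with_bom(data, fallback="utf-8"):
    """Decode bytes using the encoding implied by a leading BOM, if any."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            # Strip the BOM and decode the rest with the matching codec.
            return data[len(bom):].decode(encoding)
    # No BOM found: fall back to an assumed default encoding.
    return data.decode(fallback)


def check_encodings_valid():
    """Regression-style check: every listed encoding name must resolve."""
    for _, encoding in BOMS:
        # codecs.lookup raises LookupError for an unknown encoding name.
        codecs.lookup(encoding)


if __name__ == "__main__":
    check_encodings_valid()
    data = codecs.BOM_UTF16_BE + "Django==1.4.2".encode("utf-16-be")
    print(decode_with_bom(data))
```

Looking up each name with `codecs.lookup` catches a misspelled encoding even when the host's native byte order means that table entry is never reached at runtime.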
amended
Thanks, @hroncok!
utils.encoding.auto_decode() was broken when decoding Big Endian BOM byte-strings on Little Endian or vice versa. The TestEncoding.test_auto_decode_utf16_le test was failing on Big Endian systems, such as Fedora's s390x builders. A similar test, but with BE BOM test_auto_decode_utf16_be was added in order to reproduce this on a Little Endian system (which is much easier to come by). A regression test was added to check that all listed encodings in utils.encoding.BOMS are valid.

Fixes #6054