Incorrect handling of start and end values in codecs error handlers #126004

Closed
@picnixz

Description

Crash report

What happened?

./python -c "import codecs; codecs.xmlcharrefreplace_errors(UnicodeEncodeError('bad', '', 0, 1, 'reason'))"
python: ./Include/cpython/unicodeobject.h:339: PyUnicode_READ_CHAR: Assertion `index >= 0' failed.
Aborted (core dumped)
./python -c "import codecs; codecs.backslashreplace_errors(UnicodeDecodeError('utf-8', b'00000', 9, 2, 'reason'))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
SystemError: Negative size passed to PyUnicode_New
./python -c "import codecs; codecs.replace_errors(UnicodeTranslateError('000', 1, -7, 'reason'))"
python: Python/codecs.c:743: PyCodec_ReplaceErrors: Assertion `PyUnicode_KIND(res) == PyUnicode_2BYTE_KIND' failed.
Aborted (core dumped)

See #123378 for the root cause. Since we are still deciding how to fix the getters and setters, I suggest we first fix the crashes by adding the checks at the handler level (for now). I'm also not sure whether the handlers themselves handle the corner cases correctly.
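To illustrate what a handler-level check could look like, here is a minimal pure-Python sketch. It is not the actual C fix: the names `clamped_range`, `safe_replace_errors`, and the `"safe-replace"` handler name are hypothetical, and the clamping policy (forcing `0 <= start <= end <= len(object)`) is an assumption rather than the behavior chosen in CPython.

```python
import codecs


def clamped_range(exc):
    """Return (start, end) forced into 0 <= start <= end <= len(exc.object).

    Hypothetical helper: the clamping policy is an assumption for illustration.
    """
    size = len(exc.object)
    start = min(max(exc.start, 0), size)
    end = min(max(exc.end, start), size)
    return start, end


def safe_replace_errors(exc):
    """Hypothetical encode-only handler that validates the range before using it."""
    if not isinstance(exc, UnicodeEncodeError):
        raise TypeError(f"don't know how to handle {type(exc).__name__}")
    start, end = clamped_range(exc)
    # One '?' per offending character; resume encoding at the clamped end.
    return "?" * (end - start), end


# Register under a hypothetical name so the built-in handlers are untouched.
codecs.register_error("safe-replace", safe_replace_errors)

# The inconsistent exceptions from the report no longer yield invalid ranges.
print(clamped_range(UnicodeEncodeError("bad", "", 0, 1, "reason")))
print(clamped_range(UnicodeTranslateError("000", 1, -7, "reason")))
print("abc\xff".encode("ascii", errors="safe-replace"))
```

The helper only reads the exception's `start`, `end`, and `object` attributes, so it never hands a negative length or out-of-bounds index to the replacement logic.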

Labels

3.14 (bugs and security fixes)
interpreter-core (Objects, Python, Grammar, and Parser dirs)
topic-C-API
type-crash (A hard crash of the interpreter, possibly with a core dump)
