Bug: exception in noexcept what() when Python exception contains a surrogate character #4288

Closed
@Skylion007

Discussed in #4287

Originally posted by TheShiftedBit October 26, 2022
By default, Python raises an error when encoding a str as UTF-8 if the str contains surrogate characters. The error can be avoided by passing "surrogatepass" as the second argument to .encode(). pybind11 has the same default behavior in its str -> std::string conversion. The bug is this: if an exception message contains a surrogate character, calling .what() on an error_already_set holding that exception causes another exception to be thrown. Since .what() is noexcept, that second exception cannot be caught, and the program calls std::terminate.
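
To make the Python side of this concrete, here is a minimal sketch of the default (strict) encoding failure and the `surrogatepass` alternative. It only exercises the standard `str.encode()` behavior; the variable and function names are illustrative and do not come from pybind11.

```python
# A lone surrogate code point is a legal Python str, but it has no valid UTF-8 encoding.
message = "bad value: \ud800"


def encode_strict(text: str) -> bytes:
    """Encode with the default error handler ("strict")."""
    return text.encode("utf-8")


def encode_surrogatepass(text: str) -> bytes:
    """Encode while letting surrogate code points through."""
    return text.encode("utf-8", "surrogatepass")


try:
    encode_strict(message)
    strict_failed = False
except UnicodeEncodeError:
    # This is the kind of error that surfaces during the str -> bytes conversion.
    strict_failed = True

passed = encode_surrogatepass(message)
# Decoding with the same handler restores the original str.
round_tripped = passed.decode("utf-8", "surrogatepass")
```

Note that the bytes produced with `surrogatepass` are not well-formed UTF-8, so any consumer has to decode them with the same error handler.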

I'm not sure what the correct behavior regarding surrogate characters is. Perhaps pybind11 should always use surrogatepass, perhaps not. However, even if that's not the right choice in general, it should probably be used during exception handling. Otherwise, Python exceptions like this are extremely difficult to diagnose.
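
As a rough illustration of the "be lenient on the error path" idea, the sketch below encodes an exception message with a non-raising error handler. The helper `diagnostic_bytes` is hypothetical, and `backslashreplace` is just one possible choice alongside `surrogatepass`; this is not what pybind11 does, only a Python-level picture of the trade-off.

```python
def diagnostic_bytes(exc: BaseException) -> bytes:
    """Hypothetical helper: encode an exception message without raising on surrogates."""
    # "backslashreplace" swaps each unencodable code point for a \uXXXX escape,
    # so the result is valid UTF-8 and the offending character stays visible.
    return str(exc).encode("utf-8", "backslashreplace")


try:
    raise ValueError("bad value: \ud800")
except ValueError as exc:
    # Rendering the message can no longer fail with UnicodeEncodeError here.
    rendered = diagnostic_bytes(exc)
```

Unlike `surrogatepass`, this variant yields well-formed UTF-8, at the cost of not being reversible to the original str.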
