Commit 3b15e99

str2wcs: encode invalid Unicode characters in the private use area
Rust does not like invalid code points, so let's ease the transition by treating them like byte sequences that do not map to any code point. See fish-shell#9688 (comment)
1 parent 746019e commit 3b15e99

1 file changed: +2 -0 lines changed

src/common.cpp

Lines changed: 2 additions & 0 deletions
@@ -338,6 +338,8 @@ static wcstring str2wcs_internal(const char *in, const size_t in_len) {
         // Determine whether to encode this character with our crazy scheme.
         if (wc >= ENCODE_DIRECT_BASE && wc < ENCODE_DIRECT_BASE + 256) {
             use_encode_direct = true;
+        } else if ((wc >= 0xD800 && wc <= 0xDFFF) || static_cast<uint32_t>(wc) >= 0x110000) {
+            use_encode_direct = true;
         } else if (wc == INTERNAL_SEPARATOR) {
             use_encode_direct = true;
         } else if (ret == static_cast<size_t>(-2)) {
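
For readers skimming the hunk, the new condition can be read as "this is not a Unicode scalar value": either a UTF-16 surrogate (U+D800 through U+DFFF) or anything past U+10FFFF. Casting to uint32_t means that, where wchar_t is a signed 32-bit type, negative values compare as very large unsigned ones and are caught by the same check. The sketch below isolates that test as a standalone function. The names is_invalid_code_point, encode_byte_direct and kEncodeDirectBase are hypothetical, and the base value 0xF600 is an assumption made for illustration; the real ENCODE_DIRECT_BASE is defined elsewhere in the fish sources, not in this diff.

```cpp
#include <cstdint>

// Assumed value, for illustration only; the real ENCODE_DIRECT_BASE is
// defined in the fish sources and is not shown in this diff.
constexpr uint32_t kEncodeDirectBase = 0xF600;

// Hypothetical helper mirroring the added condition: true if cp is not a
// Unicode scalar value (a surrogate, or beyond U+10FFFF).
constexpr bool is_invalid_code_point(uint32_t cp) {
    return (cp >= 0xD800 && cp <= 0xDFFF) || cp >= 0x110000;
}

// Hypothetical helper: map one raw input byte to a private-use stand-in,
// which is how the commit message describes the "encode direct" scheme.
constexpr uint32_t encode_byte_direct(unsigned char byte) {
    return kEncodeDirectBase + byte;
}

// Boundary checks, evaluated at compile time.
static_assert(!is_invalid_code_point(0xD7FF), "last code point before the surrogates");
static_assert(is_invalid_code_point(0xD800), "first surrogate");
static_assert(is_invalid_code_point(0xDFFF), "last surrogate");
static_assert(!is_invalid_code_point(0xE000), "first code point after the surrogates");
static_assert(!is_invalid_code_point(0x10FFFF), "highest valid code point");
static_assert(is_invalid_code_point(0x110000), "first value past the Unicode range");
```

Under these assumptions, the byte sequence ED A0 80 (a UTF-8-style encoding of the surrogate U+D800) would no longer produce an invalid wchar_t; its leading byte 0xED would instead become the private-use character 0xF6ED, and the remaining bytes would presumably be handled the same way on later iterations.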
