UnicodeSet parser does not support all code points #3893
Labels: C-transliterator, good first issue, help wanted, T-bug
The unescaping code in `icu_unicodeset_parser` only works for scalar values (Rust `char`s), when all code points should be supported (any `u32` below or equal to `char::MAX`). Should be relatively straightforward to fix by replacing `char`s with `u32`s and using a `val <= char::MAX as u32` check instead of `char::try_from` in `parse_escaped_char`.

This currently fails, but should pass:
icu_unicodeset_parser::parse(r"[^\uD800-\uE0FF]")
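
To illustrate the difference between the two checks, here is a minimal standalone sketch. The helper names `scalar_only` and `any_code_point` are hypothetical and are not taken from `icu_unicodeset_parser`; the real `parse_escaped_char` may be structured differently. It only shows that `char::try_from` rejects surrogates such as U+D800, while a plain range check against `char::MAX as u32` accepts them.

```rust
// Only needed on editions before 2021, where TryFrom is not in the prelude.
use std::convert::TryFrom;

/// Hypothetical helper mirroring the behavior described above:
/// only Unicode scalar values are accepted, so surrogates are rejected.
fn scalar_only(val: u32) -> Option<char> {
    char::try_from(val).ok()
}

/// Hypothetical helper for the proposed behavior: any code point
/// in 0..=0x10FFFF is accepted, surrogates included.
fn any_code_point(val: u32) -> Option<u32> {
    if val <= char::MAX as u32 {
        Some(val)
    } else {
        None
    }
}

fn main() {
    // U+D800 is a surrogate: not a scalar value, but still a code point.
    assert!(scalar_only(0xD800).is_none());
    assert_eq!(any_code_point(0xD800), Some(0xD800));

    // U+E0FF is an ordinary scalar value, so both checks accept it.
    assert_eq!(scalar_only(0xE0FF), Some('\u{E0FF}'));
    assert_eq!(any_code_point(0xE0FF), Some(0xE0FF));

    // Values above char::MAX (0x10FFFF) are rejected by both.
    assert!(scalar_only(0x110000).is_none());
    assert!(any_code_point(0x110000).is_none());
}
```

With a check along these lines, both endpoints of the range `\uD800-\uE0FF` would be representable, at the cost of carrying `u32` values instead of `char` through the parser.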