JSONC parser fails to correctly parse non-BMP escape sequences #31

Open
@KiloJuliett

Description

Per RFC 8259 § 7, a non-BMP character such as 𝄞 (U+1D11E) is escaped as a 12-character sequence encoding its UTF-16 surrogate pair: \uD834\uDD1E. I therefore expect the following Rust code to compile and run successfully:

use jsonc_parser::JsonValue;
use jsonc_parser::parse_to_value;

fn main() {
    let src = r#""\uD834\uDD1E""#;
    let v = parse_to_value(src, &Default::default()).unwrap().unwrap();
    if let JsonValue::String(s) = v {
        assert_eq!("\u{1D11E}", s);
    } else {
        panic!();
    }
}

However, on the latest version of jsonc-parser (as of writing, this is version 0.21.0), this code panics at the unwrap on line 6 with the message "Invalid unicode escape sequence. 'D834' is not a valid UTF8 character".
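
For reference, here is a minimal sketch of the surrogate-pair arithmetic the parser would need when it sees a \uXXXX escape in the high-surrogate range followed by a second escape in the low-surrogate range. The function name `combine_surrogates` is hypothetical and is not part of jsonc-parser; this only illustrates the expected decoding, not how the crate is implemented.

```rust
/// Hypothetical helper (not part of jsonc-parser): combine a UTF-16
/// surrogate pair into a single Unicode scalar value.
/// Returns None if either half is outside its surrogate range.
fn combine_surrogates(high: u16, low: u16) -> Option<char> {
    // High surrogates are D800..=DBFF, low surrogates are DC00..=DFFF.
    if !(0xD800..=0xDBFF).contains(&high) || !(0xDC00..=0xDFFF).contains(&low) {
        return None;
    }
    // Each half carries 10 bits of the code point, offset by 0x10000.
    let code_point =
        0x10000 + (((high as u32) - 0xD800) << 10) + ((low as u32) - 0xDC00);
    char::from_u32(code_point)
}
```

With this, `combine_surrogates(0xD834, 0xDD1E)` yields U+1D11E, whereas `char::from_u32(0xD834)` alone returns `None`, because a lone surrogate is not a valid Unicode scalar value. That is consistent with the error message above, which suggests each escape is being converted on its own.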
