Skip to content

Conversation

@vitorpamplona
Copy link
Contributor

@vitorpamplona vitorpamplona commented Sep 16, 2025

Basically, the combineSurrogates logic doesn't run when the two characters of a surrogate are exactly at the end of one segment and the beginning of another.

Fixes #1473

@cowtowncoder
Copy link
Member

note: I think src/main/java/com/fasterxml/jackson/core/json/WriterBasedJsonGenerator.java wouldn't be affected as it's up to underlying java.io.Writer to handle escaping.

@cowtowncoder cowtowncoder added the cla-needed PR looks good (although may also require code review), but CLA needed from submitter label Sep 16, 2025
@cowtowncoder
Copy link
Member

Looks good overall, happy to get the fix!

One process thing: we'd need CLA (unless you have sent one earlier). It's here:

https://github.com/FasterXML/jackson/blob/main/contributor-agreement.pdf

and the usual way is to print, fill & sign, scan/photo, email to cla at fasterxml dot com.

Once I get it I can do final review, approve and merge PR.

I think this is also safe enough to go in 2.20 patch (I can cherry-pick the fix).

@vitorpamplona
Copy link
Contributor Author

Just sent the CLA

@cowtowncoder cowtowncoder added cla-received PR already covered by CLA (optional label) and removed cla-needed PR looks good (although may also require code review), but CLA needed from submitter labels Sep 17, 2025
jp.close();
}

@Test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will move to test class that verifies #223 handling (where COMBINE_UNICODE_SURROGATES_IN_UTF8 implemented).

Copy link
Member

@cowtowncoder cowtowncoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a bug (not using offset) and infinite-loop edge case (len == 1) to address.

Copy link
Member

@cowtowncoder cowtowncoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cowtowncoder
Copy link
Member

Looks good, hoping to sanity check later, merge.

@cowtowncoder cowtowncoder merged commit b4acc47 into FasterXML:2.x Sep 19, 2025
7 checks passed
cowtowncoder pushed a commit that referenced this pull request Sep 19, 2025
@cowtowncoder
Copy link
Member

Merged into 2.x, backported in 2.20 (for 2.20.1).

Thank you @vitorpamplona ! This is very useful fix (even if not common bug).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla-received PR already covered by CLA (optional label)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug with Surrogates + Segments in the UTF8JsonGenerator

3 participants