Prevent corruption of UTF-8 multibyte codepoints at fragment boundary #13

eroosenmaallen

Problem:

  • When a string payload is fragmented, a fragment boundary can fall inside a multibyte code point. Because the payload was converted to a string one fragment at a time, the code points at the start or end of a fragment could be decoded incorrectly, corrupting the data.

Solution:

  • Concat buffers as before, but do not convert to a string and populate responseText until all fragments have been received.
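
A minimal Node.js sketch of the problem and the fix described above. The helper names (`decodePerFragment`, `decodeAfterReassembly`) and the sample payload are hypothetical and chosen only for illustration; this is not the library's actual code.

```javascript
'use strict';

// Hypothetical helpers for illustration only; not the library's functions.

// Problematic approach: decode each fragment on its own and append the text.
// A code point split across two fragments is decoded as invalid bytes.
function decodePerFragment(fragments) {
  let text = '';
  for (const fragment of fragments) {
    text += fragment.toString('utf8');
  }
  return text;
}

// Approach from the Solution bullet: keep concatenating the raw bytes and
// convert to a string only once, after the last fragment has arrived.
function decodeAfterReassembly(fragments) {
  let response = Buffer.alloc(0);
  for (const fragment of fragments) {
    response = Buffer.concat([response, fragment]);
  }
  return response.toString('utf8');
}

// '€' is three bytes in UTF-8 (e2 82 ac), so the payload is 61 e2 82 ac 62.
const original = 'a€b';
const bytes = Buffer.from(original, 'utf8');

// Split after the second byte, in the middle of the '€' code point.
const fragments = [bytes.subarray(0, 2), bytes.subarray(2)];

console.log(decodePerFragment(fragments) === original);     // false
console.log(decodeAfterReassembly(fragments) === original); // true
```

Decoding per fragment yields U+FFFD replacement characters where the split code point used to be, while decoding the reassembled buffer returns the original string.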

Co-authored-by: Paul Ringseth paul@distributive.network

….responseText only once, after self.response has been fully reassembled.

Co-authored-by: Paul Ringseth <paul@distributive.network>
Owner

@mjwwit mjwwit left a comment

This is absolutely brilliant! I'm truly sorry I only just got around to reviewing this; I'm not sure how I missed it. Thanks for the very complete unit test as well!

@mjwwit mjwwit merged commit cf57429 into mjwwit:master Jul 16, 2023
Owner

mjwwit commented Jul 16, 2023

I've just released version 2.1.1 on npm which includes your fix.
