WritableStream: example, fixed block character in the output #14371


Merged
9 commits merged on Mar 31, 2022

Conversation

OnkarRuikar
Contributor

@OnkarRuikar OnkarRuikar commented Mar 27, 2022

Summary

In the example at https://developer.mozilla.org/en-US/docs/Web/API/WritableStream#examples,
there are unwanted block characters in the output:
[screenshot: block characters in the output]

In JavaScript, strings use UTF-16 encoding.
Using utf-16 encoding for the decoder fixes the issue:

const decoder = new TextDecoder("utf-16");

After:
[screenshot: fixed output]

Tested in Chrome on Windows 10.

Note: I've created a PR for the same in mdn/dom-examples repo: mdn/dom-examples#97
I have no idea who the reviewer is there.

Metadata

  • Fixes a typo, bug, or other error

@OnkarRuikar OnkarRuikar requested a review from a team as a code owner March 27, 2022 13:10
@OnkarRuikar OnkarRuikar requested review from sideshowbarker and removed request for a team March 27, 2022 13:10
@github-actions github-actions bot added the Content:WebAPI Web API docs label Mar 27, 2022

@sideshowbarker
Member

In the example, there are unwanted block characters in the output
Tested in Chrome on Windows 10.

I can’t reproduce that in Chrome in my macOS environment — though I can reproduce it in Safari.

But regardless, it seems to me the cause may actually be browser bugs, not a problem with the code.

In JavaScript, strings use UTF-16 encoding.

While that’s true about how JavaScript encodes strings internally, that’s not true at the API layer for TextDecoder.

With utf-16 encoding for the decoder, the issue is fixed:

const decoder = new TextDecoder("utf-16");

The code here is doing const encoder = new TextEncoder(). Per https://developer.mozilla.org/en-US/docs/Web/API/TextEncoder and the Encoding spec, that means the encoder encodes the stream in UTF-8. (There is in fact no other way to use TextEncoder() to encode in anything other than UTF-8.)

So the const decoder = new TextDecoder("utf-8") part of the existing code causes the stream to be decoded using the UTF-8 decoder — which, since it was encoded in UTF-8, seems like it’s right. And so, doing const decoder = new TextDecoder("utf-16") would be wrong — because the stream was not encoded in UTF-16; it was encoded in UTF-8. (And in fact, just new TextDecoder() should work — because the UTF-8 decoder is the default.)
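The point above can be sketched as a minimal round trip (an illustrative snippet, not code from the PR; the variable names are my own):

```javascript
// TextEncoder always produces UTF-8 — there is no constructor argument
// to choose another encoding — so the matching decoder is the default
// (UTF-8) TextDecoder.
const encoder = new TextEncoder();
const bytes = encoder.encode("Hello, world!"); // Uint8Array of UTF-8 bytes

const decoder = new TextDecoder(); // "utf-8" is the default label
const roundTripped = decoder.decode(bytes);

console.log(encoder.encoding); // "utf-8"
console.log(roundTripped);     // "Hello, world!"
```

Decoding those same bytes with a "utf-16" decoder would instead pair up the UTF-8 bytes into 16-bit code units, producing garbage.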

@OnkarRuikar
Contributor Author

OnkarRuikar commented Mar 28, 2022

@sideshowbarker
To be on the same page, let's look at a common example: https://jsfiddle.net/vx5ea9z1/2/
Here I've used the same JavaScript code as provided in the examples.
In addition, at the end of the list I print the received message's length, and I also print extended ASCII characters, which include non-printable characters.

I can’t reproduce that in Chrome in my macOS environment — though I can reproduce it in Safari.

Some browsers hide non-printable characters. The above JSFiddle, in Chrome and Edge on Windows, shows all extended ASCII characters; non-printable characters appear as empty boxes. The same JSFiddle in Safari and Chrome on iPad doesn't show the non-printable characters.

But regardless, it seems to me the cause may actually be browser bugs, not a problem with the code.

We can easily verify whether it's a browser bug. In the above JSFiddle, the received message length in the output is 26 on both Windows and iPadOS.
Chrome and Edge on Windows:
[screenshot: output in Chrome/Edge on Windows]

Safari and Chrome on iPadOs:
[screenshot: output in Safari/Chrome on iPadOS]

Can you check the same in Chrome on macOS?
If the length is 26 in all browsers, then it's a code bug: the original message has only 13 characters, but the received message has 26. How did the message size double? There must be something wrong in the decoding logic.

So the const decoder = new TextDecoder("utf-8") part of the existing code causes the stream to be decoded using the UTF-8 decoder — which, since it was encoded in UTF-8, seems like it’s right.

You are right: a utf-8 encoded string should be decoded using a utf-8 decoder. After debugging further, it looks like the following lines of the decoding logic are doubling the string's size:

var buffer = new ArrayBuffer(2);
var view = new Uint16Array(buffer);
view[0] = chunk;
var decoded = decoder.decode(view, { stream: true });

By using Uint16Array, view[1] always remains 0, which doubles the output size.
We can use Uint8Array to decode the utf-8 encoded string:

var buffer = new ArrayBuffer(1);
var view = new Uint8Array(buffer);
view[0] = chunk;
var decoded = decoder.decode(view, { stream: true });

This solved the issue on Windows and iPadOS. Let me know if it works OK on your end: https://jsfiddle.net/onqp6xvt/
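The difference between the two views can be sketched outside the stream machinery (an illustrative snippet, not code from the PR; the loop stands in for the WritableStream's per-chunk write() calls):

```javascript
// Feed UTF-8 bytes to a streaming TextDecoder one chunk at a time,
// comparing a Uint16Array view with a Uint8Array view.
const message = "Hello, world!";
const bytes = new TextEncoder().encode(message); // 13 UTF-8 bytes

// Buggy variant: each byte is widened to 16 bits, so the UTF-8
// decoder sees a stray 0x00 byte alongside every real byte.
const buggyDecoder = new TextDecoder("utf-8");
let buggy = "";
for (const chunk of bytes) {
  const view = new Uint16Array(1);
  view[0] = chunk;
  buggy += buggyDecoder.decode(view, { stream: true });
}

// Fixed variant: a Uint8Array view passes exactly one byte per chunk.
const fixedDecoder = new TextDecoder("utf-8");
let fixed = "";
for (const chunk of bytes) {
  const view = new Uint8Array(1);
  view[0] = chunk;
  fixed += fixedDecoder.decode(view, { stream: true });
}

console.log(buggy.length); // 26 — a NUL is interleaved with each character
console.log(fixed.length); // 13
```

The NUL characters are what render as the empty boxes: fonts that draw a glyph for U+0000 show a box, while other browsers simply hide it, which is why the symptom appeared only on some platforms.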

@sideshowbarker sideshowbarker merged commit 88ecdef into mdn:main Mar 31, 2022
@OnkarRuikar OnkarRuikar deleted the patch-2 branch March 31, 2022 01:45