Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request body with Protobuf binary data is corrupted #1442

Closed
4 tasks done
DGollings opened this issue Oct 22, 2022 · 15 comments · Fixed by #1436
Closed
4 tasks done

Request body with Protobuf binary data is corrupted #1442

DGollings opened this issue Oct 22, 2022 · 15 comments · Fixed by #1436
Labels
bug Something isn't working needs:triage Issues that have not been investigated yet. scope:browser Related to MSW running in a browser

Comments

@DGollings
Copy link
Contributor

DGollings commented Oct 22, 2022

Prerequisites

Environment check

  • I'm using the latest msw version
  • I'm using Node.js version 14 or higher

Browsers

Chromium (Chrome, Brave, etc.), Firefox

Reproduction repository

not easily possible

Reproduction steps

Good news, I saw some issues with people having problems(?) mocking binary protobuf messages. I can confirm it works though for Twirp (which is like gRPC but simplified). Simply encode and decode with a properly set header

Well, mostly:
My input

const input = new Uint8Array([138, 1, 6, 10, 4, 10, 2, 32, 1]);

I can encode/decode without issues before sending to msw, but in the mock itself it turns into either:

const usingArrayBuffer = new Uint8Array(await req.arrayBuffer());
value usingArrayBuffer = Uint8Array([239, 191, 189, 1, 6, 10, 4, 10, 2, 32, 1]);

or with a little (unsuccessful) hacking:

const usingText = new Uint8Array((await req.text()).split("").map(x => x.charCodeAt( 0 )));
value usingText = [ 253, 1, 6, 10, 4, 10, 2, 32, 1 ]

usingText[0] = 138 'fixes it', but unfortunately doesn't work in all cases

Notice the values [239, 191, 189, ...], that's a unicode replacement character. As in, the arrayBuffer has been at some point converted from raw data to a string and 'back' (?)

A likely important note: I use msw-storybook-addon

EDIT: whilst writing this I think I realize the problem. That addon has version

  "peerDependencies": {
    "msw": ">=0.35.0 <1.0.0"
  },

set. Which doesn't contain the fixes I think I need.

I'll likely have to open an issue there. But leaving this here for completeness/cross reference

Current behavior

arrayBuffer is stringified

Expected behavior

arrayBuffer is completely untouched

@DGollings DGollings added bug Something isn't working needs:triage Issues that have not been investigated yet. scope:browser Related to MSW running in a browser labels Oct 22, 2022
@DGollings DGollings changed the title Protobuf binary being corruped in request Request body with Protobuf binary data is corrupted Oct 22, 2022
@DGollings
Copy link
Contributor Author

Related issue:#929

@DGollings
Copy link
Contributor Author

Opened issue in msw-storybook-addon, but whilst writing that I realized this:

Then I realized that I manually added the latest msw and used that to generate the mockServiceWorker.js in public (Vue) and I can and have been using the new methods described here: https://github.com/mswjs/msw/releases/tag/v0.44.0

So I'm quite unsure at this point whether I'm using ^0.3.x or ^0.40.x. Would you have any idea as to how to verify?

@DGollings
Copy link
Contributor Author

DGollings commented Oct 22, 2022

As mentioned above, added a test for expected behaviour. Considering its the exact same error this means that I likely have been using 0.47 in storybook, even though the addon says 0.35+

@95th
Copy link
Collaborator

95th commented Oct 23, 2022

It's not msw's fault. It's the encoding and decoding. It also has nothing to do with protobuf.

The following example fails:

test('fails', async () => {
  const input = new Uint8Array([138, 1, 6, 10, 4, 10, 2, 32, 1])
  const text = new util.TextDecoder().decode(input)
  const bytes = new Uint8Array(new util.TextEncoder().encode(text).buffer)
  expect(bytes).toEqual(input)
})

@DGollings
Copy link
Contributor Author

DGollings commented Oct 23, 2022

It's quite possible that I made a mistake in this attempt at reproducing the issue. I'm also quite aware this random array of values isn't protobuf in itself. But it does come from protobuf and figured it might help make the issue more 'real' :)

Anyway, if I input 9 characters as a uint8array (iirc) then text().length is 9, arrayBuffer().length is 11

So something in that array buffer is doing something encoding wise and I can't figure out how to properly convert text to a buffer. Is there an even more 'raw' way to access the body? I simply want the exact same data to come out somehow

@DGollings
Copy link
Contributor Author

for completeness sake, here's the actual process

const login = async (req,  res,  ctx) => {
  const user = User.decode(
    new Uint8Array(await req.arrayBuffer()) // arrayBuffer -> UInt8Array -> object
  );
  // panic, if the req.arrayBuffer has unexpected characters, see my included example. Not all messages, just certain messages
  return res(
    ctx.status(200),
    ctx.set("Content-Type", "application/protobuf"),
    ctx.body(Token.encode(
    {
      token:
        "eyJ(..)KrmD",
    },
  ).finish();) // object -> UInt8Array
);

It;s quite possible this isn't the fault of MSW, although I would expected the .text() to change the body, not .arrayBuffer(). But I would argue that MSW needs some way of returning a completely untouched and unchanged request body

@kettanaito
Copy link
Member

Hey, @DGollings. Thanks for reporting this.

We've done some fundamental improvements to how MSW handles ArrayBuffer in #1436, so let's try your failing test once that change is merged. Thank you for contributing the test!

@DGollings
Copy link
Contributor Author

Np, how stable is that branch in #1436 ? Are my assumptions in #1442 (comment) correct? If so, I'll happily generate a new mockServiceWorker.js and try to help test it

@kettanaito
Copy link
Member

The branch is stable but you won't be able to build it just yet. It has a few dependencies on other pull requests/packages to be merged/released before it can be built correctly. Also, it's not as much about the worker script but rather about how we handle requests/responses internally (well, the changes affect the worker a little). I don't think that swapping a worker script would be enough to test whether it fixes this issue. Also, it's a huge breaking change with its own migration guidelines (more on that once we publish a release candidate).

@jereklas
Copy link

jereklas commented Oct 26, 2022

I am also hitting this exact same issue. Having dug into it a bit, the issues is the following (and maybe #1436 fixes it? I wasn't sure where begin looking since that diff contains 265 changed files).

Below is a snippet of code from the generated mockServiceWorker.js from version 0.47.4. The body is converted into a string via the await request.text() call. This call to request.text() is converting all bytes > 127 into the special character .

  const clientMessage = await sendToClient(client, {
    type: 'REQUEST',
    payload: {
      id: requestId,
      url: request.url,
      method: request.method,
      headers: Object.fromEntries(request.headers.entries()),
      cache: request.cache,
      mode: request.mode,
      credentials: request.credentials,
      destination: request.destination,
      integrity: request.integrity,
      redirect: request.redirect,
      referrer: request.referrer,
      referrerPolicy: request.referrerPolicy,
      body: await request.text(), // this is the problematic line
      bodyUsed: request.bodyUsed,
      keepalive: request.keepalive,
    },
  })

Inside of the msw handler, when you call req.arrayBuffer() the encoding back to binary is occurring based on the string from request.text(), which has already incorrectly made all bytes > 127 be the character, which when encoded back to bytes becomes [239, 191, 189].

Original Byte Array Array after req.arrayBuffer() in msw handler function
[127] [127]
[128] [239, 191, 189]
[191] [239, 191, 189]
[138, 1, 6, 10, 4, 10, 2, 32, 1] [239, 191, 189, 1, 6, 10, 4, 10, 2, 32, 1]

I don't have a great understanding of the limitations on what structure the data has to be in for a service worker, but assuming it has to be text, could you do something like base64 encode the byte array and then undo it in the correspond arrayBuffer() call?

@DGollings

So something in that array buffer is doing something encoding wise and I can't figure out how to properly convert text to a buffer. Is there an even more 'raw' way to access the body? I simply want the exact same data to come out somehow

As of now, the request body (when you see it in the handler function) will be corrupt due to the await request.text() call that is occurring. If I get some time I'll try looking at the changes coming in #1436 to see if this was resolved or not.

@DGollings
Copy link
Contributor Author

but assuming it has to be text, could you do something like base64 encode the byte array and then undo it in the correspond arrayBuffer() call?

Didn't immediately think of this, even though its a pretty standard thing to do whenever handling binary data.
Seems like simple fix/patch that could be completely transparent for the user. (encode on receive, decode just before returning).

Is there a technical limitation/reason why this wouldn't be possible?

@timjb
Copy link

timjb commented Nov 26, 2022

Small experience report: I was also having issues with corrupted request data in a PDF file upload mock endpoint (using version 0.46.1). The corruption looked just like the one described in this issue. After upgrading to msw@next there is no corruption anymore :-)

@kettanaito
Copy link
Member

@timjb, happy to hear that!

I will close this issue since there's no timeframe to address it in 1.x. If you're experiencing this, or similar issue, please update to npm i msw@next and follow the #1464 migration guidelines. Thanks.

@kettanaito
Copy link
Member

Released: v2.0.0 🎉

This has been released in v2.0.0!

Make sure to always update to the latest version (npm i msw@latest) to get the newest features and bug fixes.


Predictable release automation by @ossjs/release.

@kettanaito
Copy link
Member

Hey, folks. I tried reproducing this on MSW 2.x and the issue seems to be resolved by relying on the standard Fetch API. I also confirm that the submitted test is now passing. I'm about to merge it. Upgrade to the newest MSW version to have this fixed. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs:triage Issues that have not been investigated yet. scope:browser Related to MSW running in a browser
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants