Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Miniflare 3] ✨ Implement *magic* proxy and add back support for Miniflare#get*() methods #639

Merged
merged 14 commits into from
Aug 17, 2023

Conversation

mrbbot
Copy link
Contributor

@mrbbot mrbbot commented Jul 25, 2023

Hey! 👋 This PR adds back support for Miniflare 2's Miniflare#get*() methods for accessing bindings from outside the Workers sandbox. It does this by implementing a proxy system, that makes HTTP requests to workerd on property access/function calls. Essentially the aim here is to support...

import { Miniflare } from "miniflare";
const mf = new Miniflare(...);
const ns = await mf.getDurableObjectNamespace("OBJECT");
const id = ns.idFromName("thing");
console.log(id.toString());
const stub = ns.get(id);
const res = await stub.fetch("http://placeholder/");
console.log(await res.text());

...without re-implementing any of the APIs in Node.js. Note the above snippet uses a mix of synchronous and asynchronous calls, and uses some native objects as arguments to other native functions.

Very cool projects such as https://github.com/leader22/cfw-bindings-wrangler-bridge and https://github.com/james-elicx/cf-bindings-proxy already exist and do similar things. This PR aims to improve on those solutions by:

  • Supporting synchronous methods like those on DurableObjectNamespace
  • Supporting passing native return values from one method as parameters of another
  • Avoiding re-declaring the existing API (we're still doing this with types atm, but should be able to avoid that too)
  • Avoiding buffering ReadableStream arguments and return values where possible
  • Being general enough to avoid special casing APIs where possible

Note this solution relies on a persistent Durable Object that you probably wouldn't want to run in remote preview. The approaches taken by the projects above are likely more appropriate there.


High Level Implementation Overview

The proxy client is implemented in src/plugins/core/proxy/. The server is implemented as a Durable Object in src/workers/core/proxy.worker.ts.

Using a Durable Object allows us to share references containing I/O objects like streams across requests. The Durable Object contains a "heap", mapping addresses to references. The "heap" is initialised with globalThis and env.

The proxy client builds proxies for each object on the heap. Accessing a property on a proxy will result in a synchronous GET operation to the proxy server. If the property is not a method, the value will be serialised back to the client. Note this may involve putting more references on the "heap".

If the property is a method, the client will return a function from the access. Future accesses to the same property key will return the same function without the synchronous GET operation. Calling this function will serialise all the arguments, then perform a synchronous CALL operation on the target, and the return value will be serialised back to the client. If this function returns a Promise, it will be stored on the "heap", and a reference returned. An asynchronous GET operation will then be performed to resolve the Promise, and serialise the result. If a function returns a Promise once, all future calls will send asynchronous CALL operations instead, that resolve the Promise without an additional round trip.

If the function call had ReadableStream arguments, the first will be sent unbuffered after the rest of the arguments. All function calls with ReadableStream or Blob arguments are assumed to be asynchronous. This assumption is required as synchronous operations block the main thread and prevent chunks from ReadableStreams being read.

If the function call threw, or returned a Promise that rejected, the error will be serialised and re-thrown/rejected in the client. Note that the stack trace will be updated to reflect the calling location in the client, not the server.

To prevent unbounded growth of the "heap", all proxies are registered with a FinalizationRegistry. When the proxy is garbage collected, a FREE operation will remove the corresponding "heap" entry allowing it to be garbage collected on the server too.

When workerd is restarted with Miniflare#setOptions() or stopped with Miniflare#dispose(), all proxies are poisoned. Once a proxy is poisoned, it cannot be used, and must be recreated. Poisoned proxies are unregistered from the FinalizationRegistry too, as the addresses they point too will be invalid and shouldn't be freed again.

@changeset-bot
Copy link

changeset-bot bot commented Jul 25, 2023

⚠️ No Changeset found

Latest commit: 62c842e

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@mrbbot mrbbot changed the title ✨ Implement *magic* proxy and add back support for Miniflare#get*() methods [Miniflare 3] ✨ Implement *magic* proxy and add back support for Miniflare#get*() methods Jul 25, 2023
@leaysgur
Copy link
Contributor

leaysgur commented Jul 26, 2023

Hi, I'd like to thank you for the great work as always! 😹

The bullet points are exactly what I was struggling with and had pretty much given up on...

Anyway, I'd like to ask about the usage.

If I want to run it as a mock module just for local development, would the following be the minimum setup?

const mf = new Miniflare({
  modules: true,
  script: "",
  kvNamespaces: ["MY_KV"],
});
const bindings = await mf.getBindings();

// e.g. SvelteKit usage
event.platform = {
  env: { ...bindings, },
};

// later somewhere...where?
mf.dispose();

@mrbbot
Copy link
Contributor Author

mrbbot commented Jul 26, 2023

@leader22, yep! You may need to specify export default { fetch() { return new Response(null, { status: 404 }) } } as the script, as I think workerd will complain no event handlers are registered otherwise. I'm considering whether we want to make script optional and default to something that just 404s to make this use case even simpler. 🤔

@leaysgur
Copy link
Contributor

leaysgur commented Jul 26, 2023

I thought so too at first, but at least for now, it seems working. 🙈

  • script: ""
  • script: "", modules: true

make script optional

It looks good to me.

Or something like createForMockProxyOrSomeNiceName(opts): { getBindings, getXxx, ... }?
When trying to use this magic proxy, I don't think there is any reason to interact with mf instance itself.


Oh sorry, I overlooked about dispose()! Forget helpers above.

If no worries, making script optional feels more natural since it follows miniflare's concept(simple but flexible).

@mrbbot mrbbot force-pushed the bcoll/tre-magic-proxy branch 4 times, most recently from 8ac3453 to 96e7fde Compare July 26, 2023 11:44
@mrbbot mrbbot requested a review from a team July 26, 2023 12:59
@mrbbot mrbbot marked this pull request as ready for review July 26, 2023 12:59
Copy link
Contributor

@RamIdeas RamIdeas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍 Great work!

packages/miniflare/src/plugins/core/proxy/client.ts Outdated Show resolved Hide resolved
Comment on lines +172 to +180
t.regex(e.stack, /syncUserFunction/);
t.notRegex(e.stack, /ProxyStubHandler/);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the test framework have inline snapshots? Would be useful here instead of assuming implementation details like ProxyStubHandlers name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AVA has snapshot support, but I don't think it has support for inline snapshots. I admit this isn't ideal though, maybe when/if we port these tests to Vitest as part of the migration to workers-sdk, we can add them then?

const res = await mf.dispatchFetch("http://localhost");
t.true(res.ok);
await res.arrayBuffer();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially confusing if you don't assert here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments to clarify we're draining here. This is to make sure we consume response bodies. MIniflare will throw an uncaught exception if the MINIFLARE_ASSERT_BODIES_CONSUMED environment variable is set to true (default in tests), and the body isn't consumed in the same tick as the Response is returned. This ensures we don't have any dangling undici bodies, which may cause hangs/socket failures.

Copy link
Contributor

@RamIdeas RamIdeas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍 just a few minor comments. Great work!

@mrbbot mrbbot force-pushed the bcoll/tre-magic-proxy branch 2 times, most recently from 681ff91 to da9e434 Compare August 14, 2023 12:58
@mrbbot
Copy link
Contributor Author

mrbbot commented Aug 14, 2023

(last force push is a rebase, one before is applying suggestions)

This provides the foundations for adding back `Miniflare#get*()`
methods. It implements a proxy system, that makes HTTP requests to
`workerd` on property access/function calls. Synchronous access/calls
are supported too.
Adds back support for `Miniflare#getBindings()` from Miniflare 2.
This function returns a copy of the `env` object passed to module
workers in Node.js. This could be used with framework dev servers to
populate their "platform" objects.
Adds back support for `Miniflare#getCaches()`, `getD1Database()`,
`getDurableObjectNamespace()`, `getKVNamespace()`,
`getQueueProducer()`, and `getR2Bucket()` from Miniflare 2. Tests
will follow in future commits.
Cache tests previously used custom HTTP endpoints to put/match/delete
cached data. With the new `Miniflare#getCaches()` function, we can
simplify the tests to use the exact same Cache API as workers.
D1 tests previously defined a `TestD1Database` stub that made calls
to custom HTTP endpoints to perform operations. With the new
`Miniflare#getD1Database()` function, we can simplify the unshimmed
test using the wrapped binding. Note we still need to use
`TestD1Database` for the shimmed tests, as the proxy will only give
us a `Fetcher` binding.
Tests the synchronous Durable Object ID creation methods, and
performing a WebSocket upgrade on a proxied Durable Object stub.
R2 tests previously defined a bunch of `Test*` stubs that made
requests to custom HTTP endpoints and used custom JSON replacers/
revivers to encode R2 objects. With the new `Miniflare#getR2Bucket()`
function, we can delete a lot of this code, and just use the proxy.
Adds a few more tests for proxy edge cases, like WebSocket upgrades
with service bindings and calling with multiple stream/blob
arguments.
...to try and improve test stability. This isn't ideal, but hopefully
it should fix some of the flakiness we're seeing. We can investigate
this further later.
This change throws an uncaught exception if we forget to consume a
response body in a test. Unconsumed bodies may cause `undici` to hang
or throw socket errors.
Declares a `ReplaceWorkersTypes` type instead of manually declaring
types for each of the Workers APIs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants