
Proposal: an interface for modeling long-running processes for HTTP serving #95

Open · Mossaka opened this issue Jan 5, 2024 · 20 comments

@Mossaka (Collaborator) commented Jan 5, 2024

Hi there,

I'd like to bring attention to an area of ambiguity in the wasi:http specification concerning runtime behavior for incoming requests. Currently, implementations like wasmtime serve reinitialize the wasm Store for each invocation of the incoming-handler, effectively treating wasi:http as a stateless, serverless framework akin to Lambda or Azure Functions. This approach leverages the benefits of small wasm module sizes and quick startup times. However, the specification does not explicitly address an alternative scenario where the wasm module acts as a long-running process, maintaining multiple sockets in memory. This approach offers its own set of tradeoffs, such as:

  • amortizing costs over frequent external network calls, similar to a database connection pool
  • sharing state across requests
  • more efficient use of system resources by avoiding frequently spinning instances up and tearing them down
  • monitoring

An example of this implementation can be seen in @brendanburns's work with wasi-go.

The lack of explicit guidance in the spec could lead to divergent runtime assumptions and decisions, potentially confusing developers. To illustrate, consider this HTTP handler code in Go:

package main

import (
    "fmt"
    "net/http"

    "github.com/dev-wasm/dev-wasm-go/http/server/handler"
)

// count lives in the instance's linear memory; whether it persists
// across requests depends on whether the host reuses the instance.
var count = 0

func init() {
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        count++
        w.WriteHeader(200)
        w.Write([]byte(fmt.Sprintf("Hello from WASM! (%d)", count)))
    })
    handler.ListenAndServe(nil)
}

func main() {}

In the wasmtime serve implementation, the count variable always prints 1 due to the creation of a new wasm instance for each request, preventing the sharing of local variables across requests. This behavior may differ in other implementations, as the spec does not explicitly define these semantics.

I propose that we consider the potential benefits of a wasi:http world that models long-running, container-like wasm modules. If this seems valuable, I would suggest introducing wasi:http/stateful-proxy to represent this concept. Correspondingly, to maintain clarity, wasi:http/proxy could be renamed to wasi:http/serverless-proxy.

@lukewagner (Member) commented Jan 8, 2024

The current wording does indeed allow a host to reuse a single instance to handle more than 1 HTTP request (here). (With Preview 3 and native async support, there could even be multiple requests handled concurrently by the same instance.) Thus, it would be fine for wasmtime's serve command to reuse instances or gain a command-line arg to do that.

In general, this gives hosts a fair amount of freedom to maintain a reused pool of instances of any size (matching the usual execution model of auto-scaled workloads).

@Mossaka (Collaborator, Author) commented Jan 8, 2024

Thank you for the clarification, @lukewagner. The flexibility in the current specification regarding the reuse of component instances is indeed valuable, and it addresses my original question about the specification of the host's expectations. However, I'd like to emphasize a crucial aspect from the developer's perspective that seems to be overlooked.

As demonstrated in my initial code snippet, the ambiguity surrounding the reuse of local variables (like a counter) can lead to significant confusion and frustration for developers. In the current setup, it's unclear whether a local variable will persist across multiple HTTP requests during development. This uncertainty can lead to unexpected behaviors, especially after N calls, which could impact the reliability and predictability of the application.

What I am wishing for is an interface that addresses this issue directly, an interface that would offer developers a guaranteed long-running process (container-like) environment for their component instances, which would open up possibilities for developing local state-dependent applications. This would give developers more predictable and familiar development semantics. The wasi:http/stateful-proxy world is my idea for addressing it, but I am open to other suggestions.

@Mossaka changed the title from "Spec for host's expectation of long-runing vs. serverless wasm instances" to "Proposal: an interface for modeling long-running processes for HTTP serving" on Jan 8, 2024
@acfoltzer (Contributor) commented Jan 10, 2024

@Mossaka do you imagine there being a difference in the contents of stateful-proxy vs serverless-proxy, or would it just be named differently in order to set expectations for the user?

One thing I'd point out is that even if the intent of an embedder is to provide a long-lived instance for handling multiple requests, the guest code still needs to be able to handle starting over from scratch when the embedder spins up a new instance. How often that occurs is really what I think the root of your question is about, and I'm not sure that we can or would want to express that in WIT.

@Mossaka (Collaborator, Author) commented Jan 10, 2024

do you imagine there being a difference to the contents of stateful-proxy vs serverless-proxy

A tricky question is what granularity do we want for scenarios like this. If we want a full long-running process, one could argue that what we are looking for is wasi:cli/imports which has sockets APIs to build a connection pool + wasi:http/incoming-handler + wasi:http/outgoing-handler.

How often that occurs is really what I think the root of your question is about

Agreed

@lukewagner (Member):

In the current setup, it's unclear whether a local variable will persist across multiple HTTP requests during development.

Just as a nit, I'd suggest that it is clear: it's clear that the developer must not depend on the same global state being reused across requests. It might be reused, but you must assume it isn't always.

What I am wishing for is an interface that addresses this issue directly, an interface that would offer developers a guaranteed long-running process (container-like) environment for their component instances, which would open up possibilities for developing local state-dependent applications.

This is where things get a bit confusing for me because my understanding is that the usual way containers are deployed in a mainstream orchestrator like Kubernetes or Nomad is that the containers are auto-scaled up and down and thus, if you are implementing an HTTP proxy-like container, you also must not rely on seeing the same global state. Instead, I think the common practice is to put global state in some sort of durable datastore. This has the added benefit of ensuring that the state survives a crash (hard or soft) which is probably something you want anyways if you're caring about it being global/shared. Thus, my proposal would be that, when you are implementing a service and global state matters, use wasi:keyvalue ;-)

That being said, I know there are features (like StatefulSets in Kubernetes) that allow you to have a singleton container instance and, iiuc, this is what you'd need to use when running a SQL database as a container. However, I think this is a concept that only starts to make sense at the higher level of a cluster. In particular, if we did try to specify a stateful-proxy world in WASI, the natural and difficult-to-answer question is: what is the "scope" of this stateful proxy? This is a really important question b/c inevitably the scope isn't "the Earth", and thus there will in fact be multiple instances (in CI/CD, in different Availability Zones, in Canary or Blue/Green deploys, etc). K8s lets you say "the cluster is the scope", but that's just one answer and not one that is so general that we can push it down into WASI. But that's based on my admittedly-limited understanding of this space, so I'm happy to learn and discuss more.
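The suggestion of keeping shared state in a key-value store rather than in instance globals can be sketched as follows. This is a plain-Go model, not the real wasi:keyvalue API: the Store interface and memStore type are hypothetical stand-ins for a host-provided bucket (local or distributed), and the point is only that the count survives instance teardown because it lives outside the instance.

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a stand-in for a wasi:keyvalue-style bucket; the real
// interface and its generated bindings differ. This only models the idea.
type Store interface {
	Get(key string) (int, bool)
	Set(key string, value int)
}

// memStore is an in-memory Store; in a real deployment this would be a
// local or distributed KV service provided by the host.
type memStore struct {
	mu sync.Mutex
	m  map[string]int
}

func newMemStore() *memStore { return &memStore{m: map[string]int{}} }

func (s *memStore) Get(key string) (int, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	v, ok := s.m[key]
	return v, ok
}

func (s *memStore) Set(key string, value int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[key] = value
}

// handle increments a counter in the store rather than in a global
// variable, so the count is shared across instances and survives
// instance teardown.
func handle(kv Store, key string) string {
	n, _ := kv.Get(key)
	n++
	kv.Set(key, n)
	return fmt.Sprintf("Hello from WASM! (%d)", n)
}

func main() {
	kv := newMemStore()
	// Unlike the global-variable version, two requests (even ones served
	// by different instances) see a monotonically increasing count.
	fmt.Println(handle(kv, "count"))
	fmt.Println(handle(kv, "count"))
}
```

With this shape, the same guest code works whether the host tears instances down per request or keeps them warm, which is exactly the portability property being discussed.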

@brendandburns (Contributor):

My main concern here is around portability of wasm bundles for different implementations.

As you say, it is clear that a developer must not depend on global state being reused for correct behavior. However, there are a lot of runtime behaviors which are not about correctness, but rather about performance and surprises.

If I implement wasi-http in a runtime where I keep a warm pool of modules where new wasm runtimes are only created when load increases and only discarded when load decreases, then any initialization that the wasm module does (e.g. loading files into cache, connecting to a database, whatever) gets amortized across a large number of requests. If a developer builds a performant application for my runtime, they will come to depend on this amortization to meet their performance needs. When they then move their application to a different runtime (e.g. the current wasmtime runtime) the performance will suffer greatly because the initialization will be performed for every request.

In this case, neither implementation is wrong, and both run "correctly" but the user is quite surprised that their notion of portability of wasm/wasi is violated.

Similar things can happen if I use a library that maintains histograms of request latency (e.g. prometheus metrics). If I run that code in a wasi-http implementation which generally tries to re-use wasm runtimes, I will get reasonable histograms across many requests. If I run the same module in wasmtime today I will only ever see histograms of size 1.

The absence of clarity in the spec will lead to differences in runtime implementations, and those differences will cause pain for developers who are looking to wasm/wasi for portability (which is the main point imho)
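The amortization hazard described above can be made concrete with a small cost model. This is an illustrative simulation, not real host code: NewInstance stands in for instantiating the component (including its expensive setup, e.g. dialing a database), and the two loops compare an instance-reusing host with an instance-per-request host serving the same guest.

```go
package main

import "fmt"

// initCalls counts how many times expensive initialization ran; it is a
// stand-in for real costs such as opening a database connection or
// warming a file cache.
var initCalls = 0

// Instance models one wasm instance of the HTTP component.
type Instance struct{ ready bool }

// NewInstance performs the expensive per-instance setup.
func NewInstance() *Instance {
	initCalls++ // e.g. dial the database, load config into memory
	return &Instance{ready: true}
}

// Handle serves one request using the already-initialized state.
func (i *Instance) Handle() string { return "ok" }

func main() {
	// Instance-reusing host: one instance serves many requests, so the
	// setup cost is paid once and amortized.
	initCalls = 0
	inst := NewInstance()
	for r := 0; r < 100; r++ {
		inst.Handle()
	}
	fmt.Println("reused instance, init calls:", initCalls) // 1

	// Instance-per-request host: the same guest pays the setup cost on
	// every single request.
	initCalls = 0
	for r := 0; r < 100; r++ {
		NewInstance().Handle()
	}
	fmt.Println("fresh instance per request, init calls:", initCalls) // 100
}
```

Both hosts are "correct", but an application tuned for the first model sees a 100x difference in initialization work on the second, which is the portability surprise at issue.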

@lukewagner (Member):

That's a great point! Switching perspectives from semantic guarantees to cost-model expectations, I agree that the current situation could naturally lead to an implicit dependency on instance reuse that would meaningfully break portability in practice. So yes, I'm interested to solve this problem.

As a bit of background, WIT world names don't appear anywhere in compiled components, only the names of individually-imported or exported interfaces (this is by design and is what makes worlds set-like, union-able and duck-typed), thus if we want to allow a component to statically declare its expectation (of best-effort reuse) to the host, we need to signal this fact by an imported and/or exported interface name.

My first idea for how this might look in a Preview 2.x timeframe is that we could define a new wasi:http/slow-init interface (containing a single init function) that is exported (alongside wasi:http/incoming-handler) by a new wasi:http/slow-init-proxy world (that includes wasi:http/proxy). And then the idea is that when you export wasi:http/slow-init, you do all your expensive setup in the init function and then you assume that your component instance is reused (via repeated calls to incoming-handler.handle) as much as possible. And then a fancy implementation can optimize cold-start latency by knowing to call the init function eagerly (before a request arrives) or maintain a pool of init-already-called instances, etc.

How does that sound to folks?
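As a rough sketch of what this split might look like from the guest's point of view, in plain Go: Init and Handle below are hypothetical stand-ins for the proposed slow-init export and incoming-handler.handle, and the trap-on-missing-init behavior is one possible reading of the contract, not settled spec text.

```go
package main

import (
	"errors"
	"fmt"
)

// Guest models a component exporting both a hypothetical slow-init
// function and the regular incoming-handler.
type Guest struct {
	pool []string // stands in for expensive state, e.g. a connection pool
}

// Init does all expensive setup once; a fancy host may call it eagerly,
// before any request arrives, or keep a pool of init-already-called
// instances.
func (g *Guest) Init() {
	g.pool = []string{"conn-1", "conn-2"} // e.g. dial the database
}

// Handle serves one request assuming Init has already run; calling it
// first is treated as a contract violation (a trap, modeled as an error).
func (g *Guest) Handle() (string, error) {
	if g.pool == nil {
		return "", errors.New("trap: handle called before init")
	}
	return fmt.Sprintf("served via %s", g.pool[0]), nil
}

func main() {
	g := &Guest{}
	g.Init() // host calls init exactly once per instance...
	for r := 0; r < 3; r++ {
		resp, _ := g.Handle() // ...then reuses it across requests
		fmt.Println(resp)
	}
}
```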

@PiotrSikora (Collaborator):

stateful-proxy, as suggested here, would require all traffic to be handled by a single process that runs forever, which isn't scalable and doesn't work at all outside of simple proof of concepts and/or toy projects.

It also leaks details about the deployment model and runtime environment, which works against the portability and host abstraction advantages of Wasm and WASI.

As @lukewagner already mentioned, the proper way to persist state across requests and/or instances is via key-value store (this is how this is addressed in various Proxy-Wasm implementations).

This way, the same code can be deployed in distinct environments (e.g. in-process with local KV or serverless with distributed KV) without any changes.

My first idea for how this might look in a Preview 2.x timeframe is that we could define a new wasi:http/slow-init interface (containing a single init function) that is exported (alongside wasi:http/incoming-handler) by a new wasi:http/slow-init-proxy world (that includes wasi:http/proxy). And then the idea is that when you export wasi:http/slow-init, you do all your expensive setup in the init function and then you assume that your component instance is reused (via repeated calls to incoming-handler.handle) as much as possible. And then a fancy implementation can optimize cold-start latency by knowing to call the init function eagerly (before a request arrives) or maintain a pool of init-already-called instances, etc.

This makes sense (we use proxy_on_plugin_configuration, which executes at init time in Proxy-Wasm, for this purpose).

However, there is nothing HTTP-specific about an init function, so IMHO this shouldn't be part of wasi-http, but something that could be reused in other WASI proposals.

@lukewagner (Member):

That's a good point regarding wasi:http/init not being HTTP-specific, and I had wondered the same; I just wasn't sure what to name it, but yeah, makes sense.

@brendandburns (Contributor):

I think that having an explicit init is a good step towards clarifying the lifecycle. However I do also want to make it clear that just having the init doesn't provide any guidelines for the runtime behavior.

e.g. wasmtime serve could still call init and handle and then discard the whole module without reuse, while some other implementation could call init once per 100 calls to handle; that's going to be very different from a performance perspective.

I think that we want to give some guidelines around expected lifecycle of the implementation or else different implementations of wasi-http are going to have vastly different performance characteristics.

@PiotrSikora (Collaborator):

I think that we want to give some guidelines around expected lifecycle of the implementation or else different implementations of wasi-http are going to have vastly different performance characteristics.

Why would you limit wasi-http like that? WASI should describe interfaces, not implementations.

Also, different implementations will have vastly different performance characteristics (sometimes orders of magnitude!) depending on the deployment model and/or environment anyway (e.g. in-process vs sandboxed process vs serverless).

@lukewagner (Member):

My thinking was that the spec text for the init function would non-normatively suggest that implementations SHOULD attempt to reuse instances where possible when the init function is exported (while still allowing for shutdown of cold instances or init of multiple instances under load). Thus, the default behavior of wasmtime serve, when given a component that exports init, would be to call init only once and repeatedly call handle on the same instance.

@ydnar commented Jan 16, 2024

Is it safe to assume that _initialize will be called once prior to any wasi:http reactor functions?

@lukewagner (Member):

Yes. It wouldn't be hard-enforced by the underlying component model machinery, but I think it would be part of the specified contract of the WASI interface (saying that if the caller did in fact call any other export before init, the callee can trap).

@ydnar commented Jan 16, 2024

OK, great. For interpreted languages, or languages with a runtime (e.g. Go), it’d be nice to have an explicit contract that the host calls _initialize before any further calls (e.g. wasi:cli/command#run) or wasi:http proxy handlers. That way the runtime can be initialized, the program can be compiled, and any state separate from requests can be set up.

Seems like reuse of an instance is orthogonal to this?

@Mossaka (Collaborator, Author) commented Jan 17, 2024

+1 to the init function which provides more clarity on modeling long-running wasm processes.

@lukewagner (Member):

If we merge component-model/#297 (which seems likely), then a component's built-in start function can serve as the init function, being able to call arbitrary imports to set up the instance state.

Returning to the question of when wasmtime serve should reuse instances: after some more thought on a plane today, it seems like serve should reuse instances by default (with command-line arguments to dial reuse up or down). The reason is that this mirrors the fact that component instances are by default reused by client or parent components, treating wasmtime serve as a parent component (which in theory it could be implemented as). In the future, once we add a built-in for runtime instantiation, components that wish to enforce an instance-per-export-call isolation invariant would be able to implement this as an internal implementation detail of the component, which is altogether better since it semantically forces the non-reuse, establishing a clear, portable cost model.

If we made both these changes, then I think there should be no need for any new WASI interface.

@brendandburns (Contributor):

I agree that the combination of a start function in the component model and favoring reuse means that changes to the wasi-http spec are unnecessary.

However, we should definitely document this somewhere in the spec so that implementors know what is expected.

@lukewagner (Member):

Sorry for the long silence; I was background-pondering this and asking folks if we should indeed simply just change the wasmtime serve default and processing their feedback. Based on all that, a newer, more-nuanced idea is posted in the Component Model repo here for feedback. My hope is that it addresses the root question here.

@ydnar commented Nov 3, 2024

Memorializing a conversation at the Plumber's Summit:

A wasi-http component can indicate it shouldn't be reused by exiting, e.g. something like wasi:cli/exit.
