Add a passive stream read handler for input-stream #31

Open
@autodidaddict

The current methods for accessing a stream line up nicely with our current mental models for how streams work in "regular" programming environments and languages like Rust, Java, etc. You can execute a read method that will pull some number of bytes off of the stream. Executing read in a tight loop will churn through the entire contents of the stream.

The issue that comes up in wasm components that doesn't arise in "normal" code is when we attempt to do either of the following:

  • Process a single chunk of bytes bigger than what the host runtime sandbox allows us to allocate.
  • Perform processing that takes long enough to exceed a per-call timeout enforced by the host.

I was looking into how to support these types of behaviors for the blob store specification and noticed that its current WIT files use a wasi-io input-stream. The difficulty is in processing very large files, which are fairly common when blob stores are being used.

I propose that we add a passive handler. Rather than reading an explicit number of bytes with a synchronous call like read, the component should be able to start background streaming of the blob. This would let the host determine how much of the blob is delivered to the component and how fast.

We could add a function to begin chunking to the io interface:

start-chunking-read: func(this: input-stream, chunk-size: u64, correlation-id: u64) -> result<_, stream-error>

When there is no error, this call returns successfully and the component then finishes its current execution context. At some later time determined by the host (by virtue of the provider of the blob store capability), the host starts making repeated calls to send individual chunks to the component. This would work essentially like the inverse of subscribe-to-input-stream, using the following interface:

interface chunked-stream-handler {
    handle-chunk: func(
        stream: input-stream,
        correlation-id: u64,
        current-chunk: u64,
        total-chunks: u64,
        data: list<u8>
    ) -> bool
}

The component can return true to let the host know to continue delivering chunks or false to cancel the chunking stream.
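
To make the continue/cancel contract described above concrete, here is a minimal Rust sketch of how a host might drive such a handler. Everything in it is an assumption for illustration: the trait `ChunkedStreamHandler`, the function `deliver_chunks`, and the `ByteCounter` component are hypothetical names, not generated bindings or part of wasi-io. The sketch also leaves out the `stream` parameter and assumes `current-chunk` is zero-based, neither of which the proposal specifies.

```rust
// Hypothetical Rust model of the proposed chunked-stream-handler interface.
// These names are illustrative only; they are not real wasi-io bindings.
trait ChunkedStreamHandler {
    /// Return true to keep receiving chunks, false to cancel the stream.
    fn handle_chunk(
        &mut self,
        correlation_id: u64,
        current_chunk: u64, // assumed zero-based
        total_chunks: u64,
        data: &[u8],
    ) -> bool;
}

/// Host-side delivery loop (an assumption about how a host could behave):
/// splits `blob` into chunks of at most `chunk_size` bytes and stops as soon
/// as the handler returns false. Returns the number of chunks delivered.
fn deliver_chunks<H: ChunkedStreamHandler>(
    handler: &mut H,
    blob: &[u8],
    chunk_size: usize,
    correlation_id: u64,
) -> u64 {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let total_chunks = blob.chunks(chunk_size).len() as u64;
    let mut delivered = 0;
    for (index, chunk) in blob.chunks(chunk_size).enumerate() {
        delivered += 1;
        if !handler.handle_chunk(correlation_id, index as u64, total_chunks, chunk) {
            break; // the component asked to cancel the chunking stream
        }
    }
    delivered
}

/// Example component: counts bytes and cancels once `limit` bytes were seen.
struct ByteCounter {
    seen: usize,
    limit: usize,
}

impl ChunkedStreamHandler for ByteCounter {
    fn handle_chunk(
        &mut self,
        _correlation_id: u64,
        _current_chunk: u64,
        _total_chunks: u64,
        data: &[u8],
    ) -> bool {
        self.seen += data.len();
        self.seen < self.limit
    }
}

fn main() {
    let blob = vec![0u8; 10];
    let mut counter = ByteCounter { seen: 0, limit: 5 };
    let delivered = deliver_chunks(&mut counter, &blob, 4, 7);
    println!("delivered {} chunk(s), saw {} byte(s)", delivered, counter.seen);
}
```

In this model the host, not the component, decides the pacing: each `handle_chunk` call is a separate, short invocation, so no single call needs to allocate or process more than one chunk.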

⚠️ disclaimer: if this type of functionality is already available in the spec and I've read things wrong, feel free to dismiss this issue.
