
Conversation

@parzivale (Contributor) commented Sep 28, 2025

The main objective behind this PR is to simplify working with streaming responses from text generation models; however, the design should be flexible enough for other applications.

For now, Model and StreamableModel are left to the user to implement for their use case, but the ideal end goal for this PR is to define types for all the I/O interfaces workers-ai uses and then seal Model and StreamableModel.
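
For illustration, a user-side implementation is expected to look roughly like this (the trait items shown here are an assumption, not the final API):

use serde::{Deserialize, Serialize};

#[derive(Serialize)]
struct MyInput {
    prompt: String,
}

#[derive(Deserialize)]
struct MyOutput {
    response: Option<String>,
}

struct MyTextModel;

// Assumed trait shape: a model identifier plus serde-able input/output types.
impl Model for MyTextModel {
    const MODEL_NAME: &'static str = "@cf/meta/llama-4-scout-17b-16e-instruct";
    type Input = MyInput;
    type Output = MyOutput;
}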

@guybedford (Collaborator)

Very interesting approach. It would be great to see some tests for the sorts of workflows you're seeking to enable here, which would also help with review.

@parzivale (Contributor Author)

Sure! Should I create a new entry in the test folder for this, or would there be a better place to put the tests?

@guybedford (Collaborator)

Yes, in the tests folder. We have one big test app that is built all together and tested as one.

@parzivale (Contributor Author)

There isn't currently an AI binding in the tests' wrangler.toml. I can test locally on my account via a new binding, but won't that break CI?

@guybedford (Collaborator)

You should just be able to add a new binding there; if there are issues with that, I'm happy to look into it further.

@parzivale (Contributor Author)

@guybedford I've added new test cases to show a very simple example implementation.

@guybedford (Collaborator) left a comment

Amazing work, this looks incredible. Thank you for working on this. Just a couple of points about standardizing the input and output options; I know this gets into the deeper bindings questions, but it would be good to have a plan at least.

Comment on lines +44 to +46
.run::<Llama4Scout17b16eInstruct>(DefaultTextGenerationInput {
prompt: "What is the answer to life the universe and everything?".to_owned(),
})
@guybedford (Collaborator)

Instead of DefaultTextGenerationInput, could we implement AiTextGenerationInput directly in core?

Along the lines of:

pub struct AiTextGenerationInput {
    pub prompt: Option<String>,
    pub raw: Option<bool>,
    pub stream: Option<bool>,
    pub max_tokens: Option<u32>,
    pub temperature: Option<f32>,
    pub top_p: Option<f32>,
    pub top_k: Option<u32>,
    pub seed: Option<u32>,
    pub repetition_penalty: Option<f32>,
    pub frequency_penalty: Option<f32>,
    pub presence_penalty: Option<f32>,
    pub messages: Option<Vec<RoleScopedChatInput>>,
    pub response_format: Option<AiTextGenerationResponseFormat>,
    pub tools: Option<serde_json::Value>, // For flexible union type
    pub functions: Option<Vec<AiTextGenerationFunctionsInput>>,
}

We don't even need all fields initially.

And similarly for https://workers-types.pages.dev/#AiTextGenerationOutput?
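
For reference, the linked output type would map to something roughly like this on the Rust side (field set recalled from the workers-types docs, so worth double-checking, and the streaming variant is omitted):

#[derive(serde::Deserialize)]
pub struct AiTextGenerationOutput {
    pub response: Option<String>,
    pub tool_calls: Option<Vec<AiTextGenerationToolCall>>,
}

#[derive(serde::Deserialize)]
pub struct AiTextGenerationToolCall {
    pub name: String,
    pub arguments: serde_json::Value,
}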

@parzivale (Contributor Author)

Sure, that works! I was trying to match the type definitions emitted by wrangler types, but this does seem more legible. Also, would it make sense to mark the fields as private and use a builder pattern to construct the input?
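
For example, construction could then look something like this (builder and method names are illustrative only, not something already in the PR):

let input = AiTextGenerationInput::builder()
    .prompt("What is the answer to life the universe and everything?")
    .max_tokens(256)
    .temperature(0.7)
    .build();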

@parzivale (Contributor Author) commented Oct 6, 2025

Also, for tools I have a separate PR I would like to make, based on this one, which handles tools as a new sub-trait of Model. The gist of it is that you define tools like the following:

use std::{future::Future, pin::Pin, rc::Rc};
use worker::{Env, Error};

type ToolRun = Rc<
    dyn Fn(
        &Env,
        Option<serde_json::Map<String, serde_json::Value>>,
    ) -> Pin<Box<dyn Future<Output = Result<String, Error>>>>,
>;

#[derive(Clone)]
pub struct Tool {
    pub name: &'static str,
    pub description: &'static str,
    pub run: ToolRun,
}

and then you would call them from Ai with the following method:

impl Ai {
    pub async fn run_with_tools<M: ToolsModel>(&self, input: &M::Input, tools: &[Tool]) -> Result<M::Output, Error> {
        // ...
    }
}

This would require a new trait called ToolsModel, like this:

pub trait ToolsModel: Model {}

The problem is I'm not sure this approach is generic enough to work for everyone. For instance, I know that when working with axum you are normally expected to wrap Env in an Arc, which means you no longer have access to this functionality.
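
For illustration, a hypothetical tool written against the ToolRun alias above might look like this (the tool itself is made up, and the call at the end assumes an Ai binding is already in hand):

use std::rc::Rc;

fn echo_tool() -> Tool {
    // Annotate as ToolRun so the closure's boxed future coerces to the trait object.
    let run: ToolRun = Rc::new(|_env, args| {
        Box::pin(async move {
            // Pull the `text` argument out of the model-supplied arguments, if any.
            let text = args
                .and_then(|m| m.get("text").and_then(|v| v.as_str().map(str::to_owned)))
                .unwrap_or_default();
            Ok::<_, Error>(text)
        })
    });

    Tool {
        name: "echo",
        description: "Echoes the `text` argument back to the model",
        run,
    }
}

// Hypothetical call site, given an `ai: Ai` binding:
// let output = ai.run_with_tools::<Llama4Scout17b16eInstruct>(&input, &[echo_tool()]).await?;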

@guybedford (Collaborator)

As much as possible it would be good to follow wasm-bindgen semantics. I know we wrap all types currently, but I'm trying to move to a model where, when possible, we use worker-sys types as the high-level types.

So the wasm-bindgen pattern would be to define an imported type, and define all properties as getters and setters:

#[wasm_bindgen]
extern "C" {
    #[wasm_bindgen(extends = ::js_sys::Object, js_name = AiTextGenerationInput)]
    #[derive(Debug, Clone, PartialEq, Eq)]
    #[doc = "Ai Text Generation"]
    pub type AiTextGenerationInput;

    #[doc = "Prompt for generation"]
    #[wasm_bindgen(method, setter = "prompt")]
    pub fn set_prompt(this: &AiTextGenerationInput, prompt: Option<String>);

    #[wasm_bindgen(method, getter = "prompt")]
    pub fn get_prompt(this: &AiTextGenerationInput) -> Option<String>;
}

Then functions would just take &worker_sys::AiTextGenerationInput directly, and we'd likely alias it as worker::ai::AiTextGenerationInput.
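
A call site under that convention might then look something like this (constructing the structural type via js_sys::Object plus an unchecked cast is an assumption here):

use wasm_bindgen::JsCast;

// The JS type is structural, so start from a plain object, cast it to the
// imported type, and then use the generated setters/getters.
let input: AiTextGenerationInput = js_sys::Object::new().unchecked_into();
input.set_prompt(Some("What is the answer to life the universe and everything?".to_owned()));
let prompt = input.get_prompt(); // -> Option<String>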

@parzivale (Contributor Author) commented Oct 6, 2025

Right, gotcha! I didn't realise you were referring to types in JS. Thanks for clarifying.

@parzivale (Contributor Author)

I just looked into AiTextGeneration in the docs you provided, and it seems AiTextGeneration is a TypeScript type, not a class, so there is no imported type that we can bind to in the JS namespace.

@guybedford (Collaborator)

That should be fine - it's a structural type not a formal import.

@parzivale (Contributor Author)

Ah, sorry, I only just saw your comment. I made a commit where I kept the AiTextGenerationInput struct in worker rather than binding to JS; if you don't like the API, I can change it.

@guybedford (Collaborator)

I'm strongly trying to formalize our conventions, so let me think about this some more and follow up soon. My initial sense is that serde perf is equivalent, so we should prefer the direct interpretation convention, but I want to ensure I've got that right.


use crate::SomeSharedData;

pub struct Llama4Scout17b16eInstruct;
@guybedford (Collaborator)

Agreed, this seems like the right approach!
