feat: create trait definitions for model and streamable model #833
Conversation
Very interesting approach. It would be great to see some tests for the sorts of workflows you're seeking to enable here, and that would also help with review.
Sure! Should I create a new entry in the tests folder for this? Would there be a better place to put the tests?
Yes, in the tests folder; we have one big test app that is built altogether and tested as one.
There isn't currently an AI binding in the tests wrangler.toml. I can test locally on my account via a new binding, but this will break CI, right?
You should just be able to add a new binding there; if there are issues with that, happy to look into it further.
@guybedford I've added new test cases to show a very simple example implementation.
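For reference, adding a Workers AI binding to a wrangler.toml is a small change along these lines (the binding name "AI" here is just the conventional choice):

```toml
# Workers AI binding; exposed to the worker as env.AI
[ai]
binding = "AI"
```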
Amazing work, this looks incredible. Thank you for working on this. Just a couple of points about standardizing the input and output options - I know this gets into the deeper bindings questions, but would be good to have a plan at least.
.run::<Llama4Scout17b16eInstruct>(DefaultTextGenerationInput {
    prompt: "What is the answer to life the universe and everything?".to_owned(),
})
Instead of DefaultTextGenerationInput, could we implement AiTextGenerationInput directly in core? Along the lines of:
pub struct AiTextGenerationInput {
pub prompt: Option<String>,
pub raw: Option<bool>,
pub stream: Option<bool>,
pub max_tokens: Option<u32>,
pub temperature: Option<f32>,
pub top_p: Option<f32>,
pub top_k: Option<u32>,
pub seed: Option<u32>,
pub repetition_penalty: Option<f32>,
pub frequency_penalty: Option<f32>,
pub presence_penalty: Option<f32>,
pub messages: Option<Vec<RoleScopedChatInput>>,
pub response_format: Option<AiTextGenerationResponseFormat>,
pub tools: Option<serde_json::Value>, // For flexible union type
pub functions: Option<Vec<AiTextGenerationFunctionsInput>>,
}
We don't even need all fields initially.
And similarly for https://workers-types.pages.dev/#AiTextGenerationOutput?
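A hedged sketch of what that output mirror could look like; the field names here are assumptions based on the TypeScript AiTextGenerationOutput definition, and the streaming variant (a ReadableStream) would be handled separately:

```rust
// Sketch only: a Rust mirror of the workers-types AiTextGenerationOutput shape.
// Field names follow the TypeScript type; `arguments` would be a
// serde_json::Value in a real implementation.

#[derive(Debug, Default)]
pub struct AiTextGenerationToolCall {
    pub name: String,
    // JSON-encoded tool arguments, simplified to a String here.
    pub arguments: String,
}

#[derive(Debug, Default)]
pub struct AiTextGenerationOutput {
    pub response: Option<String>,
    pub tool_calls: Option<Vec<AiTextGenerationToolCall>>,
}

fn main() {
    let out = AiTextGenerationOutput {
        response: Some("42".to_owned()),
        ..Default::default()
    };
    assert_eq!(out.response.as_deref(), Some("42"));
    assert!(out.tool_calls.is_none());
    println!("ok");
}
```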
Sure! That works. I was trying to match the type definitions emitted by wrangler types, but that does seem more legible. Also, would it make sense to mark the fields as private and use a builder pattern to construct the input?
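A minimal self-contained sketch of the builder pattern being floated here, assuming private fields on the input struct; only a couple of the proposed fields are shown:

```rust
// Sketch: builder over a trimmed-down AiTextGenerationInput with private fields.
#[derive(Debug, Default)]
pub struct AiTextGenerationInput {
    prompt: Option<String>,
    max_tokens: Option<u32>,
}

impl AiTextGenerationInput {
    pub fn builder() -> AiTextGenerationInputBuilder {
        AiTextGenerationInputBuilder::default()
    }
}

#[derive(Default)]
pub struct AiTextGenerationInputBuilder {
    inner: AiTextGenerationInput,
}

impl AiTextGenerationInputBuilder {
    pub fn prompt(mut self, prompt: impl Into<String>) -> Self {
        self.inner.prompt = Some(prompt.into());
        self
    }
    pub fn max_tokens(mut self, max_tokens: u32) -> Self {
        self.inner.max_tokens = Some(max_tokens);
        self
    }
    pub fn build(self) -> AiTextGenerationInput {
        self.inner
    }
}

fn main() {
    let input = AiTextGenerationInput::builder()
        .prompt("What is the answer to life the universe and everything?")
        .max_tokens(256)
        .build();
    assert!(input.prompt.is_some());
    assert_eq!(input.max_tokens, Some(256));
    println!("ok");
}
```

Since every field is an Option, a plain struct literal with Default also works; the builder mainly buys field privacy and forward compatibility.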
Also, for tools I have a separate PR I would like to make based on this one, which handles tools as a new sub-trait of Model. The gist of it is you define tools like the following:
type ToolRun = Rc<
dyn Fn(
&Env,
Option<serde_json::Map<String, serde_json::Value>>,
) -> Pin<Box<dyn Future<Output = Result<String, Error>>>>,
>;
#[derive(Clone)]
pub struct Tool {
pub name: &'static str,
pub description: &'static str,
pub run: ToolRun,
}
and then you would call them from Ai with the following method:
impl Ai {
    pub async fn run_with_tools<M: ToolsModel>(&self, input: &M::Input, tools: &[Tool]) -> Result<M::Output, Error> {
...
}
}
This would require a new trait called ToolsModel, like this:
pub trait ToolsModel: Model {}
The problem is I'm not sure this approach is generic enough to work for everyone. For instance, I know that when working with axum you are normally expected to wrap Env in an Arc, which means you no longer have access to this functionality.
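A self-contained usage sketch of the Tool type described above. Env and Error are placeholder stand-ins for worker::Env and worker::Error, and the argument map is simplified to an Option&lt;String&gt; in place of serde_json::Map, so the snippet compiles without the worker crate:

```rust
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;

// Placeholders so this sketch is standalone; the real PR would use worker types.
pub struct Env;
#[derive(Debug)]
pub struct Error;

type ToolFuture = Pin<Box<dyn Future<Output = Result<String, Error>>>>;
type ToolRun = Rc<dyn Fn(&Env, Option<String>) -> ToolFuture>;

#[derive(Clone)]
pub struct Tool {
    pub name: &'static str,
    pub description: &'static str,
    pub run: ToolRun,
}

fn main() {
    let weather = Tool {
        name: "get_weather",
        description: "Returns the weather for a city",
        // The explicit return-type annotation lets Box::pin coerce to the
        // boxed trait-object future.
        run: Rc::new(|_env, _args| -> ToolFuture {
            Box::pin(async { Ok::<String, Error>("sunny".to_owned()) })
        }),
    };
    // The workers runtime would await this future; here we only construct it.
    let _pending = (weather.run)(&Env, None);
    assert_eq!(weather.name, "get_weather");
    println!("ok");
}
```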
As much as possible it would be good to follow wasm-bindgen semantics. I know we wrap all types currently, but I'm trying to move to a model where, when possible, we use worker-sys types as the high-level types.
So the wasm-bindgen pattern would be to define an imported type, and define all properties as getters and setters:
#[wasm_bindgen]
extern "C" {
    #[wasm_bindgen(extends = ::js_sys::Object, js_name = AiTextGenerationInput)]
    #[derive(Debug, Clone, PartialEq, Eq)]
    #[doc = "Ai Text Generation"]
    pub type AiTextGenerationInput;
    #[doc = "Prompt for generation"]
    #[wasm_bindgen(method, setter = "prompt")]
    pub fn set_prompt(this: &AiTextGenerationInput, prompt: Option<String>);
    #[wasm_bindgen(method, getter = "prompt")]
    pub fn get_prompt(this: &AiTextGenerationInput) -> Option<String>;
}
Then functions would just take a &worker_sys::AiTextGenerationInput input directly. And we'd likely alias it as worker::ai::AiTextGenerationInput.
Right, gotcha! I didn't realise you were referring to types in JS. Thanks for clarifying.
I just looked into AiTextGeneration in the docs you provided, and it seems AiTextGeneration is a TypeScript type, not a class, so there is no imported type that we can bind to in the JS namespace.
That should be fine - it's a structural type not a formal import.
Ah, sorry, I only just saw your comment. I made a commit where I kept the AiTextGenerationInput struct in worker rather than binding to JS; if you don't like the API, I can change it.
I'm strongly trying to formalize our conventions, so let me think about this some more and follow-up soon. My initial sense is that serde perf is equivalent so we should prefer the direct interpretation convention, but I want to ensure I've got that right.
use crate::SomeSharedData;

pub struct Llama4Scout17b16eInstruct;
Agreed this seems like the right approach!
The main objective behind this PR is to simplify working with streaming responses from text generation models; however, the design should be flexible enough for other applications.
For now, Model and StreamableModel are left to the user to implement for their use case, but the ideal end goal for this PR is to define types for all the I/O interfaces Workers AI uses and then seal Model and StreamableModel.
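As a rough illustration of the trait pair being discussed, here is a hedged, self-contained sketch; the associated-type names, the NAME constant, and the model identifier string are assumptions for illustration, not the PR's final API:

```rust
// Sketch of a Model / StreamableModel trait pair for Workers AI models.
pub trait Model {
    type Input;
    type Output;
    /// The model identifier passed to the AI binding.
    const NAME: &'static str;
}

pub trait StreamableModel: Model {
    /// The per-chunk type yielded while streaming a response.
    type StreamItem;
}

// Example implementation mirroring the test app's unit struct.
pub struct Llama4Scout17b16eInstruct;

pub struct TextInput {
    pub prompt: String,
}
pub struct TextOutput {
    pub response: Option<String>,
}

impl Model for Llama4Scout17b16eInstruct {
    type Input = TextInput;
    type Output = TextOutput;
    // Assumed Workers AI catalog ID for this model.
    const NAME: &'static str = "@cf/meta/llama-4-scout-17b-16e-instruct";
}

impl StreamableModel for Llama4Scout17b16eInstruct {
    type StreamItem = String;
}

fn main() {
    assert_eq!(
        Llama4Scout17b16eInstruct::NAME,
        "@cf/meta/llama-4-scout-17b-16e-instruct"
    );
    println!("ok");
}
```

Sealing would then amount to a private supertrait in core, so downstream crates can use the model types but not add their own implementations.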