-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: implement a templated endpoint for visibility into chat requests #2333
Conversation
router/src/server.rs
Outdated
if response_format.is_some() && tools.is_some() { | ||
return Err(( | ||
StatusCode::UNPROCESSABLE_ENTITY, | ||
Json(ErrorResponse { | ||
error: "Grammar and tools are mutually exclusive".to_string(), | ||
error_type: "validation".to_string(), | ||
}), | ||
)); | ||
} | ||
|
||
let tool_grammar = match ToolGrammar::apply(tools, tool_choice) { | ||
Ok(grammar) => grammar, | ||
Err(err) => { | ||
metrics::counter!("tgi_request_failure", "err" => "validation").increment(1); | ||
tracing::error!("{}", err); | ||
return Err(( | ||
StatusCode::UNPROCESSABLE_ENTITY, | ||
Json(ErrorResponse { | ||
error: err.to_string(), | ||
error_type: err.error_type().to_string(), | ||
}), | ||
)); | ||
} | ||
}; | ||
|
||
let tools_grammar_prompt = tool_grammar.as_ref().map(|t| { | ||
( | ||
GrammarType::Json(serde_json::json!(t)), | ||
tool_prompt.unwrap_or_default(), | ||
) | ||
}); | ||
|
||
let (tools_grammar_prompt, _grammar) = response_format | ||
.map(|rf| (None, Some(rf))) | ||
.unwrap_or_else(|| { | ||
( | ||
tools_grammar_prompt.clone(), | ||
tools_grammar_prompt.map(|(g, _)| g), | ||
) | ||
}); | ||
|
||
let inputs = match infer.apply_chat_template(messages, tools_grammar_prompt) { | ||
Ok(inputs) => inputs, | ||
Err(err) => { | ||
metrics::counter!("tgi_request_failure", "err" => "validation").increment(1); | ||
tracing::error!("{}", err); | ||
return Err(( | ||
StatusCode::UNPROCESSABLE_ENTITY, | ||
Json(ErrorResponse { | ||
error: err.to_string(), | ||
error_type: err.error_type().to_string(), | ||
}), | ||
)); | ||
} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is already used in some validation, we should probably extract it into a function (so this doesn't drift from the original implem later).
Also this code is very hard to read for me (lots of options everywhere). I'm sure we can simplify it a lot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
totally agree! in the latest commit i've moved this code into a function called prepare_chat_input
which is reusable and much easier to read.
router/src/server.rs
Outdated
.zip(encoding.get_offsets()) | ||
.map(|(&id, &(start, stop))| { | ||
let text: String = | ||
String::from_utf8_lossy(&input.as_bytes()[start..stop]).to_string(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, start
and stop
are defined in utf-8 char ranges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh good catch, i've updated this code and the location it was copied from to operate on chars rather than bytes
9121081
to
40658f4
Compare
huggingface#2333) * feat: implement a templated endpoint for visibility into chat requests * feat: improve to tokenize too * fix: adjust return type * feat: simplify prepare_chat_input logic and adjust start stop chars
This PR adds a new
/templated
POST endpoint to allow users to send aChatRequest
and return the templated request. This can help with prompt engineering and for debugging/transparency purposes.