-
Notifications
You must be signed in to change notification settings - Fork 10.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server: Retrieve prompt template in /props #8337
Conversation
This PR adds the following: - Expose the model's Jinja2 prompt template from the model in the /props endpoint. - Change log-level from Error to Warning for warning about template mismatch. The front-end stands a better chance of actually executing the Jinja template format correctly. Server is currently just guessing it. Ideally this should have been inside a JSON block that expose the same key/value pairs as listed during startup in "llm_load_print_meta" function.
I'm not very confident about this change. Even when frontend can get the jinja template, how it will be used? I suppose it requires a jinja parser on frontend? Another way I can think of is that backend can format a chat template with some pre-defined placeholders, like |
Yes. Adding a real Jinja parser in c++ has already been discussed I think. It's not likely to be added. But lots of Jinja exist for Python and JavaScript. The |
* server: Retrieve prompt template in /props This PR adds the following: - Expose the model's Jinja2 prompt template from the model in the /props endpoint. - Change log-level from Error to Warning for warning about template mismatch. The front-end stands a better chance of actually executing the Jinja template format correctly. Server is currently just guessing it. Ideally this should have been inside a JSON block that expose the same key/value pairs as listed during startup in "llm_load_print_meta" function. * Make string buffer dynamic * Add doc and better string handling * Using chat_template naming convention * Use intermediate vector for string assignment
* server: Retrieve prompt template in /props This PR adds the following: - Expose the model's Jinja2 prompt template from the model in the /props endpoint. - Change log-level from Error to Warning for warning about template mismatch. The front-end stands a better chance of actually executing the Jinja template format correctly. Server is currently just guessing it. Ideally this should have been inside a JSON block that expose the same key/value pairs as listed during startup in "llm_load_print_meta" function. * Make string buffer dynamic * Add doc and better string handling * Using chat_template naming convention * Use intermediate vector for string assignment
* server: Retrieve prompt template in /props This PR adds the following: - Expose the model's Jinja2 prompt template from the model in the /props endpoint. - Change log-level from Error to Warning for warning about template mismatch. The front-end stands a better chance of actually executing the Jinja template format correctly. Server is currently just guessing it. Ideally this should have been inside a JSON block that expose the same key/value pairs as listed during startup in "llm_load_print_meta" function. * Make string buffer dynamic * Add doc and better string handling * Using chat_template naming convention * Use intermediate vector for string assignment
* server: Retrieve prompt template in /props This PR adds the following: - Expose the model's Jinja2 prompt template from the model in the /props endpoint. - Change log-level from Error to Warning for warning about template mismatch. The front-end stands a better chance of actually executing the Jinja template format correctly. Server is currently just guessing it. Ideally this should have been inside a JSON block that expose the same key/value pairs as listed during startup in "llm_load_print_meta" function. * Make string buffer dynamic * Add doc and better string handling * Using chat_template naming convention * Use intermediate vector for string assignment
* server: Retrieve prompt template in /props This PR adds the following: - Expose the model's Jinja2 prompt template from the model in the /props endpoint. - Change log-level from Error to Warning for warning about template mismatch. The front-end stands a better chance of actually executing the Jinja template format correctly. Server is currently just guessing it. Ideally this should have been inside a JSON block that expose the same key/value pairs as listed during startup in "llm_load_print_meta" function. * Make string buffer dynamic * Add doc and better string handling * Using chat_template naming convention * Use intermediate vector for string assignment
* server: Retrieve prompt template in /props This PR adds the following: - Expose the model's Jinja2 prompt template from the model in the /props endpoint. - Change log-level from Error to Warning for warning about template mismatch. The front-end stands a better chance of actually executing the Jinja template format correctly. Server is currently just guessing it. Ideally this should have been inside a JSON block that expose the same key/value pairs as listed during startup in "llm_load_print_meta" function. * Make string buffer dynamic * Add doc and better string handling * Using chat_template naming convention * Use intermediate vector for string assignment
This PR adds the following:
The front-end stands a better chance of actually executing the Jinja template format correctly. Server is currently just guessing it.
Ideally this should have been inside a JSON block that expose the same key/value pairs as listed during startup in
llm_load_print_meta
function, allowing front-end to read a plethora of model properties.