
Server: use llama_chat_apply_template to format the chat #5575

Closed
@ngxson

Description


Depends on #5538 being merged.

In #5425, I mentioned that the chat template can (ideally) be detected from the model metadata key tokenizer.chat_template, but at the time I didn't know it was possible to access that metadata.

Now that we have llama_chat_apply_template, we no longer have to worry about reading the metadata ourselves. We can use this new function to format the chat supplied to /v1/chat/completions.

Metadata


Assignees: no one assigned
Labels: enhancement (New feature or request)
Projects: none
Milestone: none
