Skip to content

server: Use llama_chat_apply_template on /completion endpoint #6624

Closed
@EZForever

Description

@EZForever

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Feature Description

Use llama_chat_apply_template on /completion or a new endpoint (e.g. /chat), in addition to the current OpenAI compatibility endpoints. Update WebUI to reflect the change.

Motivation

The OpenAI compatibility endpoints are nice and all, but native endpoints offer functionalities specific to llama.cpp (e.g. mirostat, slot management, etc). One exception is automatically applying chat templates, which has been introduced to OpenAI compatibility endpoints in #5593, while the native endpoint (/completion) still uses the old prompt/antiprompt formatting method and requires the user to provide correctly formatted prompts. This is especially a problem for WebUI users, and there has been many issue or discussion threads about worse-than-expected chat results due to incorrect templates. It would thus be great to introduce server-side support for chat templates on native endpoints.

Related: #5447

Refs on antiprompts being old and obselete: #6378 (review) #6391 (comment)

Possible Implementation

/completion works well as a text completion endpoint (?), thus in order not to break stuffs too much, maybe we can consider adding a new endpoint (/chat) with the changes. WebUI chat page should use the new endpoint instead. "Prompt template" and "Chat history template" options are thus obselete, and could be removed or moved under "More options".

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions