Server: possibility of customizable chat template? #5922

Closed
@ngxson

Motivation

While we already support a set of known chat templates, this is sometimes not enough for users who want to:

  • Use their own fine-tuned model
  • Or, use a model that does not have a Jinja template

The problem is that the other chat template implementations out there are also quite messy, for example:

  • Jinja template: as discussed in server : improvements and maintenance #4216 , adding such a parser to the llama.cpp code base is too complicated
  • The ollama format requires a parser, and it is not very flexible for future use cases
  • The LM Studio format does not require a parser, but it lacks support for multiple roles (we currently have system - user - assistant, but technically it is possible to have custom roles such as database, function, search-engine, ...)

Possible implementation

My idea is to have a simple JSON format that takes all roles into account:

{
  "system": {
    "prefix": "<|system|>\n",
    "postfix": "<|end|>\n"
  },
  "user": {
    "prefix": "<|user|>\n",
    "postfix": "<|end|>\n"
  },
  "assistant": {
    "prefix": "<|assistant|>\n",
    "postfix": "<|end|>\n"
  },
  "_stop": ["<|end|>"],
  "_generation": "<|assistant|>\n"
}
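The issue does not spell out how the "_stop" list would be consumed. One plausible reading is that generated text is cut at the earliest occurrence of any stop string. A minimal sketch under that assumption (the helper name truncate_at_stop is hypothetical and not part of llama.cpp):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: cut generated text at the earliest stop string.
// Assumes "_stop" entries are plain substrings, not regular expressions.
std::string truncate_at_stop(const std::string & text,
                             const std::vector<std::string> & stops) {
  std::size_t cut = std::string::npos;
  for (const auto & s : stops) {
    if (s.empty()) continue;           // an empty stop string would match at position 0
    const std::size_t pos = text.find(s);
    if (pos < cut) cut = pos;          // npos is the largest value, so a miss never wins
  }
  return cut == std::string::npos ? text : text.substr(0, cut);
}
```

With the template above, a raw generation of "Hello<|end|>\n<|user|>" would be reduced to "Hello".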

The user can specify the custom template via --chat-template-file ./my_template.json

The C++ code could be as simple as:

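#include <sstream>
#include <string>
#include <nlohmann/json.hpp> // adjust the include path to your build
using json = nlohmann::json;

std::string apply_custom_template(const json & messages, const json & tmpl) {
  std::stringstream ss;
  for (const auto & msg : messages) {
    // look up the prefix/postfix pair for this message's role
    const json & t = tmpl.at(msg.at("role").get<std::string>());
    // get<std::string>() yields the raw text; streaming a json value would print it quoted and escaped
    ss << t.at("prefix").get<std::string>()
       << msg.at("content").get<std::string>()
       << t.at("postfix").get<std::string>();
  }
  ss << tmpl.at("_generation").get<std::string>(); // add generation prompt
  return ss.str();
}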

NOTE: For now, this function does not handle models that do not support a system prompt. That support can be added in the future, perhaps toggled via an attribute inside the JSON: "system_inside_user_message": true
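The exact behavior of "system_inside_user_message" is not defined in this proposal. One possible interpretation is a preprocessing step that merges each system message into the next user message before the template is applied. The sketch below assumes that interpretation; the function name fold_system_into_user and the "\n\n" separator are hypothetical choices.

```cpp
#include <string>
#include <vector>

struct Message { std::string role; std::string content; };

// Hypothetical preprocessing for "system_inside_user_message": true.
// System text is held back and prepended to the next user message.
std::vector<Message> fold_system_into_user(const std::vector<Message> & messages) {
  std::vector<Message> out;
  std::string pending; // system text waiting for the next user message
  for (const auto & msg : messages) {
    if (msg.role == "system") {
      if (!pending.empty()) pending += "\n\n";
      pending += msg.content;
    } else if (msg.role == "user" && !pending.empty()) {
      out.push_back({"user", pending + "\n\n" + msg.content});
      pending.clear();
    } else {
      out.push_back(msg);
    }
  }
  // System text with no following user turn becomes its own user message.
  if (!pending.empty()) out.push_back({"user", pending});
  return out;
}
```

The folded message list can then be passed to the template function unchanged, so the template itself would not need a "system" entry for such models.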

Ref:
