
Add equivalent to hf apply_chat_template() #5527

Closed
@ngxson

Description

Motivation

As described in #5447, we can add an equivalent of Hugging Face's apply_chat_template() that uses simple heuristic checks to format the chat into a string. In other words, no Jinja parser is used in our implementation.

Docs for hf's apply_chat_template: https://huggingface.co/docs/transformers/main/en/main_classes/tokenizer#transformers.PreTrainedTokenizer.apply_chat_template
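To illustrate what "simple heuristic checks" means here, the sketch below detects the template family by scanning the model's Jinja template string for distinctive markers instead of parsing the Jinja source. The names (`chat_template_family`, `detect_template`) and the marker set are illustrative only, not the actual llama.cpp implementation:

```c
#include <string.h>

// Hypothetical template families; a real implementation would support more.
typedef enum {
    TMPL_CHATML,
    TMPL_LLAMA2,
    TMPL_UNKNOWN,
} chat_template_family;

// Heuristic detection: look for substrings that are unique to each template
// family, rather than actually interpreting the Jinja template.
chat_template_family detect_template(const char * jinja_src) {
    if (strstr(jinja_src, "<|im_start|>") != NULL) return TMPL_CHATML;
    if (strstr(jinja_src, "[INST]")       != NULL) return TMPL_LLAMA2;
    return TMPL_UNKNOWN;
}
```

Once the family is known, the formatting itself can be hard-coded per family, which is why no Jinja engine is needed.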

Supported templates

This section is moved to wiki: https://github.com/ggerganov/llama.cpp/wiki/Templates-supported-by-llama_chat_apply_template

Initial proposal for llama_chat_apply_template (outdated)
    // used in chat template
    typedef struct llama_chat_message {
        char * role; // NOTE: ChatML actually allows roles other than system, user and assistant; therefore, no enum here
        char * content;
    } llama_chat_message;

    /// @details Apply chat template and optionally tokenize it. Inspired by hf's apply_chat_template() in Python.
    /// @param conversation A list of llama_chat_message
    /// @param tmpl A Jinja template to use for this conversion. If this is nullptr, the model's default chat template will be used instead.
    /// @param tokenize Whether to tokenize the output. If false, the output will be a string.
    /// @param add_generation_prompt Whether to end the prompt with the token(s) that indicate the start of an assistant message.
    /// @return If "tokenize" is set to false, "buf" must be a string buffer (returned value will be the string length).
    ///         Otherwise, "buf" must be a list of tokens (returned value will be the number of tokens).
    LLAMA_API int32_t llama_apply_chat_template(
              const struct llama_model * model,
                    llama_chat_message * conversation,
                                size_t   message_count,
                            const char * tmpl, // "template" is a reserved keyword in C++
                                  bool   tokenize,
                                  bool   add_generation_prompt,
                                  char * buf,
                               int32_t   length);
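The return-value convention in the proposal above (write into `buf`, return the length the full result needs, snprintf-style) can be sketched with a standalone ChatML formatter. `chat_message`, `append`, and `chatml_format` here are illustrative stand-ins, not part of llama.h:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Stand-in for the proposed llama_chat_message struct.
typedef struct {
    const char * role;
    const char * content;
} chat_message;

// Append src to buf at *pos, truncating writes at cap, but always advancing
// *pos by the full length so the caller learns the required size.
static void append(char * buf, int32_t cap, int32_t * pos, const char * src) {
    for (const char * p = src; *p; p++) {
        if (buf && *pos < cap) buf[*pos] = *p;
        (*pos)++;
    }
}

// Format a conversation as ChatML. Returns the number of bytes the full
// result needs (excluding the terminating NUL); pass buf == NULL and
// length == 0 to query the required size first.
int32_t chatml_format(const chat_message * msgs, size_t n_msgs,
                      bool add_generation_prompt,
                      char * buf, int32_t length) {
    int32_t pos = 0;
    for (size_t i = 0; i < n_msgs; i++) {
        append(buf, length, &pos, "<|im_start|>");
        append(buf, length, &pos, msgs[i].role);
        append(buf, length, &pos, "\n");
        append(buf, length, &pos, msgs[i].content);
        append(buf, length, &pos, "<|im_end|>\n");
    }
    if (add_generation_prompt) {
        append(buf, length, &pos, "<|im_start|>assistant\n");
    }
    if (buf && pos < length) buf[pos] = '\0'; // NUL-terminate when it fits
    return pos;
}
```

A caller would typically call once with a NULL buffer to get the size, allocate, then call again; this is the same two-call pattern the `@return` doc comment implies.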

Labels: enhancement (New feature or request)