
Llama.cpp server destroys <|eot_id|> token even midway through prompt! #6793

Closed
@araleza

Description

In ./server, Continuation mode cannot be used correctly with Llama 3 70B, because the correct prompt template cannot be entered. The token <|eot_id|> is tokenized into zero tokens, even when it occurs midway through the prompt:

[Screenshot: server web UI showing the typed prompt and the cached/predicted token counts]

(In the above image, I hit Start and looked at the number of tokens cached minus the number of tokens predicted: 402 - 400 = 2. This value is the number of tokens my typed prompt was split into. The count shown is 2 where it should be 3, because the <|eot_id|> token was dropped. I deleted the generated tokens before taking this screenshot, to show what I originally typed.)
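
The missing token can also be observed outside the web UI by asking the server itself to tokenize a string. Here is a minimal sketch, assuming the server is running at its default address of http://127.0.0.1:8080 and using its /tokenize endpoint:

```python
import requests

SERVER = "http://127.0.0.1:8080"  # assumed default llama.cpp server address

def count_tokens(text: str) -> int:
    """Ask the running server how many tokens it produces for a string."""
    resp = requests.post(f"{SERVER}/tokenize", json={"content": text})
    resp.raise_for_status()
    return len(resp.json()["tokens"])

# If the special token is parsed, the second string should count exactly one
# token more than the first; if <|eot_id|> is dropped, the counts come out equal.
print(count_tokens("Hello"))
print(count_tokens("Hello<|eot_id|>"))
```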

This token is required multiple times by the prompt template, which looks like this:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

[system prompt goes here]<|eot_id|><|start_header_id|>user<|end_header_id|>

[user prompt goes here]<|eot_id|><|start_header_id|>assistant<|end_header_id|>

[ai response will go here]

Not adhering to the prompt template usually degrades the quality of the LLM's output.
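
For reference, this is how the full template would be assembled and sent to the server as a raw prompt. This is only a sketch, assuming the server's /completion endpoint at the default address; the system and user strings are placeholders:

```python
import requests

SERVER = "http://127.0.0.1:8080"  # assumed default llama.cpp server address

# Placeholder system/user text; the <|eot_id|> markers are the tokens that the
# server currently drops when it tokenizes the raw prompt.
system_prompt = "You are a helpful assistant."
user_prompt = "Name three primary colors."

prompt = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    f"{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
    f"{user_prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
)

resp = requests.post(f"{SERVER}/completion",
                     json={"prompt": prompt, "n_predict": 128})
resp.raise_for_status()
print(resp.json()["content"])
```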
