
Description
As far as I can tell, Vicuna 1.1 uses `</s>` as the separator for dialogue responses. It's tokenized as the EOS token (2) when I try it in a Python script using Transformers, but it's tokenized as a normal string by llama.cpp. I got something like this instead:
```
main: prompt: ' </s>'
main: number of tokens in prompt = 4
1 -> ''
1533 -> ' </'
29879 -> 's'
29958 -> '>'
```
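For comparison, here's a minimal sketch of the Transformers-side check (the model path is just a placeholder for a local Vicuna 1.1 directory):

```python
from transformers import AutoTokenizer

# Placeholder path; point it at a local Vicuna 1.1 checkout.
tokenizer = AutoTokenizer.from_pretrained("./vicuna-7b-v1.1")

text = "ASSISTANT: Hello!</s>"
ids = tokenizer(text, add_special_tokens=False)["input_ids"]
print(ids)                     # the trailing </s> should come back as a single id
print(tokenizer.eos_token_id)  # 2 for the LLaMA/Vicuna tokenizer
```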
I found these docs while looking for a reference on what the prompt should look like for Vicuna 1.1, to check whether `</s>` should appear in the prompt:
https://github.com/lm-sys/FastChat/blob/7ae721fa3c881e1e24cf181305d127a316acd463/docs/vicuna_weights_version.md#example-prompt-weight-v11
```
A chat between a user and an assistant.
USER: Hello!
ASSISTANT: Hello!</s>
USER: How are you?
ASSISTANT: I am good.</s>
```
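If the literal `</s>` in the text isn't recognized as a special token, one workaround on the Transformers side is to append the EOS id between turns instead of embedding the string. This is only a rough sketch of the idea, not FastChat's actual prompt-building code, and the model path is a placeholder:

```python
from transformers import AutoTokenizer

# Placeholder path to a local Vicuna 1.1 checkout.
tokenizer = AutoTokenizer.from_pretrained("./vicuna-7b-v1.1")

system = "A chat between a user and an assistant."
turns = [("Hello!", "Hello!"), ("How are you?", "I am good.")]

# Encode the system prompt, then each turn, inserting the EOS id after
# every assistant reply instead of the literal "</s>" string.
ids = tokenizer(system, add_special_tokens=False)["input_ids"]
for user, assistant in turns:
    chunk = f" USER: {user} ASSISTANT: {assistant}"
    ids += tokenizer(chunk, add_special_tokens=False)["input_ids"]
    ids.append(tokenizer.eos_token_id)

print(tokenizer.decode(ids))
```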
At the end, the docs mention a `special_tokens_map.json` file containing something like this, but it doesn't seem to be used by convert.py:
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
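For illustration, here's a hypothetical helper (not something convert.py actually has) that reads the EOS string out of that file, showing the information that's currently being ignored:

```python
import json
from pathlib import Path
from typing import Optional

def read_eos_token(model_dir: str) -> Optional[str]:
    """Hypothetical helper (not part of convert.py): return the EOS string
    from special_tokens_map.json, or None if the file is missing."""
    path = Path(model_dir) / "special_tokens_map.json"
    if not path.exists():
        return None
    data = json.loads(path.read_text())
    eos = data.get("eos_token")
    # The entry may be a plain string or a dict with a "content" field.
    if isinstance(eos, dict):
        return eos.get("content")
    return eos

print(read_eos_token("./vicuna-7b-v1.1"))  # expected: "</s>"
```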