
ValueError: Cannot use apply_chat_template() because tokenizer.chat_template is not set #33246

Closed
NielsRogge opened this issue Sep 2, 2024 · 8 comments · Fixed by #33254

NielsRogge commented Sep 2, 2024

System Info

Transformers v4.45.0.dev0

Who can help?

@Rocketknight1

Reproduction

The code snippet from here doesn't seem to work; I assume this is because models no longer fall back to a default chat template when chat_template is not set:

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("facebook/blenderbot-400M-distill")

chat = [
   {"role": "user", "content": "Hello, how are you?"},
   {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
   {"role": "user", "content": "I'd like to show off how chat templating works!"},
]

tokenizer.apply_chat_template(chat, tokenize=False)

results in

/Users/nielsrogge/Documents/python_projecten/transformers/src/transformers/tokenization_utils_base.py:1602: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be deprecated in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
Traceback (most recent call last):
  File "/Users/nielsrogge/Documents/python_projecten/transformers/src/transformers/models/blenderbot/test.py", line 10, in <module>
    tokenizer.apply_chat_template(chat, tokenize=False)
  File "/Users/nielsrogge/Documents/python_projecten/transformers/src/transformers/tokenization_utils_base.py", line 1787, in apply_chat_template
    chat_template = self.get_chat_template(chat_template, tools)
  File "/Users/nielsrogge/Documents/python_projecten/transformers/src/transformers/tokenization_utils_base.py", line 1938, in get_chat_template
    raise ValueError(
ValueError: Cannot use apply_chat_template() because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating

Expected behavior

A working code snippet

@NielsRogge NielsRogge added the bug label Sep 2, 2024
@Rocketknight1
Member

Yes, you're right about the cause! Rather than trying to merge a proper chat template for Blenderbot (which is very obsolete by now), I'll just rewrite the doc to use a different model.

@PhilipAmadasun

@Rocketknight1 I'm getting the same error when I try to use some models like Gemma. I can try to use the template parameter, but I'm not sure what the format is for the Gemma model (I can look it up in the tokenizer_config.json, right?). Is manually setting the template pretty much what we now have to do when we get this error? And for models that don't accept "role": "system", what would be the workaround?

@Rocketknight1
Member

Hi @PhilipAmadasun, the most likely cause is that you're loading the base gemma models, like gemma-2-2b, instead of the models that are "instruction tuned" for chat, like gemma-2-2b-it. The base models are just simple language models and don't support chat, and therefore don't have a chat template. If you use a model trained for chat, it should work!
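For the system-role part of the question above, a common workaround (not an official transformers API; the helper name below is made up for illustration) is to fold a leading system message into the first user turn before templating:

```python
def merge_system_into_user(messages):
    """Fold a leading "system" message into the first user turn.

    Plain-Python preprocessing for chat templates that reject
    "role": "system". The input list is not modified.
    """
    if not messages or messages[0]["role"] != "system":
        return list(messages)
    system, rest = messages[0], list(messages[1:])
    if rest and rest[0]["role"] == "user":
        # Prepend the system prompt to the first user message.
        rest[0] = {
            "role": "user",
            "content": system["content"] + "\n\n" + rest[0]["content"],
        }
        return rest
    # No user turn to merge into: demote the system message to a user turn.
    return [{"role": "user", "content": system["content"]}] + rest


chat = [
    {"role": "system", "content": "You are terse."},
    {"role": "user", "content": "Hi"},
]
merged = merge_system_into_user(chat)
# merged[0]["content"] == "You are terse.\n\nHi"
```

The merged list can then be passed to apply_chat_template as usual.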

@Rocketknight1
Member

Also @NielsRogge, the fix has now been merged.

@daidaiershidi

> Also @NielsRogge, fix has now been merged

Hi, I understand the current changes, but there's still a lot of code that actually relies on the default chat template. I'd like to know what the default chat template was before, so I can set chat_template in tokenizer_config.

@Rocketknight1
Member

Hi @daidaiershidi, using the old 'default' chat template with models that were not trained with it will produce very inaccurate results! This is a big part of the reason the default templates were removed.

Can you tell me which models you're working with that used to have default chat templates? It's possible that some of them actually should have templates, in which case we can add them.

@daidaiershidi

> Hi @daidaiershidi, using the old 'default' chat template with models that were not trained with it will produce very inaccurate results! This is a big part of the reason the default templates were removed.
>
> Can you tell me which models you're working with that used to have default chat templates? It's possible that some of them actually should have templates, in which case we can add them.

The WizardLM family: none of their tokenizer_config files have a chat_template (https://huggingface.co/WizardLMTeam/WizardLM-13B-V1.2/blob/main/tokenizer_config.json), and others like https://huggingface.co/layoric/llama-2-13b-code-alpaca/blob/main/tokenizer_config.json
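For models like these, one option is to write the Jinja template string yourself and set it as chat_template in tokenizer_config.json. As a hedged sketch only: the WizardLM model cards describe a Vicuna-style prompt, so a template along the following lines may be close, but verify it against the model card before relying on it. transformers renders chat templates with Jinja, so the snippet renders the string directly with jinja2 to show the output shape without downloading a model:

```python
from jinja2 import Template

# Assumption: Vicuna-style format as described on the WizardLM model card.
# Double-check the exact separators and EOS handling before production use.
VICUNA_STYLE_TEMPLATE = (
    "A chat between a curious user and an artificial intelligence "
    "assistant. The assistant gives helpful, detailed, and polite "
    "answers to the user's questions. "
    "{% for message in messages %}"
    "{% if message['role'] == 'user' %}USER: {{ message['content'] }} "
    "{% elif message['role'] == 'assistant' %}ASSISTANT: {{ message['content'] }}</s>"
    "{% endif %}"
    "{% endfor %}"
    "ASSISTANT:"
)

# With a real tokenizer you would set:
#   tokenizer.chat_template = VICUNA_STYLE_TEMPLATE
chat = [{"role": "user", "content": "Hello"}]
rendered = Template(VICUNA_STYLE_TEMPLATE).render(messages=chat)
print(rendered)
```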

@svnv-svsv-jm

> Also @NielsRogge, fix has now been merged
>
> Hi, I understand the current changes, but there's still a lot of code that actually uses the default chat template. I'd like to know what the default chat template was before, so I can set the "chat template" in tokenizer_config.

THIS.

I am running a small LLM on a small machine, just to test that the code and the API work. I do not care about correct responses here. Let us run the code by forcing some template!
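If response quality truly doesn't matter, you can force a generic template before calling apply_chat_template. transformers renders chat templates with Jinja; the sketch below renders the same string directly with jinja2 so it runs offline (the template is a made-up smoke-test format, deliberately not matched to any model's training data):

```python
from jinja2 import Template

# A generic "smoke test" template: fine for checking plumbing,
# useless for getting sensible model outputs.
SMOKE_TEMPLATE = (
    "{% for message in messages %}"
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{% endfor %}"
    "assistant:"
)

# With a real tokenizer you would force it like this before calling
# apply_chat_template (shown as a comment to keep this snippet offline):
#   tokenizer.chat_template = SMOKE_TEMPLATE

chat = [{"role": "user", "content": "Hello, how are you?"}]
rendered = Template(SMOKE_TEMPLATE).render(messages=chat)
print(rendered)  # user: Hello, how are you?\nassistant:
```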
