llama : add option to render special/control tokens #6807
Conversation
Performance dropped - maybe generation does not stop properly after the #6745 EOG changes?

Very likely, because we're using the phi-2 model, which does not have native support for ChatML.

I think we are incorrectly using a base model instead of an instruction-tuned one for this test: https://huggingface.co/microsoft/phi-2

Ah yeah, that's right. We can use dolphin-phi2 then. Here is the link: https://huggingface.co/TheBloke/dolphin-2_6-phi-2-GGUF
* make : fix common dep on llama.h
* llama : add option to render special tokens
* readme : add API change notice (ggml-ci)
* swift : fix build
Fixes #6770
Setting `special == true` in `llama_token_to_piece()` will cause special/control tokens' text to be rendered in the output:

llama.cpp/llama.h, lines 827 to 837 at 1f45c2a
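For reference, the declaration that those lines point at looks roughly like the sketch below (reconstructed from the PR description; the exact doc comments in llama.h at commit 1f45c2a may differ slightly):

```c
/// @details Convert the token id into a piece of text.
///          Does not write a null terminator to the buffer.
/// @param special If true, special tokens are rendered in the output.
LLAMA_API int32_t llama_token_to_piece(
          const struct llama_model * model,
                       llama_token   token,
                              char * buf,
                           int32_t   length,
                              bool   special);
```

And a minimal usage sketch; `print_token` is a hypothetical helper, not part of the library:

```c
#include <stdio.h>
#include "llama.h"

// Hypothetical helper: print one token as text, rendering control
// tokens such as <|im_start|> / <|im_end|> instead of skipping them.
static void print_token(const struct llama_model * model, llama_token token) {
    char buf[256];
    // special == true -> special/control tokens are rendered in the output
    const int32_t n = llama_token_to_piece(model, token, buf, sizeof(buf), true);
    if (n >= 0) {
        printf("%.*s", (int) n, buf); // the piece is not null-terminated
    }
}
```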