LlamaCpp model crashes with multi-token characters #934
Comments
Hi @knilink, thanks for reporting this! Do you know if this happens if you try to generate with llama-cpp-python directly? Getting the full stack trace here would be very helpful! @paulbkoch might have thoughts here too.
Hi @Harsha-Nori, I did a bit more investigation and can confirm the error was caused by sending incomplete Unicode bytes to the llama_cpp tokenizer:

```
$ printf '\xe6\xad' | ./llama-tokenize -m ./Meta-Llama-3-8B-Instruct.Q8_0.gguf --stdin
terminate called after throwing an instance of 'std::invalid_argument'
  what():  invalid character
Aborted
```

After adding … I got the following stack trace: …

The Transformers model didn't have the issue because its …
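For context, `\xe6\xad` is the first two bytes of the three-byte UTF-8 encoding of 歪 (U+6B6A), so the tokenizer in the `llama-tokenize` run above receives a truncated sequence. A quick Python check confirming this:

```python
# 歪 (U+6B6A) encodes to three bytes in UTF-8.
assert "歪".encode("utf-8") == b"\xe6\xad\xaa"

# Feeding only the first two bytes is an incomplete sequence,
# which is exactly what printf pipes into llama-tokenize above.
try:
    b"\xe6\xad".decode("utf-8")
except UnicodeDecodeError as e:
    print(e)  # 'utf-8' codec can't decode bytes ... unexpected end of data
```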
@knilink, thank you for bringing this up. I've drafted a (very) tentative fix in #962, which works by chopping off bytes given to the tokenizer. Have you filed your repro …
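For readers skimming this thread: an illustrative sketch of the chopping idea (my reading of the approach, not the actual #962 patch) is to trim any incomplete trailing UTF-8 sequence before handing bytes to the tokenizer:

```python
def trim_incomplete_utf8(data: bytes) -> bytes:
    """Drop a trailing incomplete UTF-8 sequence, if any (illustrative only)."""
    # Scan back over at most 3 continuation bytes to find the last lead byte.
    for i in range(1, min(4, len(data)) + 1):
        byte = data[-i]
        if byte & 0b1100_0000 == 0b1000_0000:
            continue  # continuation byte (10xxxxxx); keep scanning back
        if byte & 0b1000_0000 == 0:
            return data  # ASCII lead byte is always a complete sequence
        # Multi-byte lead byte: how many bytes should its sequence have?
        expected = 2 if byte & 0b1110_0000 == 0b1100_0000 else (
                   3 if byte & 0b1111_0000 == 0b1110_0000 else 4)
        return data if i == expected else data[:-i]
    return data

print(trim_incomplete_utf8(b"abc\xe6\xad"))    # b'abc' (drops the partial 歪)
print(trim_incomplete_utf8("abc歪".encode()))  # b'abc\xe6\xad\xaa' (complete, kept)
```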
I have been doing some more prodding based on @knilink's examples, and I've opened a bug on the HF repo whence I grabbed the model (although this does look like something going wrong at the LlamaCpp layer).
Also filed the bug on LlamaCpp.
The bug
A string containing certain Unicode characters causes an exception, likely because 歪 is a multi-token character for this tokenizer. I also tested the Transformers model, which seems to work fine.
To Reproduce
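The original reproduction snippet was not preserved in this copy of the issue. A minimal sketch of the kind of call that triggers the crash, assuming the guidance 0.1.x `models.LlamaCpp` API and taking the GGUF file name from the tokenize command in the comments above:

```python
from guidance import models, gen

# Model path assumed from the llama-tokenize command earlier in the thread.
lm = models.LlamaCpp("./Meta-Llama-3-8B-Instruct.Q8_0.gguf")

# Generating through a multi-token character such as 歪 can leave an
# incomplete UTF-8 sequence at a token boundary, crashing the tokenizer.
lm += "歪" + gen(max_tokens=10)
```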
System info (please complete the following information):
Ubuntu 22.04
Python 3.10.12
guidance==0.1.15
llama_cpp_python==0.2.79