I have been experiencing this problem with different Llama 3 models, for example:
https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF
https://huggingface.co/QuantFactory/dolphin-2.9-llama3-8b-GGUF
Every response from `/chat/completions` ends with the literal `<|im_end|>` token.
I'm using the latest CUDA Docker image: `ghcr.io/ggerganov/llama.cpp:server-cuda`.
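For reference, here is roughly how I'm calling the endpoint (a minimal sketch; it assumes the server container is listening on `localhost:8080` with default settings, and the model name is just a placeholder label):

```python
# Minimal reproduction sketch: call the OpenAI-compatible chat endpoint of a
# locally running llama.cpp server and print the raw assistant message.
# Host, port, and model name are assumptions; adjust to your setup.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "Meta-Llama-3-8B-Instruct",  # placeholder label
        "messages": [{"role": "user", "content": "Say hello."}],
    },
)
content = resp.json()["choices"][0]["message"]["content"]
print(repr(content))  # the text ends with '<|im_end|>' instead of stopping cleanly
```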
Thanks in advance.
Sorry, I didn't see that an issue was already open.
Fixed in #6860