-
Notifications
You must be signed in to change notification settings - Fork 9.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: Revert showing control tokens by default for server OpenAI Chat completions #6860
Fix: Revert showing control tokens by default for server OpenAI Chat completions #6860
Conversation
…vide overridden declaration to receive "bool special" param to toggle showing control tokens
…mon/common.cpp to specify "false" so that control tokens are not shown in chat completion responses"
I provided an alternative solution that reverts the change to I added an overridden declaration of
|
I am open to comments, concerns and/or complaints regarding if this is the correct way to fix this problem |
Hmm i am not sure what happened but on the most up-to-date llama.cpp |
In #6807 @ggerganov added the ability to toggle showing control tokens (e.g. EOS tokens). In
common.cpp
this was set totrue
by default in two places, which broke the/v1/chat/completions
endpoint as described in #6859 - in short, the OpenAI chat completions endpoint response now includes the EOS / stop token, which is different from past behavior / expected behavior.I have confirmed that reverting the booleans to be
false
in the two places incommon.cpp
fixes this behavior.While this PR fixes the breaking change, it may affect behavior that is dependent on #6807's new default of
true
in other places. This may need to be investigated further, but I propose reverting the change for now to fix the broken/v1/chat/completions
behavior.s/o @QueryType for opening #6847 as well which was caused by the same underlying issue.
API Response before the change (ChatML model):
API Response before the change (Mistral model / llama2 template):
Correct API response after this change:
(note the absence of control tokens)