Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Feature Description
Hi! I am experimenting with using llama.cpp as a general-purpose code completion backend, similar to TabNine.
I am encountering a small problem: if the completion prompt ends mid-word, the results are not very accurate. For example, for a prompt such as `Five, Four, Thre` [sic], the model will often ignore the typo and suggest `, Two` (forming `Thre, Two`).
I think, as an option to the `/completion` server API, the following optional behavior would be useful (a rough sketch follows the list):
- Tokenize the text
- Chop off the last token
- Run the prediction with the remaining tokens, but only consider those tokens whose bytes start with the bytes of the last token.
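To make the last step concrete, here is a minimal, self-contained sketch of the masking idea in plain Python. It does not use llama.cpp or its server API; `vocab`, `logits`, and the helper names are hypothetical, and the toy vocabulary/logits are made up purely for illustration. It drops the last token and keeps only candidate tokens whose text starts with that token's text.

```python
import math

def heal_last_token(tokens, vocab):
    """Drop the last prompt token and return (remaining tokens, its text as required prefix)."""
    *rest, last = tokens
    return rest, vocab[last]

def mask_logits(logits, vocab, prefix):
    """Keep scores only for tokens whose text starts with `prefix`; mask everything else."""
    masked = []
    for token_id, score in enumerate(logits):
        piece = vocab[token_id]
        masked.append(score if piece.startswith(prefix) else -math.inf)
    return masked

# Toy example: a tiny made-up vocabulary and logits.
vocab = ["Thre", "Three", "Th", ", Two", " Two", "e"]
logits = [0.1, 2.0, 0.5, 3.0, 1.0, 0.2]

tokens = [0]  # the prompt ended with the partial word "Thre"
rest, prefix = heal_last_token(tokens, vocab)
print(mask_logits(logits, vocab, prefix))
# -> only "Thre" and "Three" keep their scores; ", Two" is masked out
```

With this masking in place, the model is steered toward completing the partial word (e.g. `Three`) rather than continuing as if the typo were intentional.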
Thanks!