Description
Please upgrade the KV cache size using `--ctx-size`
Originally posted by in #6603 (comment)
This is not an appropriate response to people having this problem. `--ctx-size` is limited by available memory; of course we'd set it higher if we could. Mine is at 16k and I still hit this problem.
The appropriate response to running out of tokens is to fail the query, not for the server to go into an infinite loop and stop all further processing. I've lost days' worth of processing time to this bug: I log into my server and discover that it's no longer processing anything because of this.
The server should never go into an infinite loop; I mean, obviously? If it can't handle a query, it should just reject it and move on.
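To make the "reject it and move on" behaviour concrete, here is a minimal sketch of an admission check. It is not how the server is actually implemented; `Request`, `ContextOverflowError`, and `admit` are hypothetical names invented for illustration, and token counts are assumed to be known up front.

```python
from dataclasses import dataclass


class ContextOverflowError(Exception):
    """Raised when a request cannot fit in the context window (hypothetical)."""


@dataclass
class Request:
    prompt_tokens: int    # number of tokens in the prompt
    max_new_tokens: int   # finite cap on tokens to generate


def admit(request: Request, ctx_size: int) -> int:
    """Return how many tokens the request may generate, or raise.

    The point of the sketch: a request that cannot fit fails fast,
    and every admitted request has a finite generation budget.
    """
    if request.max_new_tokens < 1:
        raise ValueError("max_new_tokens must be a positive, finite cap")
    if request.prompt_tokens >= ctx_size:
        raise ContextOverflowError(
            f"prompt uses {request.prompt_tokens} tokens, "
            f"context holds only {ctx_size}"
        )
    # Never generate past the end of the context window.
    room = ctx_size - request.prompt_tokens
    return min(request.max_new_tokens, room)
```

With a 16k context, a prompt of 16000 tokens asking for 1000 new tokens would be capped at 384, and a 20000-token prompt would be rejected outright instead of stalling the queue.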
EDIT: The user was just running a very outdated server version. Always use `--n-predict N` to avoid an infinite loop.
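The same cap can also be set per request from the client side. The sketch below assumes the server exposes a `/completion` endpoint that accepts `prompt` and `n_predict` fields in a JSON body and listens on `http://localhost:8080`; check the README for your server version before relying on either. The helper names `build_completion_request` and `complete` are made up for this example, and the client-side timeout is an extra safeguard, not something the EDIT above prescribes.

```python
import json
import urllib.request


def build_completion_request(prompt: str, n_predict: int = 256) -> dict:
    """Build a request body that always carries a finite token cap."""
    if n_predict < 1:
        # Refuse unbounded or empty generation budgets on the client side.
        raise ValueError("n_predict must be a positive integer")
    return {"prompt": prompt, "n_predict": n_predict}


def complete(prompt: str,
             base_url: str = "http://localhost:8080",  # assumed address
             n_predict: int = 256,
             timeout_s: float = 120.0) -> dict:
    """POST the prompt and return the parsed JSON response."""
    body = json.dumps(build_completion_request(prompt, n_predict)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/completion",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # The timeout means a stuck server costs one request, not days.
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `complete("Hello", n_predict=64)` against a running server would request at most 64 generated tokens and raise an error if no response arrives within the timeout.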