Please upgrade the KV cache size yes using --ctx-size #6617

@enn-nafnlaus

          Please upgrade the KV cache size yes using `--ctx-size`

Originally posted by in #6603 (comment)

This is not an appropriate response to people having this problem. `--ctx-size` is limited by available memory; of course we'd set it higher if we could. Mine is at 16k and I still hit this problem.

The appropriate response to running out of tokens is to fail the query. It's not for the server to go into an infinite loop and stop all further processing. I've lost days' worth of processing time to this bug: I log into my server and discover that it's no longer running because of this.

The server should never go into an infinite loop; I mean, obviously? If it can't handle a query, it should just reject it and move on.

EDIT: The reporter was just running a very outdated server version. Always use `--n-predict N` to avoid an infinite loop.
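For anyone landing here from a client script: a minimal sketch of applying the same cap per request, assuming the server exposes the `/completion` endpoint and accepts an `n_predict` field in the JSON body. The URL, port, default values, and helper names (`build_completion_payload`, `complete`) are illustrative assumptions, not something taken from this issue.

```python
import json
import urllib.request

# Assumption: a llama.cpp server is listening at this address; adjust as needed.
SERVER_URL = "http://127.0.0.1:8080/completion"


def build_completion_payload(prompt: str, n_predict: int = 512) -> dict:
    """Build a request body that always carries a finite generation cap."""
    if n_predict <= 0:
        # A non-positive value would not give a finite positive cap
        # (to my understanding -1 means "no limit"), so refuse it here.
        raise ValueError("n_predict must be a positive integer")
    return {"prompt": prompt, "n_predict": n_predict}


def complete(prompt: str, n_predict: int = 512, timeout_s: float = 300.0) -> dict:
    """POST the capped request; the socket timeout keeps the client from hanging."""
    body = json.dumps(build_completion_payload(prompt, n_predict)).encode("utf-8")
    request = urllib.request.Request(
        SERVER_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))
```

The client-side timeout is a separate safety net: even if the server stalls, the calling script gets an exception it can log and move on from, rather than blocking indefinitely.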
