Did you download the latest quants? A bug in llama.cpp that caused issues like looping in Unsloth's quants was fixed recently. See the Jan 21 update: https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF

Try setting the flag `--parallel 1` to ensure there aren't any parallel requests going on.
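
For anyone unsure where the flag goes, here is a minimal sketch that builds a launch command with it. It assumes you are running llama.cpp's `llama-server` binary and that it is on your `PATH`; the model path and the `build_server_command` helper are hypothetical placeholders, not something from the original report.

```python
import shlex


def build_server_command(model_path: str, parallel: int = 1) -> list[str]:
    """Build an argv list for llama-server limited to `parallel` slots.

    Assumption: the server binary is named `llama-server` and is on PATH.
    """
    return [
        "llama-server",
        "-m", model_path,              # path to the re-downloaded GGUF file
        "--parallel", str(parallel),   # 1 = only one request handled at a time
    ]


if __name__ == "__main__":
    # Hypothetical path; point this at your own freshly downloaded quant.
    cmd = build_server_command("path/to/GLM-4.7-Flash.gguf")
    # Print the shell-quoted command so it can be copied into a terminal.
    print(shlex.join(cmd))
```

If the looping goes away with a single slot and the updated quant, you can raise the value again later to check whether parallel requests were part of the problem.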

Answer selected by wey-gu