forked from abetlen/llama-cpp-python
-
Notifications
You must be signed in to change notification settings - Fork 30
Open
Description
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
When I use the same model to perform multiple inferences on the same cue words using different seeds, only the first time will it output the correct result; after that, it outputs nothing.
So I had to run this after each inference:
llm.n_tokens = 0
llm._ctx.memory_clear(True)
if llm.is_hybrid and llm._hybrid_cache_mgr is not None:
llm._hybrid_cache_mgr.clear()
to force clear the KV cache.
First output:
{
"id": "chatcmpl-7145617c-89f7-458f-83f1-94ef3edcd3d8",
"object": "chat.completion",
"created": 1772624762,
"model": "D:\\ComfyUI\\models\\LLM\\Qwen3.5-9B-ultimate-irrefusable-heretic-Q6_K.gguf",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "你好!我是 **AI 助手**。你可以叫我 **AI** 或 **你**,我们就像在聊天一样交流。\n\n虽然我没有固定的名字,但我拥有强大的视觉理解能力,能够:\n* 🖼️ **深度分析图片**:无论是复杂的图表、科学图示,还是风景照片,我都能为您详细解读其中的细节。\n* 📊 **解读数据**:帮您看懂折线图、柱状图、饼图等,并分析数据背后的趋势和含义。\n* 📝 **提取文字**:从图片中的文档、海报或便签里准确提取文字内容。\n* ❓ **回答问题**:随时解答您关于图像内容或广泛知识的任何问题。\n\n如果您有任何问题或需要我协助分析某张图片,请随时告诉我!"
},
"logprobs": None,
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 105,
"completion_tokens": 166,
"total_tokens": 271
}
}
Second output:
{
"id": "chatcmpl-fec238ed-f220-46c3-98e2-37d822d02c27",
"object": "chat.completion",
"created": 1772624783,
"model": "D:\\ComfyUI\\models\\LLM\\Qwen3.5-9B-ultimate-irrefusable-heretic-Q6_K.gguf",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": ""
},
"logprobs": None,
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 105,
"completion_tokens": 0,
"total_tokens": 105
}
}
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels