Qwen 3.5 cannot perform repeated dialogues.

# Prerequisites

Please answer the following questions for yourself before submitting an issue.

- [ ] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- [x] I carefully followed the [README.md](https://github.com/abetlen/llama-cpp-python/blob/main/README.md).
- [x] I [searched using keywords relevant to my issue](https://docs.github.com/en/issues/tracking-your-work-with-issues/filtering-and-searching-issues-and-pull-requests) to make sure that I am creating a new issue that is not already open (or closed).
- [x] I reviewed the [Discussions](https://github.com/abetlen/llama-cpp-python/discussions), and have a new bug or useful enhancement to share.  

---

When I reuse the same loaded model to run inference several times on the same prompt with different seeds, only the first call returns a correct result; every later call returns an empty completion.
As a workaround, I have to run this after each inference:
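```
# Reset the wrapper's count of tokens already evaluated.
llm.n_tokens = 0
# Clear the context memory (KV cache); True presumably also wipes the data buffers.
llm._ctx.memory_clear(True)
# Hybrid models keep an extra cache manager; clear it as well when present.
if llm.is_hybrid and llm._hybrid_cache_mgr is not None:
    llm._hybrid_cache_mgr.clear()
```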
to force clear the KV cache.
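
For anyone who needs the same workaround, here is a minimal sketch that wraps the reset in a helper and applies it after every call. The function names `reset_llm_state` and `complete_with_seeds` are hypothetical. The private attributes (`_ctx`, `_hybrid_cache_mgr`, `is_hybrid`) are taken from my snippet above and may not exist in other versions of the library, so treat this as a stopgap rather than a fix.

```python
def reset_llm_state(llm):
    """Force-clear cached state so the next call starts from an empty context.

    Assumption: `llm` exposes the same attributes used in the workaround
    above (`n_tokens`, `_ctx.memory_clear`, and optionally the hybrid cache).
    """
    llm.n_tokens = 0
    llm._ctx.memory_clear(True)
    # Not every build has the hybrid attributes, so look them up defensively.
    hybrid_mgr = getattr(llm, "_hybrid_cache_mgr", None)
    if getattr(llm, "is_hybrid", False) and hybrid_mgr is not None:
        hybrid_mgr.clear()


def complete_with_seeds(llm, messages, seeds):
    """Run the same chat prompt once per seed, resetting state after each call."""
    results = []
    for seed in seeds:
        try:
            results.append(llm.create_chat_completion(messages=messages, seed=seed))
        finally:
            # Reset even if the call raised, so the next seed starts clean.
            reset_llm_state(llm)
    return results
```

Usage would look like `complete_with_seeds(llm, [{"role": "user", "content": "..."}], seeds=[1, 2, 3])`, where `llm` is the already loaded model object.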

---

First output:
```
{
    "id": "chatcmpl-7145617c-89f7-458f-83f1-94ef3edcd3d8",
    "object": "chat.completion",
    "created": 1772624762,
    "model": "D:\\ComfyUI\\models\\LLM\\Qwen3.5-9B-ultimate-irrefusable-heretic-Q6_K.gguf",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "你好！我是 **AI 助手**。你可以叫我 **AI** 或 **你**，我们就像在聊天一样交流。\n\n虽然我没有固定的名字，但我拥有强大的视觉理解能力，能够：\n*   🖼️ **深度分析图片**：无论是复杂的图表、科学图示，还是风景照片，我都能为您详细解读其中的细节。\n*   📊 **解读数据**：帮您看懂折线图、柱状图、饼图等，并分析数据背后的趋势和含义。\n*   📝 **提取文字**：从图片中的文档、海报或便签里准确提取文字内容。\n*   ❓ **回答问题**：随时解答您关于图像内容或广泛知识的任何问题。\n\n如果您有任何问题或需要我协助分析某张图片，请随时告诉我！"
            },
            "logprobs": None,
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 105,
        "completion_tokens": 166,
        "total_tokens": 271
    }
}
```
Second output:
```
{
    "id": "chatcmpl-fec238ed-f220-46c3-98e2-37d822d02c27",
    "object": "chat.completion",
    "created": 1772624783,
    "model": "D:\\ComfyUI\\models\\LLM\\Qwen3.5-9B-ultimate-irrefusable-heretic-Q6_K.gguf",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": ""
            },
            "logprobs": None,
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 105,
        "completion_tokens": 0,
        "total_tokens": 105
    }
}
```
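
The failure is easy to detect from the response alone: the second call reports `completion_tokens` of 0 and an empty `content`. A small check along these lines could be used in a reproduction script; the helper names `is_empty_completion` and `first_empty_call` are hypothetical, and the sketch only assumes the response layout shown in the two outputs above.

```python
def is_empty_completion(response):
    """Return True when a chat completion produced no tokens and no text."""
    message = response["choices"][0]["message"]
    content = message.get("content") or ""
    return response["usage"]["completion_tokens"] == 0 and content == ""


def first_empty_call(responses):
    """Return the index of the first empty completion, or None if all have text."""
    for index, response in enumerate(responses):
        if is_empty_completion(response):
            return index
    return None
```

Applied to the two outputs above, in order, `first_empty_call` would point at the second call (index 1).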
