
Repeated greeting in same chat session #801

Closed
@delock

Description


This is a replication of the problem described in #711 (comment). I hit the same issue, so I'm creating an issue to track it.

Problem statement

When chatting via the llm.create_chat_completion API, the chatbot keeps greeting me again and repeating what I had input before.
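For context, the call pattern is roughly the following (a compressed sketch of the full script given in 'Steps to Reproduce' below; the whole conversation history is passed on every call):

# Sketch only -- see 'Steps to Reproduce' for the complete script.
# `messages` holds the whole history and grows by two entries per turn.
response = llm.create_chat_completion(messages, max_tokens=256)
reply = response['choices'][0]['message']['content']
messages.append(ChatCompletionMessage(role='assistant', content=reply))
messages.append(ChatCompletionMessage(role='user', content=input("User: ")))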

Expected Behavior

I'm trying to have a continuous conversation with the chatbot, and I expect a smooth conversational flow.

Current Behavior

A conversation goes like the following with a simple chat program using llama-2-7b-chat.Q5_K_M.gguf, downloaded from https://huggingface.co/TheBloke/Llama-2-7b-Chat-GGUF. The simple chat code is from PR #575:

Rob:  Hello! It's nice to meet you. Is there something I can help you with or would you like to chat about something in particular?
User: Let's talk about football
Rob:  Hello! Nice to meet you too! *adjusts avatar glasses* I'm just an AI, so I don't have personal preferences or interests, but I'd be happy to chat with you about anything you'd like. How about we talk about the latest football news? 🏈
User: I like to play football on shiny day.
Rob:  Hello! *adjusts avatar hat* It's great to meet you too! I'm just an AI, so I don't have personal preferences or interests either, but I'd be happy to chat with you about anything you'd like. How about we talk about the latest football news? 🏈
Oh, you like to play football on shiny days? That's great! *adjusts avatar sunglasses* I'm just an AI, so I don't have personal experiences, but I can certainly chat with you about football. What's your favorite team or player? 🤔

The responses from 'Rob' are generated by llama-cpp-python, and we can clearly see that 'Rob' greets me again and again.

Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

  • Physical (or virtual) hardware you are using, e.g. for Linux:

$ lscpu
12th Gen Intel(R) Core(TM) i7-12700H, with hyper threading off

  • Operating System, e.g. for Linux:
    $ uname -a
    Linux cortex 6.5.3-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 13 Sep 2023 08:37:40 +0000 x86_64 GNU/Linux

  • SDK version, e.g. for Linux:

$ python3 --version
Python 3.11.3
$ make --version
GNU Make 4.4.1
$ g++ --version
g++ (GCC) 13.2.1 20230801

Failure Information (for bugs)

See the 'Current Behavior' section above.

Steps to Reproduce

Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.

  1. Save the following as simple_chat.py:
from llama_cpp import ChatCompletionMessage, Llama
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-m", "--model", type=str, default="<your path to your ggml-model.bin>")
args = parser.parse_args()

llm = Llama(model_path=args.model)

# Create a list of messages
messages = [
    ChatCompletionMessage(role='system', content='start chat'),
    ChatCompletionMessage(role='user', content='Hello')
]

while True:
    # Generate a response
    response = llm.create_chat_completion(
        messages,
        temperature=0.2,
        top_p=0.95,
        top_k=40,
        stream=False,
        stop=[],
        max_tokens=256,
        repeat_penalty=1.1,
    )

    output = response['choices'][0]['message']['content']
    print(f"Rob: {output}")

    # Append assistant's message to conversation history
    messages.append(ChatCompletionMessage(role='assistant', content=output))

    user_message = input("User: ")
    messages.append(ChatCompletionMessage(role='user', content=user_message))
  2. Download llama-2-7b-chat.Q5_K_M.gguf from https://huggingface.co/TheBloke/Llama-2-7b-Chat-GGUF.
  3. Run `python simple_chat.py --model <path to the downloaded .gguf>`.
  4. Chat freely, or reuse the 'User' prompts from the 'Current Behavior' section (an optional sanity-check snippet follows these steps).
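A possible sanity check (my addition, not part of the original script) is to dump the accumulated history right before each create_chat_completion call, to confirm that the full conversation really is being sent and the repetition is not caused by the script dropping history:

# Optional debug print, placed just before llm.create_chat_completion(...)
# in simple_chat.py: lists every message that will be sent this turn.
for i, m in enumerate(messages):
    print(f"  [{i}] {m['role']}: {m['content'][:60]}")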

llama-cpp-python version:
commit 43dfe1e
llama-cpp version:
commit 48edda3
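For completeness, the installed package version can also be printed directly (assuming the package exposes __version__, as recent releases do):

$ python3 -c "import llama_cpp; print(llama_cpp.__version__)"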

Failure Logs

See the 'Current Behavior' section above.
