Always getting a "Ċ" before EOS when using Qwen #865

Open
rolevax opened this issue Jul 24, 2024 · 1 comment

Comments

rolevax commented Jul 24, 2024

Description

I'm running a simple chatbot with Qwen and get a "Ċ" at the end of every generated response:

User> hello
Assistant: Hello! How can I assist you today?Ċ<|im_end|>
User> who are you?
Assistant: I'm an artificial ... (blablabla) ... How can I assist you specifically today?Ċ<|im_end|>

The code looks like this:

using LLama.Common;
using LLama;
using LLama.Abstractions;
using System.Text;

internal class Program
{
    private static readonly string SystemMessage = """
    You are a helpful assistant.
    """;

    private static async Task Main(string[] args)
    {
        string modelPath = @"my/local/dir/qwen2-7b-instruct-q6_k.gguf";

        var parameters = new ModelParams(modelPath)
        {
            Seed = 1337,
            GpuLayerCount = 29,
        };
        using var model = LLamaWeights.LoadFromFile(parameters);
        using var context = model.CreateContext(parameters);
        var executor = new InteractiveExecutor(context);

        var chatHistory = new ChatHistory();
        chatHistory.AddMessage(AuthorRole.System, SystemMessage);

        ChatSession session = new(executor, chatHistory);
        session.WithHistoryTransform(new QwenHistoryTransform());

        InferenceParams inferenceParams = new()
        {
            MaxTokens = 256,
            AntiPrompts = ["<|im_end|>", "<|endoftext|>"],
        };

        string? userInput;
        while ((userInput = ReadLine()) != null)
        {
            var stream = session.ChatAsync(new ChatHistory.Message(AuthorRole.User, userInput), inferenceParams);
            await foreach (var text in stream)
            {
                Console.Write(text);
            }
        }
    }

    private static string? ReadLine()
    {
        Console.Write("\nUser> ");
        return Console.ReadLine();
    }
}

class QwenHistoryTransform : IHistoryTransform
{
    private const string Bos = "<|im_start|>";
    private const string Eos = "<|im_end|>";

    public string HistoryToText(ChatHistory history)
    {
        var sb = new StringBuilder(1024);

        foreach (var message in history.Messages)
        {
            sb.Append(Bos);
            sb.Append(GetRoleName(message.AuthorRole));
            sb.Append('\n');
            sb.Append(message.Content);
            sb.Append(Eos);
            sb.Append('\n');
        }

        return sb.ToString();
    }

    private static string GetRoleName(AuthorRole authorRole)
    {
        return authorRole switch
        {
            AuthorRole.User => "user",
            AuthorRole.Assistant => "assistant",
            AuthorRole.System => "system",
            _ => throw new Exception($"Unsupported role: {authorRole}"),
        };
    }

    public ChatHistory TextToHistory(AuthorRole role, string text)
    {
        return new ChatHistory([new ChatHistory.Message(role, text)]);
    }

    public IHistoryTransform Clone()
    {
        return new QwenHistoryTransform();
    }
}

How can I find out where the "Ċ" comes from, and what is it related to?

The model is from https://huggingface.co/Qwen/Qwen2-7B-Instruct-GGUF
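
For anyone wondering what "Ċ" is: in GPT-2 style byte-level BPE vocabularies, every raw byte is stored as a printable stand-in character, and the newline byte 0x0A is stored as "Ċ" (U+010A). Seeing it in the output therefore suggests that a newline token is being printed in its raw vocabulary form instead of being decoded back to bytes. The sketch below rebuilds that standard byte table to show the mapping. It is only an illustration: `ByteLevelBpe` and `Demo` are hypothetical names, not part of LLamaSharp, and that this mapping is the cause here is an assumption based on the character itself, not something confirmed in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration only; not part of LLamaSharp.
internal static class ByteLevelBpe
{
    // Builds the GPT-2 style table that maps each byte to a printable character.
    public static Dictionary<byte, char> BuildByteToChar()
    {
        var map = new Dictionary<byte, char>();

        // Bytes that are already printable map to themselves.
        void AddRange(int first, int last)
        {
            for (int b = first; b <= last; b++) map[(byte)b] = (char)b;
        }
        AddRange('!', '~');
        AddRange(0xA1, 0xAC);
        AddRange(0xAE, 0xFF);

        // Every remaining byte (control characters, space, ...) is assigned
        // the next free code point starting at U+0100, in ascending byte order.
        int next = 0;
        for (int b = 0; b < 256; b++)
        {
            if (map.ContainsKey((byte)b)) continue;
            map[(byte)b] = (char)(256 + next);
            next++;
        }
        return map;
    }

    // Converts raw vocabulary text back to normal text.
    // Throws KeyNotFoundException for characters outside the table.
    public static string Decode(string tokenText)
    {
        var charToByte = BuildByteToChar().ToDictionary(kv => kv.Value, kv => kv.Key);
        byte[] bytes = tokenText.Select(c => charToByte[c]).ToArray();
        return Encoding.UTF8.GetString(bytes);
    }
}

internal static class Demo
{
    private static void Main()
    {
        var map = ByteLevelBpe.BuildByteToChar();

        // The newline byte 0x0A is the 11th unmapped byte, so it lands on U+010A.
        Console.WriteLine(((int)map[(byte)'\n']).ToString("X4"));        // prints 010A
        Console.WriteLine(map[(byte)'\n'] == 'Ċ');                        // prints True
        Console.WriteLine(ByteLevelBpe.Decode("today?Ċ") == "today?\n"); // prints True
    }
}
```

By the same table the space byte 0x20 becomes "Ġ" (U+0120), which is why raw vocabulary dumps of such models are full of "Ġ" and "Ċ".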

@sangyuxiaowu
Contributor

#858
