Garbled output from model in Unity #178

Closed
@Uralstech

Hi! I was trying to get the latest version of LLamaSharp working in Unity.

This is my script:

using System.Collections.Generic;
using UnityEngine;
using LLama.Common;
using LLama;

public class LLaMASharpTest : MonoBehaviour
{
    // Start is called before the first frame update
    void Start()
    {
        DoTalkWithLlamaCpp("Who made Linux?");
    }

    private async void DoTalkWithLlamaCpp(string userRequest)
    {
        string modelPath = "path/to/llama-2-7b-guanaco-qlora.Q2_K.gguf";

        // Load the model and create an inference context with a fixed seed.
        var parameters = new ModelParams(modelPath)
        {
            ContextSize = 1024,
            Seed = 1337
        };
        using var model = LLamaWeights.LoadFromFile(parameters);
        using var context = model.CreateContext(parameters);
        var executor = new InteractiveExecutor(context);

        var session = new ChatSession(executor);

        // Stream the response and log each chunk as it arrives.
        var inferenceParams = new InferenceParams()
        {
            Temperature = 0.6f,
            AntiPrompts = new List<string> { "User:" },
            MaxTokens = 100
        };
        await foreach (var text in session.ChatAsync(userRequest, inferenceParams))
        {
            Debug.Log(text);
        }
    }
}

And this is my output:
[screenshot: garbled model output in the Unity console]

I am on Unity 2022.3 with .NET Standard 2.1, and am using TheBloke/llama-2-7B-Guanaco-QLoRA-GGUF as it is listed under 'Verified Model Resources' in the readme. My code is mostly derived from ChatSessionWithRoleName.cs.

I did make a few changes to the LLamaSharp code:

  • In LLama\Exceptions\GrammarFormatExceptions.cs I changed the file-scoped namespace

namespace LLama.Exceptions;

public abstract class GrammarFormatException
...

to a block-scoped one, since Unity's C# compiler does not support file-scoped namespaces:

namespace LLama.Exceptions
{
    public abstract class GrammarFormatException
    ...
}
  • Made similar changes in EncodingExtensions.cs, LLamaBeamsState.cs, LLamaBeamView.cs and NativeApi.BeamSearch.cs.
  • Removed all references to using Microsoft.Extensions.Logging; and ILogger.
  • Replaced calls to ILogger instances with UnityEngine's Debug.Log(), Debug.LogWarning() and Debug.LogError() (roughly as in the sketch after this list).
  • Replaced all #if NETSTANDARD2_0 and #if !NETSTANDARD2_0 with #if !NETSTANDARD2_1 and #if NETSTANDARD2_1 respectively, as I believe they are compatible.
  • In LLamaContext.cs, I replaced var last_n_array = lastTokens.TakeLast(last_n_repeat).ToArray(); with var last_n_array = IEnumerableExtensions.TakeLast(lastTokens, last_n_repeat).ToArray(); as I was getting the error: The call is ambiguous between the following methods or properties: 'System.Linq.Enumerable.TakeLast<TSource>(System.Collections.Generic.IEnumerable<TSource>, int)' and 'LLama.Extensions.IEnumerableExtensions.TakeLast<T>(System.Collections.Generic.IEnumerable<T>, int)'.
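
For reference, the logging replacement roughly followed this pattern — a simplified sketch with an illustrative helper class, not the exact diff I applied:

using UnityEngine;

// Minimal stand-in for the removed ILogger plumbing. Wherever LLamaSharp
// called e.g. _logger?.LogWarning(...), I route the message to Unity's console.
internal static class UnityLogShim
{
    public static void Info(string message) => Debug.Log(message);
    public static void Warning(string message) => Debug.LogWarning(message);
    public static void Error(string message) => Debug.LogError(message);
}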

Please do tell me if I did something wrong.

Thanks in advance!

Edit

If I ever decrease ContextSize below 1024 or increase MaxTokens above ~150, Unity just crashes. I have narrowed the crash down to

public bool Eval(ReadOnlySpan<int> tokens, int n_past, int n_threads)
{
    unsafe
    {
        // Pin the managed token buffer so the native library gets a stable pointer.
        fixed (int* pinned = tokens)
        {
            // llama_eval_with_pointer returns 0 on success.
            return NativeApi.llama_eval_with_pointer(this, pinned, tokens.Length, n_past, n_threads) == 0;
        }
    }
}

in SafeLLamaContextHandle.cs.
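
While poking at this, one failure mode I want to rule out: in C#, fixed over an empty span pins a null pointer, so if the executor ever called Eval with zero tokens, llama_eval would receive a null token array. A defensive guard like the one below — my own sketch, not LLamaSharp code — should exclude that case:

// Sketch: guard against empty token spans before calling into native code.
public bool EvalGuarded(ReadOnlySpan<int> tokens, int n_past, int n_threads)
{
    // For an empty span, `fixed` yields a null pointer; skip the native call.
    if (tokens.IsEmpty)
        return true;

    unsafe
    {
        fixed (int* pinned = tokens)
        {
            return NativeApi.llama_eval_with_pointer(this, pinned, tokens.Length, n_past, n_threads) == 0;
        }
    }
}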

Update

I downloaded LLamaSharp again and compiled the example project (LLama.Examples) with the same llama-2-7b-guanaco-qlora.Q2_K.gguf model and it works! So my issue has something to do with my changes or with the C# environment that Unity uses.
