Description
Hi! I was trying to get the latest version of LLamaSharp working in Unity.
This is my script:
```csharp
using System.Collections.Generic;
using UnityEngine;
using LLama.Common;
using LLama;

public class LLaMASharpTest : MonoBehaviour
{
    // Start is called before the first frame update
    void Start()
    {
        DoTalkWithLlamaCpp("Who made Linux?");
    }

    private async void DoTalkWithLlamaCpp(string userRequest)
    {
        string modelPath = "path/to/llama-2-7b-guanaco-qlora.Q2_K.gguf";
        var parameters = new ModelParams(modelPath)
        {
            ContextSize = 1024,
            Seed = 1337
        };
        using var model = LLamaWeights.LoadFromFile(parameters);
        using var context = model.CreateContext(parameters);
        var executor = new InteractiveExecutor(context);
        var session = new ChatSession(executor);

        await foreach (var text in session.ChatAsync(
            userRequest,
            new InferenceParams
            {
                Temperature = 0.6f,
                AntiPrompts = new List<string> { "User:" },
                MaxTokens = 100
            }))
        {
            Debug.Log(text);
        }
    }
}
```
I am on Unity 2022.3 with .NET Standard 2.1, and am using `TheBloke/llama-2-7B-Guanaco-QLoRA-GGUF` as it is listed under 'Verified Model Resources' in the readme. My code is mostly derived from `ChatSessionWithRoleName.cs`.
I did make a few changes in the LLamaSharp code:

- In `LLama\Exceptions\GrammarFormatExceptions.cs` I changed the file-scoped namespace declaration

  ```csharp
  namespace LLama.Exceptions;

  public abstract class GrammarFormatException
  ...
  ```

  to a block-scoped one:

  ```csharp
  namespace LLama.Exceptions
  {
      public abstract class GrammarFormatException
      ...
  }
  ```

- Made similar changes in `EncodingExtensions.cs`, `LLamaBeamsState.cs`, `LLamaBeamView.cs` and `NativeApi.BeamSearch.cs`.
- Removed all references to `using Microsoft.Extensions.Logging;` and `ILogger`.
- Replaced calls to `ILogger` instances with UnityEngine's `Debug.Log()`, `Debug.LogWarning()` and `Debug.LogError()`.
- Replaced all `#if NETSTANDARD2_0` and `#if !NETSTANDARD2_0` with `#if !NETSTANDARD2_1` and `#if NETSTANDARD2_1` respectively, as I believe they are compatible.
- In `LLamaContext.cs`, I replaced

  ```csharp
  var last_n_array = lastTokens.TakeLast(last_n_repeat).ToArray();
  ```

  with

  ```csharp
  var last_n_array = IEnumerableExtensions.TakeLast(lastTokens, last_n_repeat).ToArray();
  ```

  because I was getting the error:

  > The call is ambiguous between the following methods or properties: 'System.Linq.Enumerable.TakeLast\<TSource\>(System.Collections.Generic.IEnumerable\<TSource\>, int)' and 'LLama.Extensions.IEnumerableExtensions.TakeLast\<T\>(System.Collections.Generic.IEnumerable\<T\>, int)'
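For context, here is a self-contained sketch of that ambiguity and the disambiguation I used. The `Demo.Extensions.IEnumerableExtensions` class below is a stand-in I wrote to mimic the clash with the real `LLama.Extensions.IEnumerableExtensions`; it is not the actual LLamaSharp code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in extension class that clashes with System.Linq.Enumerable.TakeLast
// (available on .NET Standard 2.1), mirroring the situation in LLamaContext.cs.
namespace Demo.Extensions
{
    public static class IEnumerableExtensions
    {
        public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> source, int count)
        {
            // Keep a sliding window of the last `count` items.
            var buffer = new Queue<T>(count);
            foreach (var item in source)
            {
                if (buffer.Count == count) buffer.Dequeue();
                buffer.Enqueue(item);
            }
            return buffer;
        }
    }
}

namespace Demo
{
    using Demo.Extensions;

    public static class Program
    {
        public static void Main()
        {
            var tokens = new[] { 1, 2, 3, 4, 5 };

            // Writing tokens.TakeLast(3) here would not compile, because both
            // System.Linq.Enumerable and Demo.Extensions.IEnumerableExtensions
            // are in scope. Naming the static class picks one explicitly:
            var viaCustom = Demo.Extensions.IEnumerableExtensions.TakeLast(tokens, 3).ToArray();
            var viaLinq = System.Linq.Enumerable.TakeLast(tokens, 3).ToArray();

            Console.WriteLine(string.Join(",", viaCustom)); // 3,4,5
            Console.WriteLine(string.Join(",", viaLinq));   // 3,4,5
        }
    }
}
```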
Please do tell me if I did something wrong.
Thanks in advance!
Edit
If I ever decrease `ContextSize` from 1024 or increase `MaxTokens` to above ~150, Unity just crashes. I have narrowed the crash down to
```csharp
public bool Eval(ReadOnlySpan<int> tokens, int n_past, int n_threads)
{
    unsafe
    {
        fixed (int* pinned = tokens)
        {
            return NativeApi.llama_eval_with_pointer(this, pinned, tokens.Length, n_past, n_threads) == 0;
        }
    }
}
```

in `SafeLLamaContextHandle.cs`.
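My working guess (an assumption on my part, not something I have confirmed in the LLamaSharp or llama.cpp source) is that generation keeps evaluating tokens past the end of the context, so a smaller `ContextSize` or a larger `MaxTokens` eventually makes the native call write out of bounds. A standalone sketch of that arithmetic, with the kind of guard one could add before `NativeApi.llama_eval_with_pointer(...)`:

```csharp
using System;

public static class ContextGuardDemo
{
    // Returns true when a batch of `batchLength` tokens evaluated at
    // position `nPast` still fits inside a context of `contextSize` slots.
    // This is my own hypothetical bounds check, not LLamaSharp API.
    public static bool FitsInContext(int nPast, int batchLength, int contextSize)
        => nPast + batchLength <= contextSize;

    public static void Main()
    {
        // A prompt of ~50 tokens plus MaxTokens = 100 stays well inside 1024 slots:
        Console.WriteLine(FitsInContext(nPast: 50 + 100, batchLength: 1, contextSize: 1024)); // True

        // But once generation reaches the end of a small context, the next
        // batch no longer fits, which would explain a native out-of-bounds crash:
        Console.WriteLine(FitsInContext(nPast: 512, batchLength: 1, contextSize: 512)); // False
    }
}
```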
Update
I downloaded LLamaSharp again and compiled the example project (`LLama.Examples`) with the same `llama-2-7b-guanaco-qlora.Q2_K.gguf` model, and it works! So my issue has something to do with my changes or with the C# environment that Unity uses.