This repository was archived by the owner on Jun 24, 2024. It is now read-only.

Let's collaborate #4

Closed

@philpax

[apologies for early send, accidentally hit enter]

Hey there! Turns out we think on extremely similar wavelengths - I did the exact same thing as you, for the exact same reasons (libraryification), and with similar abstractions: https://github.com/philpax/ggllama

A couple of differences I spotted on a quick perusal:

  • My version builds on both Windows and Linux, but fails to infer correctly past the first round. Windows performance is also pretty crappy because ggml doesn't support multithreading on Windows.
  • I use PhantomData with the Tensors to prevent them outliving the Context they're spawned from (see the sketch after this list).
  • I vendored llama.cpp in so that I could track it more directly and use its ggml.c/h, and to make it obvious which version I was porting.
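
For anyone curious, that PhantomData pattern looks roughly like this. This is a minimal sketch rather than the actual ggllama types: the struct internals here are placeholders standing in for the pointers real code would get back from ggml.

```rust
use std::marker::PhantomData;

/// Owns a ggml-style allocation arena (internals elided; real code
/// would hold the pointer returned by ggml_init()).
pub struct Context {
    #[allow(dead_code)]
    raw: *mut std::ffi::c_void,
}

/// A tensor handle that borrows from the `Context` it was created in.
/// The `PhantomData<&'a Context>` field makes the borrow checker treat
/// every `Tensor` as holding a reference to its `Context`, so a tensor
/// cannot outlive the context that owns its memory.
pub struct Tensor<'a> {
    #[allow(dead_code)]
    raw: *mut std::ffi::c_void,
    _context: PhantomData<&'a Context>,
}

impl Context {
    pub fn new() -> Self {
        Context { raw: std::ptr::null_mut() }
    }

    /// Creating a tensor ties its lifetime to `&self`.
    pub fn new_tensor(&self) -> Tensor<'_> {
        // Real code would call one of the ggml_new_tensor_*() functions here.
        Tensor { raw: std::ptr::null_mut(), _context: PhantomData }
    }
}

fn main() {
    let ctx = Context::new();
    let tensor = ctx.new_tensor();
    // Uncommenting the next two lines is a compile error, because `tensor`
    // still borrows `ctx`:
    //   drop(ctx);          // error[E0505]: cannot move out of `ctx`
    //   let _ = &tensor;    //               because it is borrowed
    let _ = &tensor;
}
```

The nice property is that dropping the Context while a Tensor is still alive becomes a compile error instead of a use-after-free at runtime.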

Given yours actually works, I think that it's more promising :p

What are your immediate plans, and what do you want people to help you out with? My plan was to get it working, then librarify it, make a standalone Discord bot with it as a showcase, and then investigate using a Rust-native solution for the tensor manipulation (burn, ndarray, arrayfire, etc.) to free it from the ggml dependency.
