Add link to Dart binding for llama.cpp #4882
netdur commented on Jan 11, 2024
- This is a Flutter/Dart binding for llama.h
- Two high-level classes for easier usage
- Pre-built binaries for iOS / macOS
- WIP C binding for common.h
Link for convenience: https://github.com/netdur/llama_cpp_dart
Downloading and signing a pre-built binary (for Xcode) from a random CI server is a security problem, and since everything is open source there's no reason to do that. I think Flutter supports FFI on all platforms except web. I looked at your llama_common_c repo, and I don't see a typical FFI plugin setup with llama.cpp as a submodule in src.
MAID uses this approach for Android. I think it would be possible to use that approach to write a Flutter FFI plugin for llama.cpp that works cross-platform.
However, as far as I can gather, you can't combine server-side Dart FFI plugins with Flutter FFI plugins. I've been working on Dart bindings for llama.cpp, and that requires a dev build of Dart and the experimental native_assets system.
On another note, I think that having the package be officially released on https://pub.dev in a decent state with basic documentation should be a requirement before we link it in the README.
(There's already a llama_flutter and a llama_dart on pub.dev for llama.cpp, but I couldn't tell whether they're functional. The lack of documentation and of updates in 10 months makes me think they're not worth trying.)
@crasm, thanks for the insightful review. To clarify, llama_common_c mainly deals with … I understand your concern regarding the use of pre-built binaries. It's true that the iOS binary is signed on the CI server, but I also offer an alternative for those who prefer building from source. In the llama_common_c repository, there's a script that allows users to build the binaries themselves, using their local copy of llama.cpp. This approach ensures that users can either use the pre-built binary for convenience or compile the binary themselves for greater transparency and security. The … Your mention of MAID's approach is awesome; thank you for that suggestion. I might consider contributing to MAID instead of managing a separate project. Thanks again for your guidance!
I'm thinking we can wait until you have the other platforms working, or add a caveat that it's only for macOS/iOS for now. With how you're wrapping common.h, that should make it easier to keep your package up to date with new llama.cpp features. (In my Dart bindings, I haven't implemented CFG and had to reimplement much of the logic around sampling.) I would still prefer to have the build integrated with native_assets_cli instead of needing a separate system. I found sherpa, which uses native_assets in combination with Flutter, which may provide some inspiration (though it's not as complete as yours, I believe). Though just getting a dynamic library may be easier to integrate for most devs. Also pinging @cebtenzzre for approval.
Er... this isn't supposed to be a public API. It's meant to be used for internal examples and tests only.
@cebtenzzre, my use of …
@crasm absolutely, I hope you will continue exploring …
llama_sampling_params has C++ types because it's part of the examples, which are written in C++. Context length and RoPE scaling are set via llama_context_params, and you can decode as many batches of prompt as you want until you hit that limit. This API isn't internal to llama.cpp, it's internal to the examples and tests. There's nothing it does that isn't available to downstream users of the public llama.cpp API.
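To illustrate the point about decoding a prompt in several batches up to the context limit, here is a minimal sketch in plain Python. It does not call llama.cpp at all: `prompt_batches`, `n_ctx`, and `n_batch` are hypothetical names chosen for the illustration, and the token ids are stand-ins rather than real tokenizer output.

```python
from typing import Iterator, List


def prompt_batches(tokens: List[int], n_ctx: int, n_batch: int) -> Iterator[List[int]]:
    """Yield consecutive chunks of `tokens`, each at most `n_batch` long,
    stopping once `n_ctx` tokens in total have been handed out.

    Hypothetical helper for illustration only; not part of llama.cpp.
    """
    if n_ctx <= 0 or n_batch <= 0:
        raise ValueError("n_ctx and n_batch must be positive")
    # Tokens beyond the context length cannot be decoded, so drop them here.
    usable = tokens[:n_ctx]
    for start in range(0, len(usable), n_batch):
        yield usable[start:start + n_batch]


if __name__ == "__main__":
    fake_prompt = list(range(10))  # stand-in token ids
    for i, batch in enumerate(prompt_batches(fake_prompt, n_ctx=8, n_batch=3)):
        print(i, batch)
```

In a real binding, each yielded chunk would presumably be handed to the library's decode call in turn; the sketch only shows the chunking and the cut-off at the context length.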
I've replaced the C binding with a pure Dart implementation, now solely referencing llama.h. Additionally, I've documented the classes and published the package (note: it's a package, not a plugin) on pub.dev.
@netdur looks good! I'll approve after I give it a try locally.
@netdur I was unable to use your package successfully in an example project:
@crasm thanks for the testing! I appreciate it. I'll make sure to update the code as you suggest and test the package locally.
@crasm I've updated the package to include your suggestion. Although I've tested loading the package from pub.dev, further testing is still needed. Feel free to check it out now.
@crasm it is good to go, please review.
Looks good and works on my end.
I've been following all the implementations you've done (thanks!), and in the first one I was able to configure P, K and temp. The new implementation is missing those. Can we expect a mirror of …
@Solido, yes, that's because I've shifted from C to a full Dart implementation. I plan to integrate all the suggested features. Could you please open a feature request issue detailing the parameters you need? That way you can keep track of the implementation.
I would rather follow your roadmap, because not all parameters may be easy to integrate.