Open
Description
Could you please publish the source code on GitHub so that it is easy to patch and recompile? Alternatively, could you upstream the Ampere-specific optimizations to the llama.cpp project? At present, the optimized build is unusable for many of the newer models, because it is based on an old version of the code that lacks later bug fixes.
For comparison, see how easily stock llama.cpp runs fast on AWS Graviton chips: https://github.com/aws/aws-graviton-getting-started/blob/main/machinelearning/llama.cpp.md