gguf_init_from_file: invalid magic characters #3905
Comments
You need a GGUF model file. https://huggingface.co/models?search=hermes%20gguf EDIT: You can try using convert-llama-ggml-to-gguf.py, but typically you'd want to get the source model, convert it directly to GGUF, and then quantize.
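For reference, a rough sketch of that workflow with the llama.cpp scripts from that era (the paths and file names below are placeholders, and exact flags can differ between versions, so check each script's --help):

```sh
# Option A (assuming you have the original HF model): convert straight to GGUF, then quantize
python convert.py /path/to/hf-model --outfile model-f16.gguf
./quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M

# Option B (assuming you only have an old ggml .bin): convert it to GGUF
python convert-llama-ggml-to-gguf.py --input old-model.ggml.bin --output model.gguf
```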
What about bin files?
I ran this command to load a Llama 2 model: npx --no node-llama-cpp chat --model './llama.cpp' and it fails with llama_load_model_from_file: failed to load model
I'm not familiar with the Node bindings for llama.cpp, but I'm pretty sure --model needs to point to an actual GGUF model file, not to the './llama.cpp' directory.
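Something along these lines should work, assuming you've downloaded a GGUF model first (the model file name below is just a placeholder):

```sh
# Point --model at a real .gguf file, not at the llama.cpp source directory (placeholder path)
npx --no node-llama-cpp chat --model './models/llama-2-7b-chat.Q4_K_M.gguf'
```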
Did I miss something? I downloaded my ggml .bin file 4 months ago and it still works fine locally, but now, when running on the cloud, I suddenly need a GGUF file. Why? Is GGUF newer, better, more compact, faster? Why does it run locally and not in the cloud? Is it a difference between Windows (local) and the cloud (Linux)? Or did the requirements.txt pull in a newer version of llama-cpp-python on the cloud? Does it have something to do with cuBLAS?
See #2398. GGUF replaces the ggml .bin files. I think only koboldcpp still supports the older format, not the official llama.cpp; read the merge request I linked for more info. PR #2682 added a script to convert from ggml to GGUFv1 (not sure if it can also convert to the newer version). Your local llama.cpp must be an older version which still supported ggml. If you are happy with it, keep it, but any new installation (local or in the cloud) will only support GGUF. Try koboldcpp for compatibility.
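If you're not sure which format a given file actually is, you can check its first four bytes: GGUF files start with the ASCII magic "GGUF", while the older ggml-era formats use different magics. A minimal sketch (the file name is a placeholder):

```sh
# Prints "GGUF" for a GGUF file; anything else means an older ggml-era format
head -c 4 model.bin && echo
```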
Thank you, abc-nix. I did indeed miss something. A couple of days after I installed the ggml files, GGUF came out. I downloaded new versions of everything: the llama-cpp-python package, the GGUF weights file, a new Visual Studio 2022, etc. It all works fine now... on my Windows PC. But if I upload it to the fly.io cloud, it crashes hard on llm = Llamacpp(modelpath="xxx.gguf"). No explanation, no message, just the scary SIGILL message: "Main child exited with signal (with signal 'SIGILL', core dumped? false)". Supposedly an illegal instruction due to cross compiling, i.e. a CPU feature my CPU has and the fly.io CPU doesn't. I've been working on it for a couple of days now, without success. High regards, Hans
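One thing that sometimes helps with SIGILL on cloud machines is rebuilding llama-cpp-python from source on the target host with the newer instruction sets turned off. A rough sketch, assuming a 2023-era build where these CMake options exist (option names may differ between versions):

```sh
# Force a source rebuild with AVX2/FMA/F16C disabled (assumed option names; check the version you use)
CMAKE_ARGS="-DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF -DLLAMA_F16C=OFF" \
  pip install --force-reinstall --no-cache-dir llama-cpp-python
```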
@HansvanHespen I had an Illegal instruction error when I tried to use llama.cpp built with AVX2 instructions on an older server, which had AVX but no AVX2.
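On Linux you can quickly check which vector extensions the host CPU actually advertises, e.g.:

```sh
# List the AVX-related flags the CPU reports (Linux only)
grep -o 'avx[^ ]*' /proc/cpuinfo | sort -u
```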
@alllexx88, if you're on Windows Server or a consumer version of Windows with an Intel CPU, then:
Download model from here: https://huggingface.co/NousResearch/Nous-Hermes-Llama2-13b-GGML/tree/main
OS: Windows
Commit: 2756c4f