gguf_init_from_file: invalid magic characters #3905

Closed
GermanAizek opened this issue Nov 2, 2023 · 9 comments

Comments

@GermanAizek
Contributor

Download the model from here: https://huggingface.co/NousResearch/Nous-Hermes-Llama2-13b-GGML/tree/main

OS: Windows
Commit: 2756c4f

PS D:\llama.cpp> .\build\bin\Release\quantize.exe .\models\7B\ggml-model-f16.bin .\models\7B\ggml-model-q4_0.bin 2
main: build = 1470 (2756c4f)
main: built with MSVC 19.37.32822.0 for x64
main: quantizing '.\models\7B\ggml-model-f16.bin' to '.\models\7B\ggml-model-q4_0.bin' as Q4_0
gguf_init_from_file: invalid magic characters tjgg9☺.
llama_model_quantize: failed to quantize: llama_model_loader: failed to load model from .\models\7B\ggml-model-f16.bin

main: failed to quantize model from '.\models\7B\ggml-model-f16.bin'
@staviq
Contributor

staviq commented Nov 2, 2023

You need a GGUF model file.

https://huggingface.co/models?search=hermes%20gguf

EDIT: You can try using convert-llama-ggml-to-gguf.py, but typically you'd want to get the source model, convert it directly to GGUF, and then quantize.
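
For anyone who wants to check which kind of file they have before running quantize, here is a minimal Python sketch that looks at the first four bytes. It assumes the usual on-disk magics: GGUF files start with the ASCII bytes "GGUF", while the legacy formats stored a little-endian integer, so they read backwards as text (the "tjgg" in the error above is the legacy GGJT magic). The function name and the descriptions are my own, not part of llama.cpp.

```python
from pathlib import Path

# Four-byte magics as they appear on disk. The legacy values were written
# as little-endian uint32 constants, so they read "backwards" as text.
KNOWN_MAGICS = {
    b"GGUF": "GGUF (what current llama.cpp loads)",
    b"tjgg": "legacy GGJT (needs conversion)",
    b"fmgg": "legacy GGMF (needs conversion)",
    b"lmgg": "legacy unversioned GGML (needs conversion)",
}


def identify_model_file(path):
    """Describe the model format based on the first 4 bytes of the file."""
    with Path(path).open("rb") as f:
        magic = f.read(4)
    return KNOWN_MAGICS.get(magic, f"unknown magic {magic!r}")


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python identify.py models/7B/ggml-model-f16.bin
    for arg in sys.argv[1:]:
        print(arg, "->", identify_model_file(arg))
```

If this reports anything other than GGUF, the file has to be converted (or re-downloaded as GGUF) before quantize will accept it.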

@SatoX69

SatoX69 commented Dec 16, 2023

What about bin files?

@sultanmahood

I ran this command to load a Llama 2 model:
npx --no node-llama-cpp chat --model './llama.cpp'
but I got an error:

npx --no node-llama-cpp chat --model './llama.cpp'
gguf_init_from_file: invalid magic characters 'w'
error loading model: llama_model_loader: failed to load model from /llama.cpp

llama_load_model_from_file: failed to load model
Error: Failed to load model
at new LlamaModel (file:/node-llama-cpp

@staviq
Contributor

staviq commented Jan 9, 2024

I ran this command to load a Llama 2 model: npx --no node-llama-cpp chat --model './llama.cpp', but I got an error

I'm not familiar with the Node bindings for llama.cpp, but I'm pretty sure --model means a model file: one specific LLM, which would typically be a single file with a .gguf extension.
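
A quick way to sanity-check the value passed to --model before launching anything is sketched below in Python. The function name is hypothetical, and the ".gguf" extension check is only a convention, not something the loader enforces.

```python
from pathlib import Path


def model_path_problems(model_path):
    """Return a list of obvious problems with a --model argument."""
    problems = []
    p = Path(model_path)
    if not p.exists():
        problems.append("path does not exist")
    elif p.is_dir():
        problems.append("path is a directory, not a model file")
    elif p.suffix.lower() != ".gguf":
        # Only a naming convention; the real check is the file's magic bytes.
        problems.append(f"unexpected extension {p.suffix!r}, expected '.gguf'")
    return problems
```

An empty list means the path at least looks like a single GGUF file; a path such as './llama.cpp' would be flagged either as a directory or for its extension.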

@HansvanHespen

HansvanHespen commented Jan 14, 2024

Did I miss something? I downloaded my ggml .bin file 4 months ago and it still works fine locally, but now, when running in the cloud, I suddenly need a gguf file? Why? Is gguf newer, better, more compact, faster? Why does it run locally and not in the cloud? Is it the difference between Windows (local) and Linux (cloud)? Did the requirements.txt file cause a newer version of llama-cpp-python to be downloaded in the cloud? Does it have something to do with cuBLAS?

@abc-nix

abc-nix commented Jan 15, 2024

Did I miss something? I downloaded my ggml .bin file 4 months ago, still works fine locally, and now, when running on the cloud, I suddenly need a gguf file? Why?

See #2398: GGUF replaces the ggml .bin files. I think only koboldcpp still supports the older format; the official llama.cpp does not. Read the PR I linked for more info.

PR #2682 added a script to convert from ggml to GGUFv1. I'm not sure whether it can also convert to the newer version.

Your local llama.cpp must be an older version that still supports ggml. If you are happy with it, keep it, but any new installation (local or in the cloud) will only support gguf. Try koboldcpp for compatibility.
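
If you want to see which GGUF version a converted file actually is, the version number sits right after the magic. A minimal Python sketch follows; it assumes a little-endian file with a 32-bit version field following the 4-byte "GGUF" magic, and the function name is hypothetical.

```python
import struct


def read_gguf_version(path):
    """Return the GGUF version number, or None if the file is not GGUF."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != b"GGUF":
        return None
    # Assumption: the version is a little-endian uint32 right after the magic.
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

A return value of None means the file is not GGUF at all, which is the case that produces the "invalid magic characters" error.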

@HansvanHespen

Thank you abc-nix. I did indeed miss something. A couple of days after I installed the ggml files, gguf came out. I downloaded new versions of everything: the llama-cpp-python package, the gguf weights file, a new Visual Studio 2022, etc. It all works fine now... on my Windows PC. But when I deploy to the fly.io cloud, it crashes hard on llm = Llamacpp(modelpath="xxx.gguf").

No explanation, no message, just the scary SIGILL message: "Main child exited with signal (with signal 'SIGILL', core dumped? false)". Supposedly this is an illegal instruction caused by cross-compiling: a CPU feature that my CPU has and the fly.io CPU does not...

I've been working on it for a couple of days now, without success.

High regards,

Hans

@alllexx88

@HansvanHespen I got an Illegal instruction error when I tried to run llama.cpp built with AVX2 instructions on an older server that had AVX but no AVX2.
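
On a Linux host you can check this before deploying by reading the CPU flags. A minimal Python sketch, assuming an x86 Linux machine where /proc/cpuinfo has a "flags" line; the default list of required features is my guess at what a typical AVX2 build uses, so adjust it to your build options.

```python
def cpu_flags(cpuinfo_text):
    """Collect CPU feature flags from the text of /proc/cpuinfo (Linux, x86)."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def missing_features(cpuinfo_text, required=("avx", "avx2", "fma", "f16c")):
    """Return the required features that the CPU does not advertise."""
    have = cpu_flags(cpuinfo_text)
    return [name for name in required if name not in have]


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        print("missing:", missing_features(f.read()))
```

If the list is not empty on the target machine, a binary built with those features enabled can be expected to die with SIGILL there.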

@GermanAizek
Contributor Author

GermanAizek commented Feb 12, 2024

I had an Illegal instruction error when I tried to use llama.cpp built with AVX2 instructions on an older server, which had AVX, but no AVX2

@alllexx88,
It is possible to emulate AVX2 on a CPU that only supports AVX.

If you are on Windows (Server or a desktop edition) with an Intel CPU:

  1. Download the Intel Software Development Emulator (sde.exe) from the website (https://www.intel.ru/content/www/en/en/download/684897/intel-software-development-emulator.html).

  2. Unzip it to any location. I unpacked it into the bin folder.

  3. Right-click sde.exe and create a shortcut on the desktop.

  4. Right-click the shortcut and open "Properties".

    In the "Target" field, enter "D:\GIT\llama.cpp\bin\sde-external-9.7.0-2022-05-09-win\sde.exe" -hsw -- "D:\GIT\llama.cpp\bin\llama_avx2_binary.exe"

    That is, the first part is the path to sde.exe and the second is the path to the binary that requires AVX2.

    Between them goes -hsw -- (the two dashes are required).

    Run the shortcut as administrator and wait 1-3 minutes.
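
The same launch can be scripted instead of going through a shortcut. This is a minimal Python sketch of the command line described above; the helper names are hypothetical, the paths are the ones from this post, and the only SDE options used are the -hsw and -- shown in the steps.

```python
import subprocess


def sde_command(sde_exe, target_exe, target_args=()):
    """Build the argument list: sde.exe -hsw -- <binary> [binary args...]."""
    return [sde_exe, "-hsw", "--", target_exe, *target_args]


def run_under_sde(sde_exe, target_exe, target_args=()):
    """Run the binary under the emulator and return its exit code."""
    return subprocess.run(sde_command(sde_exe, target_exe, target_args)).returncode


if __name__ == "__main__":
    # Paths taken from the steps above; change them to match your setup.
    cmd = sde_command(
        r"D:\GIT\llama.cpp\bin\sde-external-9.7.0-2022-05-09-win\sde.exe",
        r"D:\GIT\llama.cpp\bin\llama_avx2_binary.exe",
    )
    print(cmd)
```

Everything after the -- is passed to the emulated binary, so its own arguments go in target_args.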
