
Any tutorial to convert PyTorch model to GGUF? #705

Closed
BayRanger opened this issue Jan 24, 2024 · 4 comments

@BayRanger

Hi, thanks for this awesome lib. Is there any tutorial I can use as a reference for converting a self-designed PyTorch model to a GGUF file/model?

Best regards
HCX

@YavorGIvanov
Collaborator

I think you just have to use the convert.py script as described in the llama.cpp README:

# obtain the original LLaMA model weights and place them in ./models
ls ./models
65B 30B 13B 7B tokenizer_checklist.chk tokenizer.model
  # [Optional] for models using BPE tokenizers
  ls ./models
  65B 30B 13B 7B vocab.json

# install Python dependencies
python3 -m pip install -r requirements.txt

# convert the 7B model to ggml FP16 format
python3 convert.py models/7B/ 
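
If I remember the README correctly, it then shows an optional follow-up quantization step (paths assume the default convert.py output name; adjust if yours differs):

# quantize the model to 4-bits (using the q4_0 method)
./quantize ./models/7B/ggml-model-f16.gguf ./models/7B/ggml-model-q4_0.gguf q4_0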

@BayRanger
Author

Thanks for the reply. Besides the model parameters, how do I convert the designed model structure to ggml format?

@BayRanger reopened this Jan 25, 2024
@tak2hu commented Jan 25, 2024

A bit unrelated, but I tried converting a (PyTorch) safetensors model into GGUF by following the gguf-py example.
You could adapt this for PyTorch by replacing the safetensors file with a PyTorch state dictionary (see the sketch after the code below).

from safetensors import safe_open
from gguf import GGUFWriter

model_path = "model.safetensors"
# open with numpy as the framework so get_tensor() returns numpy arrays,
# which is what GGUFWriter.add_tensor expects
model_st = safe_open(model_path, framework="numpy")

gguf_writer = GGUFWriter("model.gguf", "example arch")

# metadata (key/value pairs stored in the GGUF header)
gguf_writer.add_block_count(3)
gguf_writer.add_uint32("answer", 42)  # write a 32-bit integer
gguf_writer.add_float32("answer_in_float", 42.0)  # write a 32-bit float

# copy every tensor from the safetensors file into the GGUF file
for layer in model_st.keys():
    tmp_tensor = model_st.get_tensor(layer)
    gguf_writer.add_tensor(layer, tmp_tensor)

# write header
gguf_writer.write_header_to_file()
# write metadata
gguf_writer.write_kv_data_to_file()
# write tensors
gguf_writer.write_tensors_to_file()

gguf_writer.close()
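
For reference, a minimal sketch of the PyTorch adaptation mentioned above, assuming the .pth file stores the state dict directly ("model.pth" and the architecture string are placeholders):

import torch
from gguf import GGUFWriter

# load the checkpoint on the CPU; assumes the file holds the state dict itself
state_dict = torch.load("model.pth", map_location="cpu")

gguf_writer = GGUFWriter("model.gguf", "example arch")
gguf_writer.add_block_count(3)

for name, tensor in state_dict.items():
    # add_tensor expects numpy arrays; cast to float32 first,
    # since numpy has no bfloat16 type
    gguf_writer.add_tensor(name, tensor.to(torch.float32).numpy())

gguf_writer.write_header_to_file()
gguf_writer.write_kv_data_to_file()
gguf_writer.write_tensors_to_file()
gguf_writer.close()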

I'm still trying to figure out the next step after loading the GGUF model by following this example (i.e., building the compute graph).
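
In the meantime, a minimal sketch for sanity-checking the written file from Python, assuming gguf-py's GGUFReader (the API may differ between versions):

from gguf import GGUFReader

reader = GGUFReader("model.gguf")

# list the tensors that were written, with their shapes
for t in reader.tensors:
    print(t.name, t.shape)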

@BayRanger
Copy link
Author

Well, since there has been no further feedback on my question, I guess the answer is that there is no formal tutorial for getting a causal model to run inference in GGUF format. Converting the weights from a .pth file to GGUF is not much of a challenge, but loading those weights into the inference framework takes real effort, because you have to be familiar with the framework and write the inference code yourself! I plan to close this issue soon, but if my opinion is wrong, please let me know.
