Commit 38c5d16

feat(docs): updating the documentation on fine tuning and advanced guide. (#5420)
Updates the documentation on fine-tuning and the advanced guide. This mirrors how modern versions of llama.cpp operate.
1 parent ef6fc05 commit 38c5d16

File tree

1 file changed (+5, -6 lines)

docs/content/docs/advanced/fine-tuning.md

Lines changed: 5 additions & 6 deletions
@@ -118,19 +118,18 @@ And we convert it to the gguf format that LocalAI can consume:
 
 # Convert to gguf
 git clone https://github.com/ggerganov/llama.cpp.git
-pushd llama.cpp && make GGML_CUDA=1 && popd
+pushd llama.cpp && cmake -B build -DGGML_CUDA=ON && cmake --build build --config Release && popd
 
 # We need to convert the pytorch model into ggml for quantization
 # It creates 'ggml-model-f16.bin' in the 'merged' directory.
-pushd llama.cpp && python convert.py --outtype f16 \
-../qlora-out/merged/pytorch_model-00001-of-00002.bin && popd
+pushd llama.cpp && python3 convert_hf_to_gguf.py ../qlora-out/merged && popd
 
 # Start off by making a basic q4_0 4-bit quantization.
 # It's important to have 'ggml' in the name of the quant for some
 # software to recognize its file format.
-pushd llama.cpp && ./quantize ../qlora-out/merged/ggml-model-f16.gguf \
-../custom-model-q4_0.bin q4_0
+pushd llama.cpp/build/bin && ./llama-quantize ../../../qlora-out/merged/Merged-33B-F16.gguf \
+../../../custom-model-q4_0.gguf q4_0
 
 ```
 
-Now you should have ended up with a `custom-model-q4_0.bin` file that you can copy in the LocalAI models directory and use it with LocalAI.
+Now you should have ended up with a `custom-model-q4_0.gguf` file that you can copy in the LocalAI models directory and use it with LocalAI.
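For context, a minimal sketch of how the resulting quantized file might then be wired into LocalAI. The `models/` directory path, the model name `custom-model`, the YAML filename, and the port 8080 endpoint are assumptions for illustration, not part of this commit:

```bash
# Hypothetical follow-up (not part of this commit): make the quantized model
# available to LocalAI and query it through the OpenAI-compatible API.

# Copy the quantized model into LocalAI's models directory (path is an assumption).
cp custom-model-q4_0.gguf models/

# Minimal model definition so the file can be addressed by name
# (LocalAI can also pick up the .gguf directly by its filename).
cat > models/custom-model.yaml <<'EOF'
name: custom-model
parameters:
  model: custom-model-q4_0.gguf
EOF

# Ask the model something (assumes LocalAI is listening on localhost:8080).
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "custom-model", "messages": [{"role": "user", "content": "Hello"}]}'
```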
