This repository was archived by the owner on Jan 11, 2025. It is now read-only.

Commit

update for recent llama.cpp updates
rizitis committed Jun 1, 2024
1 parent 4938a74 commit 41fe3dc
Showing 3 changed files with 41 additions and 37 deletions.
27 changes: 12 additions & 15 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,3 @@
update 1/6/24: after recent updates of llama.cpp the script needs some edits; will do what is needed soon

Script is based on the [GratisStudio](https://github.com/3Simplex/GratisStudio/blob/main/LlamaCpp/Quantizing_with_LlamaCpp.md) HowTo for Windows.
It is tested on Slackware64 current systems without issues. If you find a bug, please open an issue.

@@ -31,23 +29,22 @@ Normally all other needs should be by default in your distro, if not..when scrip


## USAGE
1. When you find the LLM you want from [https://huggingface.co](https://huggingface.co)<br>
Copy the model url, then: <br>
2. Open the script with your favorite text editor (emacs, vim, nano, gedit, etc.)<br>
Find this line and replace the url with yours.
```
#---------------------------------------------------------------------------------------------------------------------#
MODEL_URL=https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B #<---Replace or add your model repo URL
#---------------------------------------------------------------------------------------------------------------------#
1. Make the script executable if it is not already...<br>
```
chmod +x quantizing_ai_models.sh
```
2. Find the LLM you want from [https://huggingface.co](https://huggingface.co)<br>
Copy ONLY the url provided for git clone, for example: <br>
![copy url](./model-url.png)
3. Now execute the script in a terminal followed by the model url, for example:
```
./quantizing_ai_models.sh https://huggingface.co/Cadenza-Labs/dolphin-llama3-8B-sleeper-agent-standard-l
```


3. Next, make the script executable if it is not already...<br>
`chmod +x quantizing_ai_models.sh`<br>

4. Finally, run the script `./quantizing_ai_models.sh`

5. Just answer questions if needed and wait for results...
5. Just answer questions when needed and wait for results...

6. If you have success 👊 you can now load your model.gguf using gpt4all app.

Binary file added model-url.png
51 changes: 29 additions & 22 deletions quantizing_ai_models.sh
100644 → 100755
Original file line number Diff line number Diff line change
@@ -23,21 +23,21 @@
# ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
#
# *************************************************************************#
# ========== Needed:============== #
## python3.11 --> {numpy,sentencepiece,gguf} #
## GPT4All(LLM environment)-> https://github.com/rizitis/GPT4All.SlackBuild#
## https://gpt4all.io/index.html <-- #
## git lfs #
# =========================================================================#
# #
# ========= OPTIONAL:============= #
## Vulkan SDK (AMD GPU Support) #
## Cuda toolkit (Nvidia GPU Support) #
# *************************************************************************#
# ****************************************************************************#
# ========== Needed:============== #
## 1. python3.11 --> {numpy,sentencepiece,gguf} #
## 2. GPT4All(LLM environment)-> https://github.com/rizitis/GPT4All.SlackBuild#
## https://gpt4all.io/index.html OR from your package manager #
## 3. git lfs #
# ============================================================================#
# #
# ========= OPTIONAL:============= #
## Vulkan SDK (AMD GPU Support) #
## Cuda toolkit (Nvidia GPU Support) #
# ****************************************************************************#

#---------------------------------------------------------------------------------------------------------------------#
MODEL_URL=https://huggingface.co/ilsp/Meltemi-7B-Instruct-v1 #<---Replace or add your model repo URL
MODEL_URL="$1" #<--- Replace or add your model repo URL, or execute the script like this: ./quantizing_ai_models.sh <https://huggingface.co/.... >
#---------------------------------------------------------------------------------------------------------------------#

if [ "$(id -u)" -eq 0 ]; then
@@ -122,9 +122,10 @@ fi

cd "$CWD"/models || exit 1

#git lfs install
#git clone "$MODEL_URL"

git lfs install
set +e
git clone "$MODEL_URL"
set -e # we don't need what was disabled for security reasons, but we also don't want the script to stop :)
# Let's use some of the hidden power of bash scripting ;)
# Get all model directories
MATCHING_DIRS=$(find . -maxdepth 1 -type d)
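The `set +e` / `set -e` pair around the clone is a deliberate pattern: the clone may fail harmlessly (for example, when the directory already exists from a previous run), and only that one command should be tolerated. A minimal sketch of the pattern, with `false` standing in for `git clone "$MODEL_URL"`:

```shell
#!/bin/sh
set -e  # strict mode: any failing command aborts the script

# Suspend strict mode only around the command allowed to fail;
# `false` is a stand-in for: git clone "$MODEL_URL"
set +e
false
status=$?
set -e

if [ "$status" -ne 0 ]; then
    echo "clone failed (exit $status), continuing anyway"
fi
echo "script still running"
```

Without the `set +e`, a failing clone would terminate the whole script under `set -e` before the directory-matching logic below ever ran.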
@@ -165,7 +166,7 @@ else
exit 69
fi

echo -e "${BLUE}Are you converting a Llama model or mistral? $TARGET_DIR (llama/mistral):${RESET}"
echo -e "${BLUE}Are you converting a llama model or mistral? $TARGET_DIR (llama/mistral):${RESET}"
read BPE_LLAMA_MISTRAL

if [ "$BPE_LLAMA_MISTRAL" == "llama" ]; then
@@ -175,15 +176,17 @@ echo -e "${BLUE}Are you converting a Llama3 model? $TARGET_DIR (yes/no):${RESET}
read BPE_FILE_FOUND



# After the last changes in llama.cpp I will keep this here for a while, just for people that don't update their llama.cpp (91,92)
# If you don't have a very important reason, then it is suggested to follow llama.cpp updates...
# I will keep convert.py here, but not forever; especially if the script some day breaks, I will absolutely remove it.
if [ "$BPE_FILE_FOUND" == "yes" ]; then
echo -e "${GREEN}Yupiii, Llama3 model found: $BPE_FILE_FOUND ${RESET}"
cd "$CWD" || exit 1
if python3 convert.py models/"$TARGET_DIR"/ --outtype f16 --vocab-type bpe; then
echo -e "${GREEN}Conversion successful using convert.py${RESET}"
else
echo -e "${RED}Conversion using convert.py failed, trying alternative...${RESET}"
if python3 convert-hf-to-gguf.py models/"$TARGET_DIR"/ --outtype f16; then
if python3 convert-hf-to-gguf.py --outtype f16 models/"$TARGET_DIR"/; then
echo -e "${GREEN}Conversion successful using convert-hf-to-gguf.py${RESET}"
else
echo -e "${RED}Both conversion methods failed${RESET}"
@@ -197,7 +200,7 @@ else
echo -e "${GREEN}Conversion successful using convert.py${RESET}"
else
echo -e "${RED}Conversion using convert.py failed, trying alternative...${RESET}"
if python3 convert-hf-to-gguf.py models/"$TARGET_DIR"/ --outtype f16; then
if python3 convert-hf-to-gguf.py --outtype f16 models/"$TARGET_DIR"/; then
echo -e "${GREEN}Conversion successful using convert-hf-to-gguf.py${RESET}"
else
echo -e "${RED}Both conversion methods failed${RESET}"
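The conversion logic above follows a try-then-fallback shape: attempt `convert.py` first, and only on failure fall back to `convert-hf-to-gguf.py`. A minimal sketch of that control flow, with `false`/`true` standing in for the two python converters (the assumed outcomes are noted in the comments):

```shell
#!/bin/sh
# Try-then-fallback pattern mirroring the conversion logic.
primary()  { false; }   # stand-in: assume convert.py fails
fallback() { true; }    # stand-in: assume convert-hf-to-gguf.py succeeds

if primary; then
    echo "Conversion successful using convert.py"
elif fallback; then
    echo "Conversion successful using convert-hf-to-gguf.py"
else
    echo "Both conversion methods failed" >&2
    exit 69
fi
```

Only when both attempts fail does the script bail out with a non-zero exit code, matching the error branch above.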
@@ -250,11 +253,13 @@ echo "MISTRAL..."
sleep 2
cd "$CWD" || exit 1
# Convert to fp16
#fp16=".fp16.bin"
# convert.py is removed ... so we use examples/convert-legacy-llama.py
# If you haven't updated your llama.cpp and the script fails, uncomment the next line and comment out the one after it:

#python3 convert.py models/"$TARGET_DIR"/ --pad-vocab --outtype f16
python3 examples/convert-legacy-llama.py models/"$TARGET_DIR"/ --pad-vocab --outtype f16


mv "$CWD"/models/"$TARGET_DIR"/*.gguf "$CWD"/build/bin/ggml-model-f16.gguf || exit 12

# Quantize the model for each method in the QUANTIZATION_METHODS list
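The per-method quantization loop can be sketched as below; note this is a dry-run illustration, and both the method list and the quantize invocation shown in the comment are assumptions based on common llama.cpp setups, not necessarily the script's exact values:

```shell
#!/bin/sh
# Dry-run sketch: iterate over quantization methods.
QUANTIZATION_METHODS="Q4_0 Q4_K_M Q5_K_M Q8_0"
for METHOD in $QUANTIZATION_METHODS; do
    # The real script would invoke the llama.cpp quantize binary here,
    # roughly: ./build/bin/quantize ggml-model-f16.gguf model-"$METHOD".gguf "$METHOD"
    echo "would quantize with $METHOD"
done
```

Each pass reads the f16 GGUF produced by the conversion step and writes one quantized model per method.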
@@ -285,8 +290,9 @@ else
# Check if the rename (mv) command was successful
if [ $? -eq 0 ]; then
echo -e "${GREEN}File renamed to ${TARGET_DIR}-Q4_0.gguf ${RESET}"
echo -e "${GREEN}Model moved to llama.cpp/build/bin/${RESET}"
else
echo -e "${RED}Error: Failed to rename file.${RESET}"
echo -e "${RED}Error: Failed to rename file or move model to llama.cpp/build/bin/${RESET}"
exit 3
fi
fi
@@ -296,6 +302,7 @@ fi
echo -e "${GREEN}SUCCESS...${RESET}"
echo ""
echo ""
echo "You can now load llama.cpp/build/bin/${TARGET_DIR}-Q4_0.gguf using:"
cat << "EOF"
.----------------. .----------------. .----------------. .----------------. .----------------. .----------------. .----------------.
| .--------------. || .--------------. || .--------------. || .--------------. || .--------------. || .--------------. || .--------------. |
