Llama_to_Llama.cpp

This project aims to create a guide for generating Llama.cpp GGUF model files from base Llama v2 model weights.

Getting Started

To run this project locally, you have two options:

  • Use VS Code's Dev Container [Preferred].

  • Create your own virtual environment.

Using a Dev Container in VS Code

The Visual Studio Code Dev Containers extension lets you use a container as a full-featured development environment. It allows you to open any folder inside (or mounted into) a container and take advantage of Visual Studio Code's full feature set.

Requirements

  • Docker installed locally.

  • VS Code Dev Containers Extension.

      Name: Dev Containers
      Id: ms-vscode-remote.remote-containers
      Description: Open any folder or repository inside a Docker container and take advantage of Visual Studio Code's full feature set.
      Version: 0.347.0
      Publisher: Microsoft
      VS Marketplace Link: https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers
    

For more details, please refer to: System Requirements

Usage

If you meet the requirements, you can get started by opening this project in VS Code and selecting Open in Dev Container when prompted.

If the prompt doesn't appear, you can press ctrl + shift + p on your keyboard and type: dev container. Select: Open in Dev Container.

This should display a status bar notification while the container builds.

After the container starts successfully, you are ready to use the code and add features.

Create a Virtual Environment

Open the terminal and run the following commands:

  1. Create virtual environment

    python -m venv .venv
  2. Activate virtual environment

    • For Windows:

      .venv/Scripts/activate
    • For Mac/Linux:

      source .venv/bin/activate
  3. Install dependencies

    python -m pip install -r .devcontainer/requirements.txt

After the dependencies install successfully, you are ready to use the code and add features.

Process

Download model from Hugging Face

Requirements

  • Hugging Face account.

    Can be created here.

  • Hugging Face API Token.

    Can be requested here.

  • Request access to Meta's Llama v2 models

    Can be requested here.

📝 NOTE: A Hugging Face token is needed to access gated models like Llama v2 and to authenticate with the Hugging Face CLI.
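
If you prefer the terminal to a notebook login cell, a minimal sketch, assuming the huggingface_hub CLI is available and your token is stored in an HF_TOKEN environment variable (both assumptions, not part of this repo's setup):

    # Authenticate with Hugging Face so gated models can be downloaded
    huggingface-cli login --token $HF_TOKEN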

To download a model from Hugging Face, run the notebook Model Downloader.

The output of this notebook should be the desired Hugging Face model inside the models directory.

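For reference, a roughly equivalent shell command, assuming a recent huggingface_hub CLI; the repo id below is an assumption based on the q4_0 repo named at the end of this guide:

    # Download the gated Llama v2 chat weights into the models directory
    huggingface-cli download meta-llama/Llama-2-7b-chat-hf --local-dir models/llama-2-7b-chat-hf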

Generating a GGUF file from Hugging Face Llama model weights

To convert the downloaded Llama model from the step above, run the notebook GGUF Converter.

The notebook's conversion cell is responsible for converting the model to a GGUF-compatible format.

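GGUF conversion is typically done with llama.cpp's converter script; a minimal sketch of the equivalent command line, assuming a llama.cpp checkout next to the models directory (the script was named convert.py in Llama v2-era checkouts; newer ones ship convert_hf_to_gguf.py):

    # Convert the Hugging Face weights into a 16-bit GGUF file
    python llama.cpp/convert.py models/llama-2-7b-chat-hf \
        --outtype f16 \
        --outfile models/llama-2-7b-chat-f16.gguf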

The remaining part of the notebook quantizes the model (reducing its memory footprint, though this may degrade response quality) and uploads the result to Hugging Face.

Model Quantization

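For reference, quantization uses llama.cpp's quantize tool; a minimal sketch, assuming the tool has been built from the llama.cpp sources and the f16 GGUF from the previous step exists (file names are illustrative):

    # Quantize the f16 GGUF down to 4-bit (q4_0), trading some quality for size
    ./llama.cpp/quantize models/llama-2-7b-chat-f16.gguf models/llama-2-7b-chat-q4_0.gguf q4_0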

Upload to Hugging Face

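A minimal shell equivalent, assuming a huggingface_hub CLI recent enough to provide the upload command and that the target repo already exists under your account:

    # Push the quantized GGUF to the Hugging Face repo named below
    huggingface-cli upload kevinknights29/llama-2-7b-chat-q4_0.gguf models/llama-2-7b-chat-q4_0.gguf llama-2-7b-chat-q4_0.gguf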

Repo: kevinknights29/llama-2-7b-chat-q4_0.gguf
