libllama2c

libllama2c is a fork of karpathy/llama2.c (Llama 2 inference in one file of pure C) with the following improvements. Improvements, that is, on code written by Karpathy himself 🤣.

  • A modified CLI that plays nicely with environment variables.
  • A llama2c.h API that other applications can use.
  • make install targets that install the libllama2c shared library and the llama2c CLI into standard directories.

Usage

Start by cloning this repo and running make to build and run the tests.

git clone --recurse-submodules https://github.com/Florents-Tselai/libllama2c.git &&\
cd libllama2c &&\
make all test

CLI

You can, of course, run the llama2c executable as expected.

You can either pass arguments on the command line:

./llama2c ./models/stories15M.bin -i "Hello world" -z ./models/tokenizer.bin -n 100 -t 0.9

You can also set environment variables, or even mix the two:

export LLAMA2C_MODEL_PATH=./models/stories15M.bin
export LLAMA2C_TOKENIZER_PATH=./models/tokenizer.bin
./llama2c -i "Hello world"
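
For example, you can take the model and tokenizer paths from the environment variables above and still control sampling with flags (all options used here are listed in the CLI reference below):

./llama2c -i "Hello world" -t 0.8 -n 50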

CLI reference

Usage:   llama2c <model> [options]
Example: llama2c -f path/to/model.bin -n 256 -i "Once upon a time"
Options (and corresponding environment variables):
  -f <path>   Path to model file. Env: LLAMA2C_MODEL_PATH
  -t <float>  Temperature in [0,inf], default 1.0. Env: LLAMA2C_TEMPERATURE
  -p <float>  P value in top-p (nucleus) sampling in [0,1], default 0.9. Env: LLAMA2C_TOPP
  -s <int>    Random seed, default time(NULL). Env: LLAMA2C_RNG_SEED
  -n <int>    Number of steps to run for, default 256. 0 = max_seq_len. Env: LLAMA2C_STEPS
  -i <string> Input prompt. Env: LLAMA2C_PROMPT
  -z <path>   Path to custom tokenizer. Env: LLAMA2C_TOKENIZER_PATH
  -m <string> Mode: generate|chat, default: generate. Env: LLAMA2C_MODE
  -y <string> (Optional) System prompt in chat mode. Env: LLAMA2C_SYSTEM_PROMPT

Library

#include <stdio.h>

#include "llama2c.h"

int main(int argc, char *argv[]) {
    /* Default configuration */
    Llama2cConfig config;
    config.model_path = "models/stories15M.bin";
    config.tokenizer_path = "models/tokenizer.bin";
    config.temperature = 1.0f;
    config.topp = 0.9f;
    config.steps = 256;
    config.prompt = NULL;
    config.rng_seed = 0;   /* 0 -> seeded from the current time */
    config.mode = "generate";
    config.system_prompt = NULL;

    /* Override the defaults for this run */
    config.prompt = "Hello world!";
    config.steps = 10;

    /* Generate */
    char *generated = llama2c_generate(config);
    printf("Generated text: %s\n", generated);

    /* Encode the prompt into tokens */
    int *prompt_tokens = NULL;
    int num_prompt_tokens = 0;
    llama2c_encode(config, &prompt_tokens, &num_prompt_tokens);

    return 0;
}
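
To compile against the installed library (assuming make install has placed llama2c.h and libllama2c somewhere your toolchain already searches, e.g. under /usr/local):

cc main.c -lllama2c -o main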

Installation

You can install both the CLI and the library:

PREFIX=/usr/local make all install

Although personally I prefer:

PREFIX=$HOME/.local make all install
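
With a non-standard prefix like this, the dynamic loader may not find the shared library at run time; assuming libraries land under $PREFIX/lib, you can point it there:

export LD_LIBRARY_PATH=$HOME/.local/lib:$LD_LIBRARY_PATH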

To uninstall:

PREFIX=$HOME/.local make uninstall

Motivation

He just wanted a decent LLM library to import in his C application ... Not too much to ask, was it? It was in 2024 when Florents stood on his keyboard looking for something good to read on his journey. His choice was limited to bloated .cpp packages and poor-quality Python libraries leading to dependency hell. Flo's disappointment and subsequent anger at the range of libraries available led him to fork a repository.
