Update README
clebert committed Oct 22, 2023
1 parent f4c452e commit ebf217f
Showing 1 changed file with 27 additions and 6 deletions.
33 changes: 27 additions & 6 deletions README.md
@@ -12,6 +12,9 @@ Build and run `llama2-generator`:

```sh
zig build -Doptimize=ReleaseFast
```

```sh
./zig-out/bin/llama2-generator models/tinystories_15m --temperature 0 --verbose
```

@@ -33,34 +36,43 @@ Install `git-lfs` and clone the [Llama 2 7B](https://huggingface.co/meta-llama/L
```sh
# Make sure you have git-lfs installed (https://git-lfs.com)
git lfs install
```

```sh
git clone https://huggingface.co/meta-llama/Llama-2-7b-hf
```
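The two commands above can be guarded so that a missing `git-lfs` is caught before the clone starts; without it, a clone fetches only small LFS pointer files instead of the model weights. The following is a minimal sketch, not part of this repository: `require_cmd` and `clone_with_lfs` are hypothetical helper names, and it assumes `git-lfs` is installed as an executable on `PATH`.

```sh
# Hypothetical helper: succeed only if the named command is found on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: stop early with a hint if git-lfs is missing,
# otherwise run the same two commands as above.
clone_with_lfs() {
  if ! require_cmd git-lfs; then
    echo "git-lfs not found; see https://git-lfs.com" >&2
    return 1
  fi
  git lfs install && git clone "$1"
}
```

Usage would look like `clone_with_lfs https://huggingface.co/meta-llama/Llama-2-7b-hf`.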

Install the necessary Python packages and convert the Hugging Face model:

```sh
pip3 install -r requirements.txt
```

```sh
python3 convert_hf_model.py /path/to/Llama-2-7b-hf models/llama2_7b_hf
```

Build and run `llama2-generator`:

```sh
zig build -Doptimize=ReleaseFast
```

```sh
./zig-out/bin/llama2-generator models/llama2_7b_hf \
  --prompt "Once Upon a Time" \
  --sequence_length 28 \
  --temperature 0 \
  --thread_count 8 \
  --verbose
```
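The command above hard-codes `--thread_count 8`. On a different machine, one option is to derive the value from the number of online CPUs. This is only a sketch: `cpu_count` is a hypothetical helper, it assumes `getconf _NPROCESSORS_ONLN` is available (it falls back to 1 otherwise), and whether using every logical core gives the best throughput is an assumption worth measuring rather than a recommendation.

```sh
# Hypothetical helper: print the number of online CPUs,
# falling back to 1 if it cannot be determined.
cpu_count() {
  n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
  case "$n" in
    # Empty or non-numeric output: use the fallback.
    ''|*[!0-9]*) n=1 ;;
  esac
  echo "$n"
}
```

The result can then be passed as `--thread_count "$(cpu_count)"` in place of the fixed value.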

The output on an Apple M1 Pro with 32 GB of memory:

```
Once Upon a Time in Hollywood is a 2019 American comedy-drama film written and directed by Quentin Tarantino
achieved: 3.749 tok/s
```
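To compare runs, the throughput figure can be pulled out of the verbose output. A minimal sketch, assuming the `achieved: <number> tok/s` line format shown above; `extract_tok_per_s` is a hypothetical helper that reads from standard input, and whether the generator writes this line to stdout or stderr is not stated here, so a `2>&1` redirect may be needed.

```sh
# Hypothetical helper: print the number from lines of the form
# "achieved: <number> tok/s" read on standard input.
extract_tok_per_s() {
  awk '$1 == "achieved:" && $3 == "tok/s" { print $2 }'
}
```

For example, piping the generator's output through `extract_tok_per_s` would print only the numeric value from that line.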

## Run Llama 2 7B Chat from Hugging Face
@@ -70,20 +82,29 @@ Install `git-lfs` and clone the [Llama 2 7B Chat](https://huggingface.co/meta-ll
```sh
# Make sure you have git-lfs installed (https://git-lfs.com)
git lfs install
```

```sh
git clone https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
```

Install the necessary Python packages and convert the Hugging Face model:

```sh
pip3 install -r requirements.txt
```

```sh
python3 convert_hf_model.py /path/to/Llama-2-7b-chat-hf models/llama2_7b_chat_hf
```

Build and run `llama2-chat`:

```sh
zig build -Doptimize=ReleaseFast
```

```sh
./zig-out/bin/llama2-chat models/llama2_7b_chat_hf --temperature 0 --thread_count 8
```
