Merge pull request rustformers#112 from katopz/main
Add Docker file
philpax authored Apr 6, 2023
2 parents 7cc4c06 + 97d9042 commit eea9fc7
Showing 2 changed files with 40 additions and 9 deletions.
21 changes: 21 additions & 0 deletions Dockerfile
@@ -0,0 +1,21 @@
# Start with a Rust Alpine image
FROM rust:alpine3.17 AS builder
# This is important, see https://github.com/rust-lang/docker-rust/issues/85
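# (statically linking the C runtime on musl is known to break proc-macro
# crates, so it is linked dynamically instead)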
ENV RUSTFLAGS="-C target-feature=-crt-static"
# if needed, add additional dependencies here
RUN apk add --no-cache musl-dev
# set the workdir and copy the source into it
WORKDIR /app
COPY ./ /app
# do a release build
RUN cargo build --release --bin llama-cli
RUN strip target/release/llama-cli

# use a plain alpine image; the alpine version needs to match the builder's
FROM alpine:3.17
# if needed, install additional dependencies here
RUN apk add --no-cache libgcc
# copy the binary into the final image
COPY --from=builder /app/target/release/llama-cli .
# set the binary as entrypoint
ENTRYPOINT ["/llama-cli"]
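The two-stage build keeps the runtime image small: only the stripped `llama-cli` binary is copied into a plain Alpine base, and `libgcc` is added because the binary, built without `crt-static`, links against it dynamically and Alpine does not ship it by default.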
28 changes: 19 additions & 9 deletions README.md
@@ -80,13 +80,12 @@ kinds of sources.
After acquiring the weights, convert them into a format compatible with ggml by
following the steps below:

> **Warning**
>
> To run the Python scripts, Python 3.9 or 3.10 is required; 3.11 is
> unsupported at the time of writing.

```shell
# Convert the model to f16 ggml format
python3 scripts/convert-pth-to-ggml.py /path/to/your/models/7B/ 1

@@ -95,7 +94,7 @@ python3 scripts/convert-pth-to-ggml.py /path/to/your/models/7B/ 1
```

> **Note**
>
> The [llama.cpp repository](https://github.com/ggerganov/llama.cpp) has
> additional information on how to obtain and run specific models, with some
> caveats:
@@ -104,17 +103,15 @@ python3 scripts/convert-pth-to-ggml.py /path/to/your/models/7B/ 1
> (versioned) ggml formats, but not the mmap-ready version that was [recently
> merged](https://github.com/ggerganov/llama.cpp/pull/613).

_Support for other open source models is currently planned. For models whose
weights can be legally distributed, this section will be updated with scripts to
make the install process as user-friendly as possible. Due to LLaMA's licensing
requirements, this is currently not possible with LLaMA itself, and a lengthier
setup is required._

- https://github.com/rustformers/llama-rs/pull/85
- https://github.com/rustformers/llama-rs/issues/75


### Running

For example, try the following prompt:
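A minimal sketch of one such invocation (the model path and prompt here are
illustrative; the `-m`/`-p` flags match the Docker examples below):

```shell
# Run the CLI against a converted ggml model with an inline prompt
cargo run --release --bin llama-cli -- -m data/gpt4all-lora-quantized-ggml.bin -p "Tell me how cool the Rust programming language is:"
```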
@@ -147,6 +144,19 @@ Some additional things to try:
A modern-ish C toolchain is required to compile `ggml`. A C++ toolchain
should not be necessary.

### Docker

```shell
# To build (this will take some time, so go grab some coffee):
docker build -t llama-rs .

# To run with prompt:
docker run --rm --name llama-rs -it -v ${PWD}/data:/data -v ${PWD}/examples:/examples llama-rs -m data/gpt4all-lora-quantized-ggml.bin -p "Tell me how cool the Rust programming language is:"

# To run with prompt file and repl (will wait for user input):
docker run --rm --name llama-rs -it -v ${PWD}/data:/data -v ${PWD}/examples:/examples llama-rs -m data/gpt4all-lora-quantized-ggml.bin -f examples/alpaca_prompt.txt --repl
```
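The `-v` flags mount the host's `data/` and `examples/` directories into the container, so model weights and prompt files can live outside the image.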

## Q&A

### Why did you do this?
