llm is an ecosystem of Rust libraries for working with large language models. It is built on top of the fast, efficient GGML library for machine learning.
Image by @darthdeus, using Stable Diffusion
The primary entrypoint for developers is the llm crate, which wraps llm-base and the supported model crates.
For end-users, there is a CLI application, llm-cli, which provides a convenient interface for interacting with supported models. Text generation can be done as a one-off based on a prompt, or interactively through REPL or chat modes. The CLI can also be used to serialize (print) decoded models, quantize GGML files, or compute the perplexity of a model. It can be downloaded from the latest GitHub release or installed from crates.io.
llm is powered by the ggml tensor library, and aims to bring the robustness and ease of use of Rust to the world of large language models. At present, inference is only on the CPU, but we hope to support GPU inference in the future through alternate backends.
Currently, the following models are supported:
- BLOOM
- GPT-2
- GPT-J
- GPT-NeoX: GPT-NeoX, StableLM, RedPajama, Dolly v2
- LLaMA: LLaMA, Alpaca, Vicuna, Koala, GPT4All v1, GPT4-X, Wizard
- MPT
This project depends on Rust v1.65.0 or above and a modern C toolchain.
The llm crate exports llm-base and the model crates (e.g. bloom, gpt2, llama).
To use llm, add it to your Cargo.toml:
[dependencies]
llm = "0.2"
NOTE: To improve debug performance, exclude llm from being built in debug mode:
[profile.dev.package.llm]
opt-level = 3
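With the dependency in place, the crate can be used to load a GGML model and generate text directly from Rust. The following is a minimal sketch modeled on the crate's published example; the exact signatures of llm::load, InferenceRequest, and the output callback have changed between releases, so treat the parameter lists here as assumptions and consult the crate documentation and the repository's examples directory for the current API. It also assumes rand is declared as a dependency; the model path and prompt are placeholders.

```rust
use std::io::Write;

use llm::Model;

fn main() {
    // Load a LLaMA-architecture GGML model from disk. The path is a placeholder;
    // the second argument is the default llm::ModelParameters and the third is a
    // callback that prints loading progress to stdout.
    let llama = llm::load::<llm::models::Llama>(
        std::path::Path::new("/path/to/ggml-model.bin"),
        Default::default(),
        llm::load_progress_callback_stdout,
    )
    .unwrap_or_else(|err| panic!("Failed to load model: {err}"));

    // Start an inference session and generate text from a prompt, streaming each
    // token to stdout as it is produced.
    let mut session = llama.start_session(Default::default());
    let res = session.infer::<std::convert::Infallible>(
        &llama,
        &mut rand::thread_rng(),
        &llm::InferenceRequest {
            prompt: "Rust is a cool programming language because",
            ..Default::default()
        },
        // llm::OutputRequest
        &mut Default::default(),
        // Output callback: print tokens as they arrive.
        |t| {
            print!("{t}");
            std::io::stdout().flush().unwrap();
            Ok(())
        },
    );

    // The Ok variant carries inference statistics (token counts, timings).
    match res {
        Ok(stats) => println!("\n\n{stats}"),
        Err(err) => println!("\n{err}"),
    }
}
```

The same pattern should apply to the other model architectures by swapping the type parameter passed to llm::load for the corresponding type under llm::models.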
Follow these steps to build the command line application, which is named llm:
To install llm to your Cargo bin directory, which rustup is likely to have added to your PATH, run:
cargo install llm-cli
The CLI application can then be run through llm.
Alternatively, clone the repository and then build it with:
git clone --recurse-submodules git@github.com:rustformers/llm.git
cargo build --release
The resulting binary will be at target/release/llm[.exe].
It can also be run directly through Cargo, with:
cargo run --release -- $ARGS
GGML files are easy to acquire. For a list of models that have been tested, see the known-good models.
Certain older GGML formats are not supported by this project, but the goal is to maintain feature parity with the upstream GGML project. For problems relating to loading models, or to request support for additional GGML model types, please open an Issue.
Hugging Face 🤗 is a leader in open-source machine learning and hosts hundreds of GGML models. Search for GGML models on Hugging Face 🤗.
This Reddit community maintains a wiki related to GGML models, including well-organized lists of links for acquiring GGML models (mostly from Hugging Face 🤗).
Once the llm executable has been built or is in a $PATH directory, try running it. Here's an example that uses the open-source GPT4All language model:
llm llama infer -m ggml-gpt4all-j-v1.3-groovy.bin -p "Rust is a cool programming language because"
For more information about the llm CLI, use the --help parameter.
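Subcommands generally accept --help as well; for example, assuming the same subcommand layout as the invocation above:

```sh
llm llama infer --help
```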
There is also a simple inference example that is helpful for debugging:
cargo run --release --example inference llama ggml-gpt4all-j-v1.3-groovy.bin $OPTIONAL_PROMPT
Python v3.9 or v3.10 is needed to convert a raw model to a GGML-compatible format (note that Python v3.11 is not supported):
python3 util/convert-pth-to-ggml.py $MODEL_HOME/$MODEL/7B/ 1
The output of the above command can be used by llm to create a quantized model:
cargo run --release llama quantize $MODEL_HOME/$MODEL/7B/ggml-model-f16.bin $MODEL_HOME/$MODEL/7B/ggml-model-q4_0.bin q4_0
In future, we hope to provide a more streamlined way of converting models.
Note: The llama.cpp repository has additional information on how to obtain and run specific models.
Yes, llm can be used for chat, but certain fine-tuned models (e.g. Alpaca, Vicuna, Pygmalion) are more suited to chat use-cases than so-called "base models". Here's an example of using the llm CLI in REPL (Read-Eval-Print Loop) mode with an Alpaca model; note that the provided prompt format is tailored to the model that is being used:
llm llama repl -m ggml-alpaca-7b-q4.bin -f examples/alpaca_prompt.txt
There is also a Vicuna chat example that demonstrates how to create a custom chatbot:
cargo run --release --example vicuna-chat llama ggml-vicuna-7b-q4.bin
Sessions can be loaded (--load-session) or saved (--save-session) to file. To automatically load and save the same session, use --persist-session. This can be used to cache prompts to reduce load time, too.
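As an illustration, here is a hypothetical invocation that reuses the prompt and Alpaca model file from the examples above and persists the session to an arbitrary path; the session file name is a placeholder:

```sh
llm llama infer -m ggml-alpaca-7b-q4.bin -p "Rust is a cool programming language because" --persist-session ./alpaca.session
```

On a second run with the same flags, the cached session is loaded, so the prompt does not need to be fed through the model again.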
The llm Dockerfile is in the util directory, as is a Flake manifest and lockfile.
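A hypothetical build command, assuming the file is named util/Dockerfile and the command is run from the repository root with an arbitrary image tag:

```sh
docker build -f util/Dockerfile -t llm .
```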
Absolutely! Contributions are welcome; please see the contributing guide.
- llmcord: Discord bot for generating messages using llm.
- local.ai: Desktop app for hosting an inference API on your local machine using llm.
- llm-chain: Build chains in large language models for text summarization and completion of more complex tasks.