Run the LLM via CLI

Dependencies

Install the latest WasmEdge with plugins:

For macOS (Apple silicon)

# install WasmEdge-0.13.4 with wasi-nn-ggml plugin
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugin wasi_nn-ggml

# Assuming you use zsh (the default shell on macOS), run the following command to activate the environment
source $HOME/.zshenv

For Ubuntu (>= 20.04)

# install libopenblas-dev
apt update && apt install -y libopenblas-dev

# install WasmEdge-0.13.4 with wasi-nn-ggml plugin
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugin wasi_nn-ggml

# Assuming you use bash (the default shell on Ubuntu), run the following command to activate the environment
source $HOME/.bashrc

For General Linux

# install WasmEdge-0.13.4 with wasi-nn-ggml plugin
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugin wasi_nn-ggml

# Assuming you use bash, run the following command to activate the environment
source $HOME/.bashrc

Get `llama-chat` wasm app

Download llama-chat.wasm:

curl -LO https://github.com/second-state/llama-utils/raw/main/chat/llama-chat.wasm
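A failed download can leave an HTML error page saved under the expected file name. The sketch below is one way to check that the downloaded file at least starts with the WebAssembly magic bytes (00 61 73 6d). The helper name `is_wasm` is hypothetical and not part of llama-utils; the check says nothing about whether the module is the right version.

```shell
# Sketch: check that a file starts with the WebAssembly magic bytes.
# is_wasm is a hypothetical helper name chosen for this example.
is_wasm() {
  # Fail early if the file is missing or empty.
  [ -s "$1" ] || return 1
  # Read the first four bytes as hex and strip spaces and newlines.
  magic="$(od -An -tx1 -N4 "$1" | tr -d ' \n')"
  [ "$magic" = "0061736d" ]
}

if is_wasm llama-chat.wasm; then
  echo "llama-chat.wasm looks like a WebAssembly binary"
else
  echo "llama-chat.wasm is missing or not a WebAssembly binary" >&2
fi
```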

Get Model

Click here to see the download link and commands to run the model.

Execute

Run the wasm app with wasmedge, using the named model feature to preload the large model. Here we use the Llama-2-7B-Chat model as an example:

# download model
curl -LO https://huggingface.co/second-state/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_M.gguf

# run the `llama-chat` wasm app with the model
wasmedge --dir .:. --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf \
  llama-chat.wasm --prompt-template llama-2-chat

After executing the command, you may need to wait a moment for the input prompt to appear. You can enter your question once you see the [USER]: prompt:

[USER]:
What's the capital of France?
[ASSISTANT]:
The capital of France is Paris.
[USER]:
what about Norway?
[ASSISTANT]:
The capital of Norway is Oslo.
[USER]:
I have two apples, each costing 5 dollars. What is the total cost of these apples?
[ASSISTANT]:
The total cost of the two apples is 10 dollars.
[USER]:
What if I have 3 apples?
[ASSISTANT]:
If you have 3 apples, each costing 5 dollars, the total cost of the apples is 15 dollars.
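The `--nn-preload` value in the command above is a colon-separated string. The sketch below splits it into four fields to make its structure visible. The labels `alias`, `backend`, `target`, and `path` are descriptive names chosen here, and `describe_preload` is a hypothetical helper. The first field presumably corresponds to the `--model-alias` option, whose default is also `default`.

```shell
# Sketch: split an --nn-preload value into its colon-separated fields.
# describe_preload and the field labels are assumptions made for illustration.
describe_preload() {
  old_ifs="$IFS"
  IFS=:
  # Re-split the single argument on ':' into positional parameters.
  set -- $1
  IFS="$old_ifs"
  echo "alias=$1 backend=$2 target=$3 path=$4"
}

describe_preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf
```

To preload a different model, only the last field needs to change, provided the file sits in a directory exposed through `--dir`.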

CLI options

The options for the llama-chat wasm app are:

~/workspace/llama-utils/chat$ wasmedge llama-chat.wasm -h
Usage: llama-chat.wasm [OPTIONS]

Options:
  -a, --model-alias <ALIAS>
          Model alias [default: default]
  -c, --ctx-size <CTX_SIZE>
          Size of the prompt context [default: 4096]
  -n, --n-predict <N_PRDICT>
          Number of tokens to predict [default: 1024]
  -g, --n-gpu-layers <N_GPU_LAYERS>
          Number of layers to run on the GPU [default: 100]
  -b, --batch-size <BATCH_SIZE>
          Batch size for prompt processing [default: 4096]
  -r, --reverse-prompt <REVERSE_PROMPT>
          Halt generation at PROMPT, return control.
  -s, --system-prompt <SYSTEM_PROMPT>
          System prompt message string [default: "[Default system message for the prompt template]"]
  -p, --prompt-template <TEMPLATE>
          Prompt template. [default: llama-2-chat] [possible values: llama-2-chat, codellama-instruct, mistral-instruct-v0.1, mistrallite, openchat, belle-llama-2-chat, vicuna-chat, chatml, baichuan-2, wizard-coder, zephyr, intel-neural, deepseek-chat, deepseek-coder]
      --log-prompts
          Print prompt strings to stdout
      --log-stat
          Print statistics to stdout
      --log-all
          Print all log information to stdout
      --stream-stdout
          Print the output to stdout in the streaming way
  -h, --help
          Print help
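As an illustration of how these options combine, the sketch below assembles a `wasmedge` command line and prints it instead of running it, so it can be inspected first. The flag names and defaults come from the help output above. The helper name `build_chat_cmd` and the example values (a 2048-token context and 512 predicted tokens) are assumptions made for this example only.

```shell
# Sketch: assemble a llama-chat command line from the options listed above.
# build_chat_cmd is a hypothetical helper, not part of llama-utils.
build_chat_cmd() {
  # $1: GGUF model file in the current directory
  # $2: one of the --prompt-template values
  # $3: context size (falls back to the documented default of 4096)
  # $4: tokens to predict (falls back to the documented default of 1024)
  printf '%s ' wasmedge --dir .:. \
    --nn-preload "default:GGML:AUTO:$1" \
    llama-chat.wasm \
    --prompt-template "$2" \
    --ctx-size "${3:-4096}" \
    --n-predict "${4:-1024}" \
    --log-stat
  printf '\n'
}

# Print the command for inspection before running it by hand.
build_chat_cmd llama-2-7b-chat.Q5_K_M.gguf llama-2-chat 2048 512
```

Printing rather than executing keeps the sketch safe to run on a machine where WasmEdge or the model file is not yet installed.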

Optional: Build the `llama-chat` wasm app yourself

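Run the following command (if the wasm32-wasi target is not installed in your Rust toolchain yet, add it first with rustup target add wasm32-wasi):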

cargo build --target wasm32-wasi --release

The llama-chat.wasm file is generated in the target/wasm32-wasi/release folder.
