Command-line interface for Kimari Local AI
LLM inference gateway for consumer GPUs
npm install -g @kimari-ai/cli

Then set up the Python backend:
git clone https://github.com/smouj/kimari-local-ai.git ~/.kimari/kimari-local-ai
cd ~/.kimari/kimari-local-ai
pip install -e .

kimari doctor # System diagnostics
kimari server start # Start inference server
kimari models download --model Qwen3-4B # Download a model
kimari chat --profile default # Start chatting
kimari dashboard # Open web UI

| Command | Description |
|---|---|
| `kimari doctor` | Full system diagnostics |
| `kimari server start` | Start llama.cpp inference server |
| `kimari server stop` | Stop server gracefully |
| `kimari server status` | Server health and uptime |
| `kimari models list` | List downloaded GGUF models |
| `kimari models download` | Download a model from HuggingFace |
| `kimari chat` | Interactive chat or single prompt |
| `kimari benchmark` | Performance benchmarks |
| `kimari system` | Live resource monitor |
| `kimari profiles` | Manage GPU profiles |
| `kimari config` | View/edit configuration |
| `kimari dashboard` | Open web dashboard |
npm (kimari) → Node.js wrapper → Python CLI → llama.cpp → GPU
The CLI is a lightweight Node.js wrapper delegating to the Python backend. npm handles distribution and updates; Python handles GPU inference and model management.
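
The delegation step described above can be pictured with a small shell sketch. This is illustrative only: the real wrapper is a Node.js program, and the backend entry point (`python3 -m kimari`) and the two environment variables (`KIMARI_BACKEND_DIR`, `KIMARI_BACKEND_CMD`) are assumptions made for this example, not documented options. Only the checkout path comes from the install steps.

```shell
#!/usr/bin/env bash
# Illustrative sketch, not the actual wrapper (which is written in Node.js).
# ASSUMPTIONS: the entry point `python3 -m kimari` and both environment
# variables below are hypothetical names chosen for this example.

# Checkout location taken from the install steps in this README.
BACKEND_DIR="${KIMARI_BACKEND_DIR:-$HOME/.kimari/kimari-local-ai}"
# Command that starts the Python CLI; overridable so the sketch can be tried
# without the backend installed.
BACKEND_CMD="${KIMARI_BACKEND_CMD:-python3 -m kimari}"

# delegate ARGS...: forward all arguments unchanged to the backend and
# return the backend's exit code to the caller.
delegate() {
  if [ ! -d "$BACKEND_DIR" ]; then
    echo "backend not found at $BACKEND_DIR" >&2
    return 127
  fi
  # Run in a subshell so the caller's working directory is untouched.
  # $BACKEND_CMD is deliberately unquoted so it splits into command + arguments.
  ( cd "$BACKEND_DIR" && $BACKEND_CMD "$@" )
}
```

Under these assumptions, `delegate server start` would run the backend with the arguments `server start` and pass its exit code back, which is the essence of a thin wrapper: no inference logic lives on the Node.js side.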
- Linux (Ubuntu 20.04+, Debian 11+)
- NVIDIA GPU with CUDA 12+
- Python 3.10+ · Node.js 18+
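
The requirements above can be checked by hand before installing. The script below is a minimal sketch of such a pre-flight check and is not part of the `kimari` CLI (`kimari doctor` remains the supported diagnostic). The helper names `version_ge` and `check_tool` are hypothetical, and the version comparison relies on GNU `sort -V`, which is available on the supported Linux distributions. It only detects that `nvidia-smi` is present; it does not verify the CUDA version.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for the listed requirements.
# Helper names are illustrative; this is not part of the kimari CLI.

# version_ge A B: succeeds when version A >= version B (uses GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME MINIMUM FOUND: prints one status line for a tool.
check_tool() {
  local name="$1" minimum="$2" found="$3"
  if [ -z "$found" ]; then
    echo "MISSING  $name (need $minimum+)"
  elif version_ge "$found" "$minimum"; then
    echo "OK       $name $found"
  else
    echo "TOO OLD  $name $found (need $minimum+)"
  fi
}

# Collect versions; an empty string means the tool was not found.
python_version="$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null || true)"
node_version="$(node --version 2>/dev/null | sed 's/^v//' || true)"

check_tool "Python" "3.10" "$python_version"
check_tool "Node.js" "18" "$node_version"

# nvidia-smi ships with the NVIDIA driver; finding it only shows that a
# driver is installed, not which CUDA version is available.
if command -v nvidia-smi >/dev/null 2>&1; then
  echo "OK       nvidia-smi found"
else
  echo "MISSING  nvidia-smi (NVIDIA driver not detected)"
fi
```

The script reports problems without exiting early, so a single run lists every missing or outdated prerequisite.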
- Kimari Local AI — Main repo
- Kimari Dashboard — Web UI
Apache-2.0 © Kimari AI