Local Inference

🐑 Run LLM inference locally for various downstream applications.

Note

The commands in this guide target Ubuntu Linux. If you are on a different platform (e.g., Windows or macOS), refer to each tool's official documentation for platform-specific instructions.

Tools

Backend

Further Reading: Ollama vs. vLLM: Choosing the Best Tool for AI Model Workflows

Frontend

Monitor

Tip

Check out prometheus_grafana for more details.

Others

Open Source Model Collections

Environment Setup

First, clone the repository:

git clone --recurse-submodules https://github.com/xxrjun/local-inference.git

Then, create a new Conda environment and install the required dependencies:

conda create -n local-inference python=3.12
conda activate local-inference

# Install Python dependencies
pip install -r requirements.txt

# Install Ollama on Linux
curl -fsSL https://ollama.com/install.sh | sh
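
Before launching anything, a quick sanity check along these lines can help confirm the setup (nvidia-smi applies only if you have an NVIDIA GPU):

# Ollama CLI is on PATH
ollama --version

# The Conda environment's Python is active
python --version

# Optional: the GPU is visible (NVIDIA only)
nvidia-smi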

Example Usage

It is recommended to use tmux to manage multiple sessions.
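
A few tmux commands are enough for this workflow: create named sessions as shown below, detach with Ctrl-b then d, and reattach later by name.

tmux ls                      # list running sessions
tmux attach -t ollama-serve  # reattach to a session by name (example: the Ollama session below)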

Ollama

tmux new -s ollama-serve
./examples/ollama_serve.sh
tmux new -s ollama-run
./examples/ollama_run.sh
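
ollama_serve.sh and ollama_run.sh presumably wrap the standard Ollama CLI; if you prefer to run it by hand, a minimal equivalent looks like this (llama3.1 is just an example tag, not necessarily what the scripts use):

# Start the Ollama server (listens on http://localhost:11434 by default)
ollama serve

# In another session: pull and chat with an example model
ollama pull llama3.1
ollama run llama3.1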

vLLM

tmux new -s vllm-serve
./examples/vllm_serve.sh
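
vllm_serve.sh presumably starts vLLM's OpenAI-compatible server; a hand-rolled equivalent looks roughly like the following, where the model ID and port are illustrative placeholders rather than the script's actual settings:

# Serve a Hugging Face model behind an OpenAI-compatible API (example model ID)
vllm serve Qwen/Qwen2.5-7B-Instruct --port 8000

# Older vLLM releases expose the same server via the module entry point:
# python -m vllm.entrypoints.openai.api_server --model Qwen/Qwen2.5-7B-Instruct --port 8000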

Open WebUI

tmux new -s open-webui
./examples/open_webui.sh
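
open_webui.sh presumably launches the Open WebUI frontend; if Open WebUI is installed via pip (check requirements.txt), the direct command is roughly:

# Start Open WebUI and open http://localhost:8080 in a browser
open-webui serve --port 8080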

Test the OpenAI-Compatible API

Copy .env.example to .env:

cp .env.example .env

Edit .env with the correct values, then run the test script:

python scripts/test_openai_client.py

If the API is working correctly, the output should resemble the following:

ChatCompletionMessage(content='Hello! How can I help you today? If you have any questions or need assistance, feel free to ask.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[], reasoning_content=None)
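
You can also hit the endpoint directly with curl to rule out client-side issues. The base URL, API key, and model name below are placeholders; substitute the values from your .env (vLLM typically serves at http://localhost:8000/v1, Ollama at http://localhost:11434/v1):

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-placeholder" \
  -d '{
    "model": "your-model-name",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'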

Downstream Applications

Refer to My Immersive Translate Setup Guide or the Official Docs.

[Screenshot: immersive_translate_demo]

TTS (Text-to-Speech)

What is TTS?

Text-to-speech (TTS) converts written text into spoken audio. Refer to My TTS Setup Guide for more details.

[Demo: chattts_demo]
