A Swift package to interact with local Large Language Models (LLMs) on Apple platforms.
| Demo | Multimodal |
|---|---|
| MobileVLM-3B (llama.cpp) | Qwen2.5 VL 3B (MLX) |
| llamacpp-mobilevlm.mov | mlx-qwen2.5.mov |

Recorded on iPhone 16 Pro.
Important: This project is still experimental. The API is subject to change.
- Support for GGUF models, MLX models, and the FoundationModels framework
- Support for iOS and macOS
- Streaming API
- Multimodal (experimental)
Add the following dependency to your Package.swift file:
dependencies: [
    .package(url: "https://github.com/tattn/LocalLLMClient.git", branch: "main")
]
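If you also declare library products on a target, the entries look roughly like the sketch below. The product names here are assumptions that mirror the module names imported in the examples later in this README (LocalLLMClient, LocalLLMClientLlama, LocalLLMClientUtility, and so on); check the package manifest for the exact names.

// Hypothetical target setup; product names assumed to match the module names used below
.target(
    name: "MyApp",
    dependencies: [
        .product(name: "LocalLLMClient", package: "LocalLLMClient"),
        .product(name: "LocalLLMClientLlama", package: "LocalLLMClient"),
        .product(name: "LocalLLMClientUtility", package: "LocalLLMClient")
    ]
)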
The API documentation is available here.
Using with llama.cpp
import LocalLLMClient
import LocalLLMClientLlama
import LocalLLMClientUtility
// Download model from Hugging Face (Gemma 3)
let ggufName = "gemma-3-4B-it-QAT-Q4_0.gguf"
let downloader = FileDownloader(source: .huggingFace(
    id: "lmstudio-community/gemma-3-4B-it-qat-GGUF",
    globs: [ggufName]
))
try await downloader.download { print("Progress: \($0)") }
// Initialize a client with the downloaded model
let modelURL = downloader.destination.appending(component: ggufName)
let client = try await LocalLLMClient.llama(url: modelURL, parameter: .init(
    context: 4096,    // Context size
    temperature: 0.7, // Randomness (0.0 to 1.0)
    topK: 40,         // Top-K sampling
    topP: 0.9,        // Top-P (nucleus) sampling
    options: .init(responseFormat: .json) // Response format
))
let prompt = """
Create the beginning of a synopsis for an epic story with a cat as the main character.
Format it in JSON, as shown below.
{
"title": "<title>",
"content": "<content>",
}
"""
// Generate text
let input = LLMInput.chat([
    .system("You are a helpful assistant."),
    .user(prompt)
])
for try await text in try await client.textStream(from: input) {
    print(text, terminator: "")
}
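If you don't need token-by-token output, the client also offers a non-streaming call, generateText(from:), the same API used in the multimodal examples below:

// Generate the full response in one call instead of streaming
let response = try await client.generateText(from: input)
print(response)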
Using with Apple MLX
import LocalLLMClient
import LocalLLMClientMLX
import LocalLLMClientUtility
// Download model from Hugging Face
let downloader = FileDownloader(
    source: .huggingFace(id: "mlx-community/Qwen3-1.7B-4bit", globs: .mlx)
)
try await downloader.download { print("Progress: \($0)") }
// Initialize a client with the downloaded model
let client = try await LocalLLMClient.mlx(url: downloader.destination, parameter: .init(
    temperature: 0.7, // Randomness (0.0 to 1.0)
    topP: 0.9         // Top-P (nucleus) sampling
))
// Generate text
let input = LLMInput.chat([
    .system("You are a helpful assistant."),
    .user("Tell me a story about a cat.")
])
for try await text in try await client.textStream(from: input) {
    print(text, terminator: "")
}
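In app code you will typically run generation inside a Task and accumulate the streamed text. The sketch below is plain Swift concurrency around the same textStream(from:) call; the variable names are illustrative.

// Collect the streamed tokens into a single string
let generation = Task {
    var story = ""
    for try await text in try await client.textStream(from: input) {
        story += text
    }
    return story
}
print(try await generation.value)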
Using with Apple FoundationModels
import LocalLLMClient
import LocalLLMClientFoundationModels
// Available on iOS 26.0+ / macOS 26.0+ and requires Apple Intelligence
let client = try await LocalLLMClient.foundationModels(
    // Use the system's default model
    model: .default,
    // Configure generation options
    parameter: .init(
        temperature: 0.7
    )
)
// Generate text
let input = LLMInput.chat([
    .system("You are a helpful assistant."),
    .user("Tell me a short story about a clever fox.")
])
for try await text in try await client.textStream(from: input) {
    print(text, terminator: "")
}
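Because this backend is only available on iOS 26.0+ / macOS 26.0+ and requires Apple Intelligence, guard the call site with an availability check and expect initialization to throw if the system model is unavailable. A minimal sketch reusing the calls above:

// Only use the FoundationModels backend where the OS supports it
if #available(iOS 26.0, macOS 26.0, *) {
    let client = try await LocalLLMClient.foundationModels(
        model: .default,
        parameter: .init(temperature: 0.7)
    )
    for try await text in try await client.textStream(from: input) {
        print(text, terminator: "")
    }
} else {
    // Fall back to the llama.cpp or MLX backend on older OS versions
}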
LocalLLMClient supports multimodal models like LLaVA for processing images along with text prompts.
Using with llama.cpp
import LocalLLMClient
import LocalLLMClientLlama
import LocalLLMClientUtility
// Download model from Hugging Face (Gemma 3)
let model = "gemma-3-4b-it-Q8_0.gguf"
let mmproj = "mmproj-model-f16.gguf"
let downloader = FileDownloader(
    source: .huggingFace(id: "ggml-org/gemma-3-4b-it-GGUF", globs: [model, mmproj])
)
try await downloader.download { print("Download: \($0)") }
// Initialize a client with the downloaded model
let client = try await LocalLLMClient.llama(
    url: downloader.destination.appending(component: model),
    mmprojURL: downloader.destination.appending(component: mmproj)
)
let input = LLMInput.chat([
    .user("What's in this image?", attachments: [.image(.init(resource: .yourImage))])
])
// Generate text without streaming
print(try await client.generateText(from: input))
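Streaming also works with image attachments, using the same textStream(from:) API as the text-only examples:

// Stream the multimodal response token by token
for try await text in try await client.textStream(from: input) {
    print(text, terminator: "")
}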
Using with Apple MLX
import LocalLLMClient
import LocalLLMClientMLX
import LocalLLMClientUtility
// Download model from Hugging Face (Qwen2.5 VL)
let downloader = FileDownloader(source: .huggingFace(
    id: "mlx-community/Qwen2.5-VL-3B-Instruct-abliterated-4bit",
    globs: .mlx
))
try await downloader.download { print("Progress: \($0)") }
let client = try await LocalLLMClient.mlx(url: downloader.destination)
let input = LLMInput.chat([
    .user("What's in this image?", attachments: [.image(.init(resource: .yourImage))])
])
// Generate text without streaming
print(try await client.generateText(from: input))
- FileDownloader: A utility to download models with progress tracking.
You can also use LocalLLMClient from the terminal via the command-line tool:
# Run using llama.cpp
swift run localllm --model /path/to/your/model.gguf "Your prompt here"
# Run using MLX
./scripts/run_mlx.sh --model https://huggingface.co/mlx-community/Qwen3-1.7B-4bit "Your prompt here"
Tested models:
- LLaMA 3
- Gemma 3 / 2
- Qwen 3 / 2
- Phi 4
See also: models compatible with the llama.cpp backend, and models compatible with the MLX backend.
If you have a model that works, please open an issue or PR to add it to the list.
- iOS 16.0+ / macOS 14.0+
- Xcode 16.0+
This package uses llama.cpp and Apple's MLX for model inference.