AndroidMCP

On-Device LLM + Model Context Protocol for Android

An open-source Android application that combines on-device LLM inference with MCP (Model Context Protocol) support for extensible AI tools.

🎯 Vision

Build a private, offline-capable AI assistant for Android that:

  • Runs LLMs entirely on-device using llama.cpp
  • Supports MCP for extensible tools
  • Includes built-in productivity tools (notes, calendar, etc.)
  • Allows third-party MCP servers via Android IPC

💡 Key Use Cases

1. The "Deep Privacy" Journal & Therapist

The Pitch: "The only AI that knows your deepest thoughts, but tells no one."

  • The Problem: People want AI insight into their mental health or personal life but are terrified of training the next GPT model with their private journals.
  • The Use Case: You document anxiety, relationship struggles, or unfiltered opinions.
  • The Query: "Have I been feeling more anxious lately?"
  • The Magic: The app uses vector search to scan months of entries and synthesizes an answer: "You tend to express higher anxiety on Sunday nights, specifically regarding work deadlines, a pattern visible since October."
  • Why it wins: Zero data exposure risk. It's a "Safe Space" in your pocket.

2. The "Messy Thinker's" Savior (Capture Now, Organize Never)

The Pitch: "Stop organizing. Just dump it here."

  • The Problem: Notes apps are where ideas go to die because organizing them is friction.
  • The Use Case: You dump raw, unstructured input (voice memos, screenshots, half-baked ideas, random URLs) into the app. No folders, no tags.
  • The Query: "What was that idea I had about a coffee shop app?"
  • The Magic: Vector search finds the semantic match in a voice note from 3 months ago, a screenshot from last week, and a text note from today, synthesizing them into a coherent project brief.
  • Why it wins: The AI is the organization layer.

3. The "Secure" Corporate Edge

The Pitch: "Bring your own AI to work, without getting fired by IT."

  • The Problem: Employees want AI help but are banned from pasting internal docs or proprietary code into ChatGPT.
  • The Use Case: You load confidential internal PDFs, strategy docs, and proprietary code snippets into the app.
  • The Query: "Summarize the Q3 risks from these 5 confidential reports."
  • Why it wins: It bridges the gap: AI power, zero data leak. It's "Shadow IT" that is actually secure.

📚 Documentation

See the specification documents for detailed architecture:

| Document | Description |
|----------|-------------|
| 00_PROJECT_OVERVIEW.md | Project overview and architecture |
| 01_UI_LAYER_SPEC.md | Jetpack Compose UI layer |
| 02_AGENT_ORCHESTRATION_SPEC.md | Agent/orchestration layer |
| 03_MCP_HOST_CLIENT_SPEC.md | MCP protocol implementation |
| 04_ON_DEVICE_LLM_SPEC.md | llama.cpp integration |
| 05_BUILTIN_MCP_SERVERS_SPEC.md | Notes and other built-in tools |

🏗️ Architecture

┌─────────────────────────────────────────────────┐
│              UI Layer (Compose)                 │
├─────────────────────────────────────────────────┤
│         Agent/Orchestration Layer               │
├────────────────────┬────────────────────────────┤
│   MCP Host/Client  │   On-Device LLM (llama.cpp)│
├────────────────────┴────────────────────────────┤
│          Built-in MCP Servers (Notes, etc.)     │
└─────────────────────────────────────────────────┘

🔌 MCP Server Transport Strategy

Different MCP servers use different transport mechanisms based on trust level and crash isolation requirements:

┌─────────────────────────────────────────────────────────────┐
│                    AndroidMCP App                           │
│  ┌─────────────────────────────────────────────────────────┐│
│  │              MCP Host / Client                          ││
│  │  ┌─────────────────┐   ┌───────────────────────────────┐││
│  │  │  In-Process     │   │   stdio Transport             │││
│  │  │  Transport      │   │                               │││
│  │  │  ─────────────  │   │  ┌─────────────────────────┐  │││
│  │  │  • pkb_*        │   │  │ Browser MCP Server      │  │││  ← Separate process
│  │  │  • calendar     │   │  │ (WebView wrapper)       │  │││
│  │  │  • contacts     │   │  └─────────────────────────┘  │││
│  │  │  (built-in,     │   │  ┌─────────────────────────┐  │││
│  │  │   trusted)      │   │  │ 3rd-party servers       │  │││  ← Separate process
│  │  │                 │   │  └─────────────────────────┘  │││
│  │  └─────────────────┘   └───────────────────────────────┘││
│  └─────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────┘

| Server Type | Transport | Rationale |
|-------------|-----------|-----------|
| Built-in (PKB, Calendar) | In-Process | Trusted code, performance critical |
| Browser Automation | stdio | Crash isolation, memory isolation, WebView in separate process |
| 3rd-party MCP Servers | stdio / Unix Sockets | Untrusted code, MUST be isolated |

Why stdio for Browser/External servers?

  • 🛡️ Crash Isolation - Browser/tool crashes don't kill the app
  • 🧠 Memory Isolation - Heavy operations get a separate memory budget
  • 🔌 Standard Protocol - Compatible with existing MCP servers
  • 📱 Android Constraint - WebView already runs its renderer in a separate process
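
The transport rules above can be sketched as a small lookup. This is only an illustration of the table, not the app's actual implementation: the helper name `transport_for` and the category labels (`builtin`, `browser`, `third-party`) are assumptions made for this example.

```shell
# Hypothetical helper: map a server category to the transport named in the table.
# The category labels are illustrative and do not come from the project.
transport_for() {
  case "$1" in
    builtin)     echo "in-process" ;;            # trusted, performance critical
    browser)     echo "stdio" ;;                 # crash and memory isolation
    third-party) echo "stdio-or-unix-socket" ;;  # untrusted, must be isolated
    *)           echo "unknown server type: $1" >&2; return 1 ;;
  esac
}
```

Rejecting unknown categories instead of defaulting to in-process keeps unclassified code out of the app's own process, which matches the isolation rationale above.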

🛠️ Build Setup

Prerequisites

  • Android Studio Hedgehog (2023.1.1) or later
  • JDK 17 or later (bundled with Android Studio)
  • Android SDK with API level 34
  • Android NDK 26.1.10909125 (will be downloaded automatically)
  • CMake 3.22.1 (will be downloaded automatically)

Local Development (Recommended for Apple Silicon)

For the best development experience on macOS (especially Apple Silicon M1/M2/M3), run builds directly using Android Studio's bundled JDK:

# Set JAVA_HOME to Android Studio's bundled JDK (add to ~/.zshrc for persistence)
export JAVA_HOME="/Applications/Android Studio.app/Contents/jbr/Contents/Home"

# Verify SDK is found
echo "$ANDROID_HOME"  # Should print your SDK path, typically ~/Library/Android/sdk

# Build debug APK
./gradlew assembleDebug

# Run unit tests (~2-3 minutes on Apple Silicon)
./gradlew testDebugUnitTest

# Build release APK
./gradlew assembleRelease

Why local over Docker? On Apple Silicon (M1/M2/M3), Docker runs x86_64 images under QEMU emulation, which can make builds 5-10x slower than native builds.
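
Before running the Gradle commands above, it can help to confirm that both environment variables point somewhere usable. The following is a minimal sketch; `check_build_env` is a hypothetical helper name, and it only checks that `JAVA_HOME` contains an executable `bin/java` and that `ANDROID_HOME` is a directory.

```shell
# Hypothetical preflight check for the local build environment.
check_build_env() {
  status=0
  if [ -z "${JAVA_HOME:-}" ] || [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "JAVA_HOME is unset or has no executable bin/java: ${JAVA_HOME:-<unset>}" >&2
    status=1
  fi
  if [ -z "${ANDROID_HOME:-}" ] || [ ! -d "$ANDROID_HOME" ]; then
    echo "ANDROID_HOME is unset or not a directory: ${ANDROID_HOME:-<unset>}" >&2
    status=1
  fi
  return "$status"
}

# Example: only start the build when the environment looks sane.
# check_build_env && ./gradlew assembleDebug
```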

Android Studio Development (GUI)

  1. Open Project:

    # Open via terminal
    open -a "Android Studio" /path/to/android_llm_mcp
    
    # Or: File → Open → Select the project folder
  2. Wait for Gradle Sync:

    • Android Studio will automatically sync Gradle dependencies (2-3 min first time)
    • NDK and CMake will be downloaded automatically if missing
  3. Build the App:

    • Menu: Build → Make Project (or Cmd+F9)
    • First build takes ~7 minutes (includes native llama.cpp compilation)
    • Subsequent builds are much faster (~30 seconds)
  4. Run on Device/Emulator:

    • Select a device from the toolbar dropdown
    • Click the green ▶️ Run button (or Ctrl+R)
    • For LLM testing, use a physical device (recommended) or ARM64 emulator
  5. Run Unit Tests:

    • Right-click on app/src/test → Run 'Tests in app'
    • Or: Menu Run → Run 'All Tests'
    • Expected: 740 tests passing

Troubleshooting Android Studio

| Issue | Solution |
|-------|----------|
| "SDK not found" | File → Project Structure → SDK Location → Set to ~/Library/Android/sdk |
| "NDK not found" | Wait for auto-download, or: SDK Manager → SDK Tools → NDK |
| "CMake not found" | Wait for auto-download, or: SDK Manager → SDK Tools → CMake |
| Submodule errors | Run git submodule update --init --recursive in terminal |
| Gradle sync failed | File → Invalidate Caches → Restart |

Docker Development (Optional)

If you prefer Docker or need a consistent CI environment:

# Build the Docker image
docker build -t android-mcp-builder .

# Run tests via Docker (slower on Apple Silicon due to emulation)
docker run --rm -v "$PWD/app:/app/app" android-mcp-builder ./gradlew testDebugUnitTest --no-daemon

# Build APK via Docker
docker run --rm -v "$PWD/app:/app/app" android-mcp-builder ./gradlew assembleDebug --no-daemon

llama.cpp Submodule Setup

This project uses llama.cpp (pinned to release b4380) as a Git submodule for on-device LLM inference.

# Clone with submodules
git clone --recursive https://github.com/sureshsankaran/android_llm_mcp.git

# Or if already cloned, initialize submodules
git submodule update --init --recursive

# Verify llama.cpp is at the correct version
cd app/src/main/cpp/llama.cpp
git describe --tags  # Should show b4380

Note: The build will fail with a clear error message if the submodule is not initialized.
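
To catch a missing submodule before starting a long build, a quick check along these lines may help. It is a sketch, not the project's own build check: `llama_submodule_ready` is a hypothetical helper, and it only tests for llama.cpp's top-level CMakeLists.txt under the submodule path shown above.

```shell
# Hypothetical helper: succeeds if the llama.cpp submodule looks checked out.
# $1 is the repository root (defaults to the current directory).
llama_submodule_ready() {
  repo_root="${1:-.}"
  [ -f "$repo_root/app/src/main/cpp/llama.cpp/CMakeLists.txt" ]
}

# Example: initialize the submodule only when it is missing.
# llama_submodule_ready . || git submodule update --init --recursive
```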

Building the Project

# Build debug APK
./gradlew assembleDebug

# Build release APK
./gradlew assembleRelease

# Run unit tests
./gradlew test

# Run instrumented tests (requires device/emulator)
./gradlew connectedAndroidTest

Native Build Configuration

The project builds llama.cpp natively for:

  • arm64-v8a (ARM64, primary target)
  • armeabi-v7a (ARM32, fallback)

Key build features:

  • NEON SIMD optimizations enabled for ARM
  • CPU-only inference (GPU backends disabled for broader compatibility)
  • c++_shared STL for better compatibility
  • Memory-mapped models for efficient loading
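
Since native libraries are built only for the two ABIs listed above, it can be useful to check a target device before installing. The sketch below assumes a hypothetical helper name, `is_supported_abi`; the `ro.product.cpu.abi` system property used in the example reports the device's primary ABI.

```shell
# Hypothetical helper: succeeds only for the ABIs this project builds for.
is_supported_abi() {
  case "$1" in
    arm64-v8a|armeabi-v7a) return 0 ;;
    *)                     return 1 ;;
  esac
}

# Example with a connected device (tr strips the carriage return adb may add):
# abi=$(adb shell getprop ro.product.cpu.abi | tr -d '\r')
# is_supported_abi "$abi" || echo "No native build for ABI: $abi"
```

An x86_64 emulator image would fail this check, which is consistent with the advice to use a physical device or an ARM64 emulator for LLM testing.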

Model Setup for Testing

For running integration tests, place a GGUF model file in one of these locations:

# Internal storage (app-private; pushing here directly typically needs root, or run-as on a debuggable build)
adb push your_model.gguf /data/data/com.androidmcp.debug/files/models/test_model.gguf

# External storage (if accessible)
adb push your_model.gguf /storage/emulated/0/Android/data/com.androidmcp.debug/files/test_model.gguf

Recommended test models (Q4_K_M quantization):

  • TinyLlama 1.1B (~700MB)
  • Qwen2.5-0.5B (~350MB)
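
Before pushing a large file to a device, a quick sanity check on the model file can save a transfer. GGUF files begin with the four ASCII bytes `GGUF`; the sketch below checks only that magic value and does not validate the rest of the file. The helper name `is_gguf` is an assumption made for this example.

```shell
# Hypothetical helper: succeeds if the file starts with the GGUF magic bytes.
is_gguf() {
  [ -f "$1" ] || return 1
  # Read the first 4 bytes as hex; "GGUF" is 47 47 55 46.
  magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
  [ "$magic" = "47475546" ]
}

# Example:
# is_gguf your_model.gguf && adb push your_model.gguf /storage/emulated/0/Android/data/com.androidmcp.debug/files/test_model.gguf
```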

🔧 Tech Stack

  • UI: Jetpack Compose + Material 3
  • LLM: llama.cpp via JNI
  • MCP: MCP Kotlin SDK + AIDL for cross-app IPC
  • Storage: Room Database
  • DI: Hilt

📱 Supported Models

| Model | Size | RAM |
|-------|------|-----|
| Llama 3.2 1B | ~1GB | ~2GB |
| Llama 3.2 3B | ~2GB | ~4GB |
| Phi-3 Mini | ~2.5GB | ~4GB |
| Qwen2.5 3B | ~2GB | ~4GB |
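
As a rough illustration of how the RAM column might guide model choice, the sketch below lists the models whose approximate RAM figure fits a given budget in whole gigabytes. The helper name `models_for_ram_gb` is hypothetical, and the figures are copied from the table above, so they are estimates rather than measured requirements.

```shell
# Hypothetical helper: print the models whose approximate RAM need (GB)
# fits within the budget passed as $1. Figures come from the table above.
models_for_ram_gb() {
  ram_gb="$1"
  while read -r need name; do
    [ "$ram_gb" -ge "$need" ] && echo "$name"
  done <<'EOF'
2 Llama 3.2 1B
4 Llama 3.2 3B
4 Phi-3 Mini
4 Qwen2.5 3B
EOF
  return 0
}

# Example: models_for_ram_gb 2   # lists only Llama 3.2 1B
```

Note that the budget should be the memory realistically available to the app, which is usually well below the device's advertised total.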

🚀 Status

Phase: Specification/Design

  • Architecture design
  • Layer specifications
  • Project scaffolding
  • Core implementation
  • MVP release

🔮 Future Improvements

| Area | Investigation | Rationale |
|------|---------------|-----------|
| Vector DB | Keep androidx.sqlite + sqlite-vec | Current approach is solid for on-device vector search |
| LLM Runtime | Prototype MediaPipe LLM Inference | May improve NPU utilization and battery life vs llama.cpp |
| AICore / Gemini Nano | Add startup check for AICore availability | If Gemini Nano is available, use it; this addresses the "Model Delivery" and "Battery" risks immediately (no PAD download, optimized for the device) |

📄 License

MIT License
