Talk to your ESP32. Let your PC think.
Voice-first AI assistant for Arduino makers β speak to an Atom Echo, get intelligent responses powered by local or cloud LLMs.
Quick Start Β· Features Β· Docs Β· QUICKSTART.md
ccoli turns an M5Stack Atom Echo (ESP32) into a voice assistant powered by your PC. You speak β the device sends audio over USB or Wi-Fi β your PC handles speech recognition, LLM reasoning, and text-to-speech β the device plays back the response.
No cloud required. Runs with local Ollama out of the box, or connect to Gemini / Claude / ChatGPT.
demo_combined.mp4
| Component | Why | |
|---|---|---|
| π₯οΈ | PC (Windows / Mac / Linux) | Runs the ccoli server (STT + LLM + TTS) |
| π€ | M5Stack Atom Echo (ESP32) | Captures your voice & plays responses |
| π | USB-C cable | Default wired mode, auto-detected by the server |
| πΆ | Same Wi-Fi network | Optional wireless mode |
Official Atom Echo links:
- Product docs: M5Stack Atom Echo
- Official store: ATOM Echo Smart Speaker Development Kit
curl -fsSL https://raw.githubusercontent.com/surrealier/LLM_Aduino/main/scripts/install.sh | bashThe bootstrap script installs the lightweight CLI first, then hands off the rest to ccoli setup.
The setup wizard asks whether you want:
Ollama Localfor on-device local modelsCloud APIfor Gemini / Claude / ChatGPTConfigure Laterif you only want the runtime installed first
During the same ccoli setup flow you can also choose:
wiredfor USB serial firmware defaultswifito write Wi-Fi credentials andSERVER_IPintodevice_secrets.h
It keeps the base install lightweight, then installs only the runtime extras this project actually uses. The default runtime no longer pulls torch or transformers.
If you want to rerun onboarding later:
ccoli setupIf startup later says the web dashboard dependency is missing, reinstall the runtime extras from the repo root:
python3 -m pip install -e .[runtime]Local repo / development path:
./scripts/install.sh
# or
python3 scripts/install.pyOpen arduino/atom_echo_m5stack_esp32_ino/atom_echo_m5stack_esp32_ino.ino in Arduino IDE and upload to your Atom Echo.
No extra setup is required for default USB wired mode.
- If
device_secrets.his missing, the firmware boots in wired mode automatically ccoli startauto-detects the Atom Echo over USB serial- Default wired serial speed is
115200for broad CP210x stability on macOS - Wired USB audio uses
8kHz G.711 mu-lawin both directions so mic capture and TTS both fit inside the wired bandwidth budget - Arduino IDE upload speed can stay at
115200; flashing speed and runtime protocol settings are still separate - On the first connection, the server waits for an ESP32
PING/PONGhandshake before sending the welcome TTS, then brackets playback withMIC_LOCK/MIC_UNLOCK
Optional robot/display mode:
- Install Arduino libraries
Adafruit SSD1306andAdafruit GFX Library - Connect an external SSD1306 OLED to
G25(SDA) andG21(SCL)
First-Time Arduino IDE Setup (ESP32 + Atom Echo)
If this is your first ESP32 project, use this order:
- Install Arduino IDE 2.x from the official Arduino site.
- Open
Arduino IDE -> Settingsand add this Board Manager URL:https://static-cdn.m5stack.com/resource/arduino/package_m5stack_index.json - Open
Boards Managerand installesp32byEspressif Systems. - Still in
Boards Manager, install theM5Stackboard package so the Atom board profiles appear cleanly in Arduino IDE. - Open
Library Managerand install:M5UnifiedESP32ServoAdafruit SSD1306by AdafruitAdafruit GFX Libraryby Adafruit
- When Arduino IDE asks to install dependent libraries for
M5Unified, chooseInstall All. - Connect the Atom Echo with a USB-C data cable, then check
Tools -> Portand confirm a serial device appears. - In
Tools -> Board, select an Atom-compatible target. For the original Atom Echo, start withM5Atom. - Open
arduino/atom_echo_m5stack_esp32_ino/atom_echo_m5stack_esp32_ino.ino, compile once, then upload. - If the board is not recognized:
- Reconnect with a known data-capable USB cable
- Try another USB port
- Restart Arduino IDE after board/library installation
- Install the official Silicon Labs driver: CP210x USB to UART Bridge VCP Drivers
- If you only want default wired mode, you can upload without creating
device_secrets.h. - If you want Wi-Fi mode later, run
ccoli config wifi ...and then setSERVER_IPinarduino/atom_echo_m5stack_esp32_ino/device_secrets.h.
Notes:
M5Unifiedis the key firmware dependency for the current sketch.ESP32Servois currently included by the firmware, so install it even if you are not using robot mode yet.Adafruit SSD1306andAdafruit GFX Libraryare only needed for the optional external OLED display flow.
ccoli startThen connect the Atom Echo to your PC with USB-C.
- The server preloads STT and TTS once during startup so the first spoken turn does not pay the full model warmup cost.
- When the web dashboard is enabled, startup logs print the dashboard URL(s) and the
/api/docslink. - LED status: red while waiting for the server link, light green when the device is connected and ready.
- On the first healthy
PING/PONGhandshake, ccoli speaks a short time-of-day welcome line without calling the LLM, so startup greetings cannot be sent before the device is ready or truncated by model output limits. - On macOS, the current STT path uses
faster-whisper, so STT stays oncpurather than AppleMPS. The default TTS backendedge_ttsalso does not use local MPS/GPU acceleration.
ccoli setup
# choose `wifi` during onboarding
# or update later with:
ccoli config wifi MyHomeWiFi password MySecretPass port 5001Then set SERVER_IP in arduino/atom_echo_m5stack_esp32_ino/device_secrets.h to your PC's local IP and upload again.
If you enabled the Telegram channel in server/.env, you can start chatting from Telegram as well:
- Open
@BotFatherand confirm your bot'susername - Search that
usernamein Telegram - Open the bot chat and tap
Start, or send/start - Send a normal message such as
μλ - If the server is running, ccoli replies through the bot
Tips:
- If you changed
server/.envafter starting the server, restartccoli start - If
TELEGRAM_ALLOWED_CHAT_IDSis blank, the first chat is not blocked by the allow-list - Full setup guide:
docs/TELEGRAM_CHANNEL_GUIDE.md
π That's it β speak to the Atom Echo and hear the response!
flowchart LR
U["π£οΈ You"] --> A["π€ Atom Echo"]
A -->|Audio over USB / Wi-Fi| S["π₯οΈ ccoli server"]
S -->|Text| L["π§ LLM\nOllama / Gemini / Claude / ChatGPT"]
L -->|Response| S
S -->|TTS audio| A
- π£οΈ Voice-first β speak naturally, get voice responses
- π§ Multi-LLM β Ollama (local, default), Gemini, Claude, ChatGPT
- π§ Runtime priority routing β resolves model, network, and processor candidates in priority order, then keeps the selected LLM route until config or priority is reloaded
- π Integrations β weather, calendar, search, maps, notifications
- ποΈ Voice ID β speaker recognition to personalize responses
- π€ Robot mode (coming soon) β servo/display control via voice
- π³ Docker tests β reproducible test suite out of the box
ccoli already has a terminal-first onboarding flow with Rich panels and tables. It is not a full-screen ncurses app, but it behaves like a lightweight TUI for install/setup tasks.
Typical flow:
- Run
ccoli setup - Pick your AI path:
Ollama Local,Cloud API, orConfigure Later - If needed, pick the provider and model
- Pick the STT device (
cpuon macOS by default) - Pick the device connection mode (
wiredorwifi) - If needed, enter Wi-Fi SSID/password and the server IP
- Review the generated setup plan in the terminal
- Confirm, then start with
ccoli start
Example session:
$ ccoli setup
ββ ccoli Setup ββββββββββββββββββββββββββββββββββββββ
β Choose how ccoli should install and configure β
β your AI runtime. β
βββββββββββββββββββββββββββββββββββββββββββββββββββββ
Choose your AI path
1. ollama - Local model on your machine via Ollama
2. api - Gemini / Claude / ChatGPT via API key
3. manual - Install runtime now and configure later
Select [1]: 2
Choose your cloud provider
1. gemini - Google Gemini
2. claude - Anthropic Claude
3. chatgpt - OpenAI ChatGPT
Select [1]: 1
Model name [gemini-2.5-flash]:
Choose STT device
1. cpu - Best default for macOS and general compatibility
2. cuda - Use NVIDIA CUDA when available
Select [1]: 1
Choose device connection
1. wired - USB serial, no Wi-Fi credentials required
2. wifi - Write Wi-Fi credentials and server IP to device_secrets.h
Select [1]: 1
Setup Plan
- Install target: api
- Provider: gemini
- Model: gemini-2.5-flash
- STT device: cpu
- Device connection: wired
- Server port: 5001
- Python extras: runtime
Continue with this setup plan? [Y/n]: y
$ ccoli start
...
Runtime warmup: stt_ready=True tts_ready=True
Web dashboard: http://localhost:8005
Web API docs: http://localhost:8005/api/docs
Useful terminal commands after setup:
ccoli config integration listccoli config voice-id statusccoli start --port 5002
LLM Provider
Default is Ollama (local, no API key). Switch anytime:
ccoli setup
ccoli config llm --provider ollama --model qwen3:8b
ccoli config llm --provider gemini --model gemini-2.5-flash --api-key <GEMINI_API_KEY>
ccoli config llm --provider claude --model claude-3-5-haiku-latest --api-key <ANTHROPIC_API_KEY>
ccoli config llm --provider chatgpt --model gpt-4o-mini --api-key <OPENAI_API_KEY>API key λ°κΈ:
| Provider | Get API Key |
|---|---|
| Gemini | Google AI Studio |
| Claude | Anthropic Console |
| ChatGPT | OpenAI Platform |
Ollama is auto-installed and auto-started if missing.
Runtime Priority
ccoli now keeps runtime priority as first-class config:
llm:
priority: [ollama, api, ollama_cpu, other]
api_priority: [gemini, claude, chatgpt]
connection:
priority: [wired, wifi]
runtime:
processor_priority: [gpu, cpu]You can change the same priorities during a conversation or in the web chat:
@@μ°μ μμ μν
λͺ¨λΈ μ°μ μμ ollama > api > ollama cpu > other
api μ°μ μμ gemini > claude > chatgpt
μ°κ²° μ°μ μμ wired > wifi
νλ‘μΈμ μ°μ μμ gpu > cpu
When connection.mode is auto, the server keeps checking both Wired and WiFi live and binds to the first healthy link that appears while still honoring the current priority order.
LLM priority is resolved on the first LLM request after startup or priority/config reload. If a higher-priority route such as local Ollama is unavailable and Gemini succeeds, later turns go directly to Gemini instead of rechecking Ollama on every message.
For voice latency and stable TTS, LLM thinking is disabled in runtime calls. Gemini requests send thinkingBudget: 0, and regular agent responses use a larger output budget to avoid short Korean replies being cut mid-sentence.
ollama_cpu is a distinct fallback bucket in runtime policy, but a single shared Ollama server cannot be forced to switch GPU/CPU per request. To make that bucket physically separate, point it at a dedicated CPU-only local Ollama instance.
On macOS, GPU priority can still matter for local LLM routing, but the current STT/TTS stack does not run on Apple MPS.
Integrations
ccoli config integration list # see all integrations
ccoli config integration set weather --api-key <KEY> # configure
ccoli config integration enable weather # enable
ccoli config integration test weather # verifyGoogle Calendar example:
ccoli config integration set calendar-google \
--client-id <ID> --client-secret <SECRET> --refresh-token <TOKEN>
ccoli config integration test calendar-googleGoogle Calendar uses OAuth credentials, not a simple API key. See docs/GOOGLE_CALENDAR_GUIDE.md for where to enable the API and how to get a refresh token.
Missing keys? The test command tells you exactly what to set.
Integration API key λ°κΈ:
| Integration | Get API Key |
|---|---|
| Weather | OpenWeatherMap |
| Search | Tavily |
| Maps | Google Maps Platform |
| Calendar (Google) | Google Cloud Console |
| Notify (Slack) | Slack API β Your Apps |
Voice ID
ccoli config voice-id enable
ccoli config voice-id threshold --value 0.72
ccoli config voice-id statusOr control via voice at runtime:
@@<USERNAME> register voice
@@enable voice recognition
| Command | Description |
|---|---|
ccoli setup |
Interactive installer / onboarding wizard |
ccoli start |
Start the server |
ccoli start --port 5002 |
Start with port override |
ccoli config wifi <SSID> password <PASS> port <PORT> |
Configure optional Wi-Fi mode |
ccoli config llm --provider <name> [--model <m>] [--api-key <k>] |
Set LLM provider |
ccoli config integration <list|set|enable|disable|test> |
Manage integrations |
ccoli config voice-id <status|enable|disable|delete|threshold> |
Manage Voice ID |
# Docker (recommended)
docker compose -f docker/docker-compose.test.yml up --build --abort-on-container-exit --exit-code-from server-test
# or use the helper script
./scripts/run_docker_tests.shCI runs the same suite on every PR via GitHub Actions (.github/workflows/docker-tests.yml).
ccoli/
βββ arduino/ # Atom Echo ESP32 firmware
βββ ccoli/ # CLI entry point
βββ server/ # Python server (STT / LLM / TTS)
β βββ server.py
β βββ config.yaml
β βββ src/
βββ docs/ # API, protocol, PRD docs
βββ docker/ # Docker Compose for tests & mocks
βββ scripts/ # Helper scripts
| Doc | What's inside |
|---|---|
| QUICKSTART.md | Quick onboarding guide |
| docs/API.md | Server module map |
| docs/PROTOCOL.md | Binary protocol spec |
| docs/PRD.md | Product requirements |
- Never commit real credentials β use
device_secrets.honly for optional Wi-Fi mode (git-ignored) - Server secrets go in
server/.env(seeserver/env.example)
This project is licensed under the GNU Affero General Public License v3.0.
Open http://localhost:8005 for the multilingual dashboard with English as the default UI, optional νκ΅μ΄ / ζ₯ζ¬θͺ / δΈζ switching, a diagnostics-first runtime view, editable memory/schedules/chat, and live logs.
