Most agent write-ups assume you have a large model and an API budget. These notes start from the harder version: 8 GB of VRAM, a 9B model, a 4096-token context window, and a backend that has to carry more of the system than the model can.
OpenClaw is a local-first AI assistant and coding agent built under those limits. This repo is documentation-first, not a framework release. It is a set of short technical write-ups about what changed once the hardware budget stopped being hypothetical: the model had to become a replaceable component, the server had to own the loop, prompts had to shrink, transcripts had to stop growing forever, and the UI had to make long-running work legible without pretending the phone was the brain.
The audience is developers building with local models on consumer hardware, especially people without unlimited API spend or comfortable cloud GPUs.
The core OpenClaw pattern is simple. The server owns the agent loop, tools, session state, and approvals. The app stays thin. It renders streamed events, shows diffs, handles approvals, and reconnects after push. That separation ended up mattering more than any single prompt trick.
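To make that split concrete, here is a minimal sketch of a server-owned loop. It is not OpenClaw's actual code: `Event`, `Session`, `run_turn`, the event names, and the fixed `plan` that stands in for model output are all hypothetical, chosen only for illustration. It shows the server holding the event log, gating risky tool calls on approval, and letting a reconnecting client replay what it missed.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Iterator


@dataclass
class Event:
    kind: str       # hypothetical kinds: "tool_call", "approval_required", ...
    payload: dict


@dataclass
class Session:
    # Server-side state: the client never holds the authoritative copy.
    events: list[Event] = field(default_factory=list)

    def emit(self, kind: str, **payload) -> Event:
        event = Event(kind, payload)
        self.events.append(event)
        return event

    def replay(self, seen: int = 0) -> list[Event]:
        # A reconnecting client reports how many events it already saw.
        return self.events[seen:]


def run_turn(
    session: Session,
    plan: list[dict],                      # stand-in for what the model proposes
    tools: dict[str, Callable[..., str]],
    approve: Callable[[str, dict], bool],  # the client's only decision point
) -> Iterator[Event]:
    """Walk the planned tool calls on the server, pausing risky ones for approval."""
    for step in plan:
        name, args = step["tool"], step["args"]
        yield session.emit("tool_call", tool=name, args=args)
        if step.get("risky", False):
            yield session.emit("approval_required", tool=name)
            if not approve(name, args):
                yield session.emit("denied", tool=name)
                continue
        yield session.emit("tool_result", tool=name, result=tools[name](**args))
    yield session.emit("done")
```

In this sketch the client only iterates over events and answers `approve`. Everything it would need after a dropped connection is already in `session.events`, which is the property the thin-client design depends on.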
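If you only want the shortest path through the series, start with these:
- 8GB and a Dream
- The qwen3 Footgun
- Working Memory vs. Rolling Transcript
- Thin Client, Fat Server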
Those four cover the hardware ceiling, the prompt protocol failure, the memory model, and the backend-heavy architecture. The rest of the series fills in the control-flow, product, and UI decisions around those constraints.
- The Universal Socket
  How the model started looking like an adapter problem instead of a rewrite problem.
- 8GB and a Dream
  Why the RTX 2060 Super became the real architect of the system.
- Scout/Builder
  Why one model with two narrow roles worked better than one model with one overloaded role.
- Working Memory vs. Rolling Transcript
  What changed once the server stopped sending the model its whole life story every turn.
- Morning Briefing
  A repeatable daily report pattern built around backend generation, push, and a thin native client.
- The qwen3 Footgun
  How innocent-looking angle brackets turned into empty Ollama output.
- Thin Client, Fat Server
  Why the app became a renderer and approval surface instead of a second brain.
- Copying Claude Code
  The parts of coding-agent UX worth borrowing: task strips, checkpoints, diffs, and reconnectable sessions.
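
As a rough illustration of the working-memory idea under a small context window, the sketch below builds each prompt from a fixed goal, a few server-kept notes, and only as many recent turns as fit a token budget. The `WorkingMemory` class, its fields, and the four-characters-per-token estimate are assumptions made for this example, not the shipped OpenClaw design.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def rough_tokens(text: str) -> int:
    # Crude assumption: about 4 characters per token. A real tokenizer will differ.
    return max(1, len(text) // 4)


@dataclass
class WorkingMemory:
    goal: str
    notes: list[str] = field(default_factory=list)   # short facts the server chose to keep
    recent: list[str] = field(default_factory=list)  # latest turns, verbatim, oldest first

    def build_prompt(self, budget: int) -> str:
        """Goal and notes always go in; recent turns fill what is left, newest first."""
        fixed = [f"Goal: {self.goal}"] + [f"Note: {n}" for n in self.notes]
        used = sum(rough_tokens(line) for line in fixed)
        kept: list[str] = []
        for turn in reversed(self.recent):
            cost = rough_tokens(turn)
            if used + cost > budget:
                break  # older turns are dropped rather than sent every time
            kept.append(turn)
            used += cost
        # Restore chronological order for the turns that made the cut.
        return "\n".join(fixed + list(reversed(kept)))
```

The point of the sketch is that prompt size is bounded by `budget` rather than by session length. With a 4096-token window, the budget passed here would also have to leave room for the system prompt and the model's reply.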
These pieces are grounded in the operational truth and project history documents from the build. Where older planning files disagreed with the shipped system, the newer truth documents won. Future-state ideas like a more formal universal socket or tiered model profiles are described as future-state, not as shipped features.
- llms.txt for AI-facing indexing