brownsn1-ux/ollama-coding-agent-notes

Building OpenClaw

Most agent write-ups assume you have a large model and an API budget. These notes start from the harder version: 8 GB of VRAM, a 9B model, a 4096-token context window, and a backend that has to carry more of the system than the model can.

OpenClaw is a local-first AI assistant and coding agent built under those limits. This repo is documentation-first, not a framework release. It is a set of short technical write-ups about what changed once the hardware budget stopped being hypothetical: the model had to become a replaceable component, the server had to own the loop, prompts had to shrink, transcripts had to stop growing forever, and the UI had to make long-running work legible without pretending the phone was the brain.

The audience is developers building with local models on consumer hardware, especially people without unlimited API spend or comfortable cloud GPUs.

The core OpenClaw pattern is simple. The server owns the agent loop, tools, session state, and approvals. The app stays thin: it renders streamed events, shows diffs, handles approvals, and reconnects to the running session after a push notification. That separation ended up mattering more than any single prompt trick.
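As a rough illustration of that split, here is a minimal Python sketch of a server-owned turn. It is not OpenClaw's code. The `Event` kinds, the `Session` class, `run_turn`, and the `plan` and `approve` parameters are all hypothetical names chosen for this example, and the plan is a hard-coded stand-in for tool calls a model would choose. The point is only that the server keeps the event log and gates risky tools, so a client needs nothing beyond the ability to render events and answer approval requests.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator


@dataclass
class Event:
    # Hypothetical event kinds: "approval_request", "tool_result", "done".
    kind: str
    payload: dict


@dataclass
class Session:
    """Server-side state. The client never holds any of this."""
    session_id: str
    events: list = field(default_factory=list)  # full log, kept for reconnects

    def emit(self, kind: str, **payload) -> Event:
        ev = Event(kind, payload)
        self.events.append(ev)
        return ev

    def replay(self, after: int = 0) -> list:
        """Events a reconnecting client has not rendered yet."""
        return self.events[after:]


def run_turn(session: Session, plan: list,
             approve: Callable[[str], bool]) -> Iterator[Event]:
    """Run one turn on the server, yielding events for a client to render.

    `plan` is a list of (tool_name, needs_approval, fn) tuples standing in
    for model-chosen tool calls (an assumption made for this sketch).
    """
    for name, needs_approval, fn in plan:
        if needs_approval:
            yield session.emit("approval_request", tool=name)
            if not approve(name):
                yield session.emit("tool_result", tool=name, status="denied")
                continue
        yield session.emit("tool_result", tool=name, status="ok", output=fn())
    yield session.emit("done")
```

A client that drops its connection can call `session.replay(after=n)` with the number of events it already rendered, which is one way the reconnect behaviour described above could work without the app holding any agent state.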

Start Here

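If you only want the shortest path through the series, start with these:

  • 8GB and a Dream
  • The qwen3 Footgun
  • Working Memory vs. Rolling Transcript
  • Thin Client, Fat Server
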
Those four cover the hardware ceiling, the prompt protocol failure, the memory model, and the backend-heavy architecture. The rest of the series fills in the control-flow, product, and UI decisions around those constraints.

Series

  • The Universal Socket
    How the model started looking like an adapter problem instead of a rewrite problem.

  • 8GB and a Dream
    Why the RTX 2060 Super became the real architect of the system.

  • Scout/Builder
    Why one model with two narrow roles worked better than one model with one overloaded role.

  • Working Memory vs. Rolling Transcript
    What changed once the server stopped sending the model its whole life story every turn.

  • Morning Briefing
    A repeatable daily report pattern built around backend generation, push, and a thin native client.

  • The qwen3 Footgun
    How innocent-looking angle brackets turned into empty Ollama output.

  • Thin Client, Fat Server
    Why the app became a renderer and approval surface instead of a second brain.

  • Copying Claude Code
    The parts of coding-agent UX worth borrowing: task strips, checkpoints, diffs, and reconnectable sessions.
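The "Working Memory vs. Rolling Transcript" idea from the list above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's implementation. `estimate_tokens`, `build_prompt`, the `reserve` parameter, and the four-characters-per-token heuristic are all hypothetical. Only the 4096-token window comes from these notes. Instead of resending the whole transcript, the server sends a compact working-memory summary plus as many of the newest turns as fit the budget.

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic (assumption): roughly 4 characters per token.
    # A real system would use the model's own tokenizer.
    return max(1, len(text) // 4)


def build_prompt(system: str, working_memory: dict, turns: list,
                 budget: int = 4096, reserve: int = 1024) -> list:
    """Return prompt parts: system text, memory summary, newest turns that fit.

    `reserve` is a hypothetical allowance of tokens left free for the
    model's reply.
    """
    available = budget - reserve
    memory_text = "\n".join(f"{k}: {v}" for k, v in working_memory.items())
    head = [system, memory_text]
    used = sum(estimate_tokens(part) for part in head)

    kept = []
    # Walk backwards from the newest turn and stop at the first that overflows.
    for turn in reversed(turns):
        cost = estimate_tokens(turn)
        if used + cost > available:
            break
        kept.append(turn)
        used += cost

    # Restore chronological order for the turns that survived.
    return head + kept[::-1]
```

With this shape, older turns fall out of the prompt once the budget is exhausted, while the working-memory dictionary carries forward whatever the server decided was worth keeping.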

Scope

These pieces are grounded in the operational truth and project history documents from the build. Where older planning files disagreed with the shipped system, the newer truth documents won. Future-state ideas like a more formal universal socket or tiered model profiles are described as future-state, not as shipped features.

Supporting Files

About

Technical notes on building a local-first AI coding assistant with local LLMs, Ollama, SwiftUI, and consumer GPU constraints.
