1 change: 1 addition & 0 deletions astro.config.ts
@@ -63,6 +63,7 @@ export default defineConfig({
items: [
{ slug: "expanding-horizons/threads-context-and-caching" },
{ slug: "expanding-horizons/model-pricing" },
{ slug: "expanding-horizons/high-level-harnesses" },
{ slug: "expanding-horizons/what-to-read-next" },
],
},
134 changes: 134 additions & 0 deletions src/content/docs/expanding-horizons/high-level-harnesses.mdx
@@ -0,0 +1,134 @@
---
title: High-level harnesses
description: Beyond individual agent sessions — scheduled automations, parallel agent fleets, and the emerging pattern of AI-driven code pipelines.
---

import ExternalLink from "../../../components/ExternalLink.astro";

The [harness engineering](/becoming-productive/harness-engineering/) chapter covered shaping a single agent's actions through AGENTS.md, skills, hooks, and subagents.
This page moves one level of abstraction up: it covers tools and patterns that treat agents as a workforce to be managed.


## From engineering to managing

So far in this guide, you have been an **engineer** — you have worked interactively with a single agent, steering it turn by turn in real time.
Now, you will become a **manager**, delegating work to a fleet of agents running in parallel.
Instead of supervising each agent individually, you will manage the output queue — a review inbox, an issue tracker, a PR pipeline.
You no longer act as a conductor, but as an orchestrator.

:::note[Remember]
The key shift is from "what should the agent do?" to "what work should be running right now, and how do I review what came back?"
:::

## Running agents in parallel

The defining feature of this mode is running several agents simultaneously, each on an isolated task.
You hand different issues to separate agents at once, come back later to review the results, and merge the ones you like.
That is qualitatively different from the sequential, one-task-at-a-time conductor workflow from the previous chapters.

[Subagents](/becoming-productive/harness-engineering/#subagents) also run in parallel, but they serve a different purpose: a subagent is spawned **by the agent** to partition a single task's context.
The agent decides when to spawn one, waits for the result, and folds it back into its own session.
You as the human still trigger one top-level session and review one result.

Here, by contrast, **you** spawn multiple fully independent agent sessions, each assigned to a separate task.
No session knows about the others.
You do not need to wait for any single agent — you come back later and review the queue of results in bulk.

In practice, each agent needs its own isolated workspace — typically a separate Git worktree — so their changes do not interfere.
A dashboard or queue then surfaces results as agents finish, letting you review and merge at your own pace.
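
As a rough illustration of the one-worktree-per-agent setup, the sketch below builds one `git worktree add` command per task without running anything. The `agent/<task>` branch names, the sibling-directory layout, and the `main` base branch are assumptions made for this example, not conventions of any particular tool.

```python
from pathlib import PurePosixPath


def worktree_commands(repo_root: str, task_ids: list[str], base: str = "main") -> list[list[str]]:
    """Build one `git worktree add` command per task, without running anything."""
    root = PurePosixPath(repo_root)
    commands = []
    for task_id in task_ids:
        branch = f"agent/{task_id}"                    # hypothetical branch naming scheme
        path = root.parent / f"{root.name}-{task_id}"  # hypothetical sibling directory per agent
        # `git worktree add -b <branch> <path> <base>` creates a new branch
        # checked out in its own directory, backed by the same repository.
        commands.append(["git", "-C", str(root), "worktree", "add", "-b", branch, str(path), base])
    return commands


for cmd in worktree_commands("/work/shop", ["issue-101", "issue-102"]):
    print(" ".join(cmd))
```

Each agent would then be started with its own directory as the working directory, so edits, builds, and test runs in one session never touch the files of another.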

For example, <ExternalLink href="https://conductor.build/"/> is a tool built around this model,
running multiple AI coding agents (Claude Code and Codex) in parallel worktrees with a shared review dashboard.

## Scheduled and recurring agents

Agents do not always need to wait for you to trigger them — you can set them up in advance to run on a schedule.
The pattern is similar to a cron job or a CI pipeline: describe a recurring task, define when it should run, and have an agent execute it in the background.
Results land in a review inbox or are auto-archived if nothing needs attention.

This is well-suited for tasks like:
- Daily issue triage
- Surfacing and summarizing CI failures
- Generating release briefs
- Checking for regressions between versions

With scheduled agents, the process becomes closer to a CI pipeline than a chat window — an agent is no longer a tool you reach for, but a background process.
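
To make the cron-like shape concrete, here is a minimal sketch of the scheduling decision itself. `RecurringTask`, `due_tasks`, and `route_result` are hypothetical names invented for this example; the products linked below expose their own configuration and do not necessarily work this way.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecurringTask:
    name: str
    prompt: str        # what the agent is asked to do on each run
    every: timedelta   # how often the task should run
    last_run: datetime


def due_tasks(tasks: list[RecurringTask], now: datetime) -> list[RecurringTask]:
    """Return the tasks whose interval has elapsed since their last run."""
    return [t for t in tasks if now - t.last_run >= t.every]


def route_result(findings: list[str]) -> str:
    """Send non-empty results to the review inbox; archive runs with nothing to report."""
    return "inbox" if findings else "archive"


now = datetime(2026, 4, 8, 9, 0)
tasks = [
    RecurringTask("issue-triage", "Triage new issues", timedelta(days=1), datetime(2026, 4, 7, 9, 0)),
    RecurringTask("release-brief", "Draft the release brief", timedelta(weeks=1), datetime(2026, 4, 6, 9, 0)),
]
print([t.name for t in due_tasks(tasks, now)])  # ['issue-triage']
```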

Example application features built around this pattern:
- <ExternalLink href="https://developers.openai.com/codex/app/automations"/>
- <ExternalLink href="https://support.claude.com/en/articles/13854387-schedule-recurring-tasks-in-cowork"/>
- <ExternalLink href="https://cursor.com/docs/cloud-agent/automations"/>

## Issue-tracker-driven orchestration

A natural extension of scheduled agents is wiring them directly to your issue tracker.
Instead of manually assigning tasks to agents, the system monitors a board and automatically spawns an agent for each new issue in scope.
Engineers decide what issues belong in scope; the orchestrator handles assignment and execution.

Agent behavior can be defined in a workflow file versioned alongside the code — the same way you version a CI pipeline.
When an agent finishes, it gathers evidence (CI results, PR review feedback, complexity analysis) for human review.
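
The polling step at the heart of this pattern can be sketched in a few lines. Everything here is an assumption for illustration: the `Issue` and `Orchestrator` types, the `agent-ready` scope label, and the idea that scope is expressed through labels at all. A real orchestrator would read the board through the tracker's API and start an agent for each returned ID.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Issue:
    id: str
    labels: frozenset[str]


@dataclass
class Orchestrator:
    scope_label: str                                   # engineers mark in-scope issues with this label
    dispatched: set[str] = field(default_factory=set)  # issues that already have an agent

    def poll(self, board: list[Issue]) -> list[str]:
        """Return IDs of in-scope issues that need a new agent, and remember them."""
        new = [i.id for i in board if self.scope_label in i.labels and i.id not in self.dispatched]
        self.dispatched.update(new)
        return new


orch = Orchestrator(scope_label="agent-ready")
board = [Issue("ENG-1", frozenset({"agent-ready"})), Issue("ENG-2", frozenset({"needs-design"}))]
print(orch.poll(board))  # ['ENG-1']
print(orch.poll(board))  # []
```

Tracking dispatched IDs keeps repeated polls idempotent, so an issue that stays on the board does not receive a second agent.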

For example, <ExternalLink href="https://github.com/openai/symphony"/> is an open-source orchestration service that implements this pattern,
monitoring a Linear board and running a Codex agent per issue in an isolated workspace.

:::tip
Issue-tracker-driven orchestration works best on codebases that have adopted [harness engineering](/becoming-productive/harness-engineering/).
:::

## Agent communication

Running multiple agents in parallel may create coordination problems — agents must exchange information without overloading any one context window.
Two broad patterns have emerged.

The simpler one is **hub-and-spoke orchestration**, where a lead agent spawns workers, collects their outputs, and consolidates them.
Workers never communicate directly.
The benefit is simplicity, as the full picture is present in one place.
The cost is that every intermediate result, log line, and failed attempt flows back through the orchestrator's context, degrading its reasoning quality over time.
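
A toy version of hub-and-spoke, assuming a hypothetical `worker` function in place of a real agent session and a plain list in place of the lead's context window:

```python
from concurrent.futures import ThreadPoolExecutor


def worker(task: str) -> str:
    """Stand-in for a worker agent; a real one would run a model session."""
    return f"report for {task}"


def lead(tasks: list[str]) -> list[str]:
    """Hub-and-spoke: the lead fans tasks out and gathers every output itself."""
    context: list[str] = []  # stands in for the lead's context window
    with ThreadPoolExecutor(max_workers=4) as pool:
        # map() yields results in input order; workers never talk to each other.
        for report in pool.map(worker, tasks):
            context.append(report)  # every worker output lands in the lead's context
    return context


reports = lead(["security scan", "test run", "lint check"])
print(len(reports))  # 3
```

The `context` list grows with every worker output, which is the cost described above in miniature.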

The more capable pattern is **collaborative teaming**, where agents share a task list, claim work independently, and can send messages directly to one another.
A worker can flag a dependency, request a peer review, or broadcast a finding without routing it through the lead.
The lead's context stays clean; coordination happens at the edges.
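
One way to picture collaborative teaming is a shared board that agents claim tasks from, plus a mailbox per agent. The `TaskBoard` class below is a hypothetical in-process sketch using a thread lock; real implementations coordinate across separate processes and may use entirely different mechanisms.

```python
import threading
from collections import defaultdict
from typing import Optional


class TaskBoard:
    """Shared task list with atomic claiming and per-agent mailboxes."""

    def __init__(self, tasks: list[str]) -> None:
        self._lock = threading.Lock()
        self._open = list(tasks)
        self.claimed: dict[str, str] = {}  # task -> agent that claimed it
        self.mailboxes: dict[str, list[str]] = defaultdict(list)

    def claim(self, agent: str) -> Optional[str]:
        """Hand the next open task to `agent`, or None when nothing is left."""
        with self._lock:  # ensures no two agents receive the same task
            if not self._open:
                return None
            task = self._open.pop(0)
            self.claimed[task] = agent
            return task

    def send(self, sender: str, recipient: str, text: str) -> None:
        """Deliver a message directly to a peer, without routing it through the lead."""
        with self._lock:
            self.mailboxes[recipient].append(f"{sender}: {text}")


team = TaskBoard(["api", "ui"])
print(team.claim("agent-a"))  # api
team.send("agent-a", "agent-b", "api schema changed")
print(team.mailboxes["agent-b"])  # ['agent-a: api schema changed']
```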

In practice, most pipelines fall somewhere on a spectrum between these extremes, often organized into three levels:

1. **Isolated workers** — each agent runs independently and returns its output to the caller.
2. **Orchestrated workflows** — outputs become inputs for the next stage via shared files or aggregated results.
3. **Collaborative teams** — agents share a task graph, can send direct or broadcast messages, and notify the lead when work completes.

The right level depends on how tightly coupled the tasks are.
Independent parallel tasks — security scans, test runs, lint checks — fit level 1 or 2 well.
Tasks that need to challenge or build on each other's intermediate findings call for level 3.

For reference, <ExternalLink href="https://code.claude.com/docs/en/agent-teams"/> implements level 3 with a shared task list, file-locked claiming, mailboxes for direct and broadcast messages, and idle notifications back to the lead.

## Code factories

Beyond specific products, there is an emerging pattern popularized by Ryan Carson under the name **Code Factory**.
The idea is a repository setup where agents autonomously write code and open pull requests, and a separate review agent validates those PRs with machine-verifiable evidence.
If validation passes, the PR merges without human intervention.

The continuous loop looks like this:

1. Agent writes code and opens a PR.
2. Risk-aware CI gates check the change.
3. A review agent inspects the PR and collects evidence — screenshots, test results, static analysis.
4. If all checks pass, the PR lands automatically.
5. If anything fails, the agent retries or flags the issue for human review.

:::caution
A Code Factory is only as good as its quality gates.
An automated pipeline that merges bad PRs is strictly worse than one that does nothing.
Invest in solid tests, linters, and CI before automating the merge step.
:::

- <ExternalLink href="https://x.com/ryancarson" />
Comment on lines +106 to +124

@p3t3rzb I'm pretty much sure Ryan Carson wasn't the one who coined this term...? Can you please do a little bit more research on this topic?

I'd like to have a link to some quality writeup of it


## One-human companies

The code factory pattern is the technical foundation of a broader idea: that a single person with a well-configured agent fleet can operate at a scale that previously required a full engineering team.

This requires connecting agents to communication platforms, scheduling systems, and external services — turning a single machine into an always-on runtime that responds to messages, executes tasks, and ships work continuously.
As an example of tooling in this space, <ExternalLink href="https://openclaw.ai/"/> packages infrastructure for exactly this kind of setup.

In <ExternalLink href="https://newsletter.pragmaticengineer.com/p/from-ides-to-ai-agents-with-steve" />, Steve Yegge argues that the engineering profession is reorganizing around a spectrum of AI adoption.
His framing: most engineers are at the low end of that spectrum today, and those who stay there risk being outcompeted by engineers who learn to orchestrate agent fleets, acting as owners of work queues rather than writers of individual functions.
9 changes: 9 additions & 0 deletions src/data/links.csv
@@ -19,25 +19,29 @@ https://claude.com/plugins/playground,Playground Claude Plugin,Anthropic,,2026-0
https://claude.com/pricing,Claude Subscription,,,2026-03-04
https://cli.github.com/,GitHub CLI | Take GitHub to the command line,,,2026-03-13
https://cline.bot/blog/post-mortem-unauthorized-cline-cli-npm,Unauthorized Cline CLI npm publish,Saoud Rizwan,2026-02-24,2026-03-16
https://code.claude.com/docs/en/agent-teams,Claude Code Agent Teams,Anthropic,,2026-04-08
https://code.claude.com/docs/en/best-practices#write-an-effective-claude-md,Best Practices for Claude Code - Claude Code Docs,Anthropic,,2026-03-04
https://code.claude.com/docs/en/hooks,Hooks reference - Claude Code Docs,Anthropic,,2026-03-13
https://code.claude.com/docs/en/security,Security - Claude Code Docs,Anthropic,,2026-03-16
https://code.claude.com/docs/en/sub-agents,Create custom subagents - Claude Code Docs,Anthropic,,2026-03-13
https://code.claude.com/docs/en/sub-agents#code-reviewer,Create custom subagents - Claude Code Docs,,,2026-03-05
https://coderabbit.ai/,CodeRabbit,,,2026-03-05
https://conductor.build/,Conductor,Melty Labs,,2026-03-25
https://context7.com/,Context7 - Up-to-date documentation for LLMs and AI code editors,,,2026-03-13
https://cursor.com/blog,Cursor Blog,,,2026-03-04
https://cursor.com/bugbot,Cursor Bugbot,,,2026-03-05
https://cursor.com/docs/agent/browser,Cursor Browser,,,2026-03-04
https://cursor.com/docs/agent/modes#debug,Cursor Debug Mode,,,2026-03-04
https://cursor.com/docs/agent/review,Cursor Review Agent,,,2026-03-04
https://cursor.com/docs/cloud-agent/automations,Cloud Agents Automations,Cursor,,2026-04-08
https://cursor.com/docs/context/rules,Cursor Rules,,,2026-03-04
https://cursor.com/docs/hooks,Hooks Docs,Cursor,,2026-03-13
https://cursor.com/docs/subagents,Cursor Subagents,Cursor,,2026-03-13
https://cursor.com/for/code-review,Reviewing Code with Cursor | Cursor Docs,,,2026-03-05
https://cursor.com/pricing,Cursor Subscription,,,2026-03-04
https://developers.openai.com/api/docs/guides/compaction,Compaction,OpenAI,,2026-03-04
https://developers.openai.com/codex/agent-approvals-security,Codex: Agent approvals & security,OpenAI,,2026-03-16
https://developers.openai.com/codex/app/automations,Automations in Codex app,OpenAI,,2026-03-25
https://developers.openai.com/codex/app/worktrees/#working-between-local-and-worktree,Worktrees,,,2026-03-10
https://developers.openai.com/codex/cli/features#run-local-code-review,Codex CLI features (run local code review),,,2026-03-05
https://developers.openai.com/codex/integrations/github/,Use Codex in GitHub,,,2026-03-05
@@ -56,6 +60,7 @@ https://github.com/mcp,GitHub MCP Registry,,,2026-03-13
https://github.com/microsoft/playwright-mcp,microsoft/playwright-mcp,Microsoft,,2026-03-13
https://github.com/mkaput,Marek Kaput,,,2026-03-04
https://github.com/openai/skills,openai/skills,OpenAI,,2026-03-12
https://github.com/openai/symphony,Symphony,OpenAI,,2026-03-25
https://github.com/software-mansion-labs/skills,software-mansion-labs/skills,Software Mansion,,2026-03-12
https://github.com/steipete/mcporter/,"steipete/mcporter: Call MCPs via TypeScript, masquerading as simple TypeScript API. Or package them as cli.",Peter Steinberger,,2026-03-04
https://github.com/topics/agent-skills,GitHub Topic: agent-skills,,,2026-03-12
@@ -73,9 +78,11 @@ https://lucumr.pocoo.org/,Thoughts and Writings,Armin Ronacher,,2026-03-04
https://mcp.grep.app/,mcp.grep.app,Vercel,,2026-03-04
https://mitchellh.com/,Blog,Mitchell Hashimoto,,2026-03-04
https://models.dev/,Models.dev - An open-source database of AI models,Opencode,,2026-03-04
https://newsletter.pragmaticengineer.com/p/from-ides-to-ai-agents-with-steve,From IDEs to AI Agents with Steve Yegge,Gergely Orosz,,2026-03-25
https://openai.com/chatgpt/pricing/,ChatGPT Subscription,,,2026-03-04
https://openai.com/index/harness-engineering/,Harness engineering: leveraging Codex in an agent-first world,OpenAI,2026-02-11,2026-03-04
https://openai.com/news/engineering/,OpenAI Engineering News,,,2026-03-04
https://openclaw.ai/,OpenClaw,Peter Steinberger,,2026-04-02
https://opencode.ai/docs/go/,Opencode Go,,,2026-03-04
https://platform.claude.com/docs/en/build-with-claude/compaction,Compaction,Anthropic,,2026-03-04
https://platform.claude.com/docs/en/resources/prompt-library/socratic-sage,Prompting best practices,Anthropic,,2026-03-04
@@ -95,6 +102,7 @@ https://skills.sh/mitsuhiko/agent-stuff/tmux,tmux skill,Armin Ronacher,2026-01-2
https://skills.sh/vercel-labs/agent-browser/agent-browser,agent-browser,Vercel,2026-01-16,2026-03-04
https://skills.sh/vercel-labs/agent-skills/vercel-react-best-practices,vercel-react-best-practices skill,Vercel,2026-01-16,2026-03-04
https://support.apple.com/guide/mac-help/mh40584/mac,Dictate messages and documents on Mac - Apple Support,,,2026-03-10
https://support.claude.com/en/articles/13854387-schedule-recurring-tasks-in-cowork,Schedule recurring tasks in Cowork,Anthropic,,2026-04-08
https://support.microsoft.com/en-us/windows/use-voice-typing-to-talk-instead-of-type-on-your-pc-fec94565-c4bd-329d-e59a-af033fa5689f,Use voice typing to talk instead of type on your PC - Microsoft Support,,,2026-03-10
https://swmansion.com/,Software Mansion,,,2026-03-04
https://tidewave.ai/,Tidewave,,,2026-03-04
@@ -110,6 +118,7 @@ https://x.com/GeminiApp,Google Gemini (@GeminiApp) on X,,,2026-03-04
https://x.com/karpathy,Andrej Karpathy (@karpathy) on X,,,2026-03-04
https://x.com/opencode,OpenCode (@opencode) on X,,,2026-03-04
https://x.com/RLanceMartin,Lance Martin (@RLanceMartin) on X,,,2026-03-04
https://x.com/ryancarson,Ryan Carson (@ryancarson) on X,,,2026-03-25
https://x.com/thorstenball,Thorsten Ball (@thorstenball) on X,,,2026-03-04
https://x.com/thsottiaux,Tibo (@thsottiaux) on X,,,2026-03-04
https://x.com/trq212,Thariq Shihipar (@trq212) on X,,,2026-03-04