Overview
Embed a live terminal view of agent sessions into the Gastown dashboard using xterm.js connected to the container's existing PTY infrastructure. This gives users direct visibility into what agents are doing — the same interface a local user sees — and optionally lets them interact with agents (send follow-ups, answer questions, abort).
Parent: #204
Background
The container runs agents via createOpencode() from @kilocode/sdk. Each in-process kilo serve instance already exposes /pty/* routes on its internal port via bun-pty. The PTY infrastructure is fully in place but unused: agents currently run headless, so nothing connects to it.
The Kilo desktop app uses ghostty-web (WASM terminal) connected via WebSocket to these same PTY endpoints. The cloud needs the same connection, just plumbed through the Cloudflare Worker → container boundary.
Architecture
Browser (xterm.js React component)
↕ WebSocket (raw PTY bytes)
Gastown Worker (WS upgrade + proxy)
↕ containerFetch / WS proxy
TownContainerDO
↕ WebSocket proxy to container port 8080
Container control-server (:8080)
↕ Internal WS proxy (new routes)
kilo serve (:4096+) — already has /pty/* routes
↕ bun-pty
Shell in agent worktree
Implementation
1. Container control-server: PTY proxy routes (~50 lines)
Add routes to proxy PTY operations to the internal kilo serve instances:
// POST /agents/:agentId/pty — create a PTY session
// GET /agents/:agentId/pty — list PTY sessions
// PUT /agents/:agentId/pty/:ptyId — resize
// DELETE /agents/:agentId/pty/:ptyId — destroy
// WS /agents/:agentId/pty/:ptyId/connect — bidirectional PTY byte stream
Each route looks up the agent's serverPort from the agents Map and proxies to http://127.0.0.1:${serverPort}/pty/.... The WebSocket route does bidirectional byte forwarding.
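The lookup-and-rewrite step could be sketched as a pure function. This is only an illustration: AgentEntry and resolvePtyUpstream are hypothetical names, and it assumes the path suffix after /pty is passed through to kilo serve unchanged, which the route list above does not confirm.

```typescript
// Hypothetical shape of an entry in the control-server's agents Map;
// only serverPort comes from the description above.
interface AgentEntry {
  serverPort: number;
}

// Map an external control-server path to the internal kilo serve PTY URL.
// Returns null when the path is not a PTY route or the agent is unknown.
function resolvePtyUpstream(
  agents: Map<string, AgentEntry>,
  path: string,
): string | null {
  // Matches /agents/:agentId/pty and anything below it (e.g. /:ptyId/connect).
  const match = /^\/agents\/([^/]+)\/pty(\/.*)?$/.exec(path);
  if (match === null) return null;

  const agentId = match[1];
  if (agentId === undefined) return null;

  const agent = agents.get(agentId);
  if (agent === undefined) return null;

  // Assumption: the suffix after /pty is forwarded as-is.
  const rest = match[2] ?? "";
  return `http://127.0.0.1:${agent.serverPort}/pty${rest}`;
}
```

Keeping the mapping separate from the HTTP and WebSocket handlers means all five routes can share one resolution step, and an unknown agent can be turned into a 404 in a single place.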
2. TownContainerDO: PTY WebSocket proxying
Extend the existing WebSocket proxying (used for agent event streams) to support PTY connections. The PTY WebSocket carries raw bytes (not JSON events), so the proxy is a simple bidirectional pipe — no parsing or transformation needed.
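A minimal sketch of such a pipe follows, written against a small structural interface rather than any specific runtime's WebSocket type. PipeEvent, SocketLike, and pipeSockets are hypothetical names, and the real TownContainerDO proxy may handle errors and backpressure differently.

```typescript
// Structural stand-in for the parts of a WebSocket this sketch touches.
// PipeEvent and SocketLike are hypothetical names, not Gastown types.
type PipeEvent = { data?: string | ArrayBuffer };

interface SocketLike {
  send(data: string | ArrayBuffer): void;
  close(code?: number, reason?: string): void;
  addEventListener(
    type: "message" | "close",
    listener: (event: PipeEvent) => void,
  ): void;
}

// Forward every frame unchanged in both directions; when one side closes,
// close the other exactly once.
function pipeSockets(client: SocketLike, upstream: SocketLike): void {
  let closed = false;

  const forward = (from: SocketLike, to: SocketLike): void => {
    from.addEventListener("message", (event) => {
      // Raw PTY bytes: no JSON parsing, no re-framing.
      if (!closed && event.data !== undefined) to.send(event.data);
    });
    from.addEventListener("close", () => {
      if (closed) return;
      closed = true;
      // 1000 = normal closure; abnormal codes such as 1006 cannot be sent.
      to.close(1000, "peer closed");
    });
  };

  forward(client, upstream);
  forward(upstream, client);
}
```

The shared closed flag keeps the two close handlers from triggering each other in a loop when the second socket reports its own close event.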
3. Gastown Worker: PTY route + WS upgrade
Add a route that upgrades to WebSocket and forwards to the TownContainerDO:
WS /api/towns/:townId/agents/:agentId/pty/:ptyId/connect
Auth uses the existing town auth middleware (CF Access + org membership check).
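To make the route shape concrete, here is a small matcher sketch. matchPtyConnect and PtyConnectParams are hypothetical names; the actual Worker presumably uses its existing router and runs the town auth middleware before any upgrade, neither of which is shown here.

```typescript
// Hypothetical result type for a matched PTY connect request.
interface PtyConnectParams {
  townId: string;
  agentId: string;
  ptyId: string;
}

// Returns the route params when the request is a WebSocket upgrade aimed at
// /api/towns/:townId/agents/:agentId/pty/:ptyId/connect, otherwise null.
function matchPtyConnect(
  pathname: string,
  upgradeHeader: string | null,
): PtyConnectParams | null {
  // The Upgrade header value is case-insensitive.
  if (upgradeHeader === null || upgradeHeader.toLowerCase() !== "websocket") {
    return null;
  }

  const match =
    /^\/api\/towns\/([^/]+)\/agents\/([^/]+)\/pty\/([^/]+)\/connect$/.exec(pathname);
  if (match === null) return null;

  const [, townId, agentId, ptyId] = match;
  if (townId === undefined || agentId === undefined || ptyId === undefined) {
    return null;
  }
  return { townId, agentId, ptyId };
}
```

A handler would call this with the request's pathname and Upgrade header, reject non-matching requests, and only then hand the connection to the TownContainerDO.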
4. Frontend: xterm.js React component
A self-contained React component (~200 lines):
interface AgentTerminalProps {
  townId: string;
  agentId: string;
  ptyId?: string; // auto-create if not provided
}
- Uses @xterm/xterm with @xterm/addon-fit for auto-sizing
- Connects to wss://gastown-api/api/towns/${townId}/agents/${agentId}/pty/${ptyId}/connect
- Handles resize events (sends resize to the PTY via PUT)
- Reconnection with backfill on disconnect
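Two pieces of the component's connection logic can be sketched without any terminal dependency: building the connect URL and spacing out reconnect attempts. PtyTarget, ptyConnectUrl, and reconnectDelayMs are hypothetical names, and the capped exponential backoff is an assumption, since the spec only says the terminal should reconnect.

```typescript
// Mirrors AgentTerminalProps, except ptyId is required once a session exists.
interface PtyTarget {
  townId: string;
  agentId: string;
  ptyId: string;
}

// Build the WebSocket URL for a PTY session. `origin` would be
// "wss://gastown-api" per the list above; IDs are percent-encoded
// so they cannot break out of their path segment.
function ptyConnectUrl(origin: string, target: PtyTarget): string {
  const seg = (value: string): string => encodeURIComponent(value);
  return (
    `${origin}/api/towns/${seg(target.townId)}` +
    `/agents/${seg(target.agentId)}` +
    `/pty/${seg(target.ptyId)}/connect`
  );
}

// Assumed policy: exponential backoff starting at baseMs, capped at maxMs.
// `attempt` is zero-based (0 = first retry).
function reconnectDelayMs(
  attempt: number,
  baseMs = 500,
  maxMs = 10_000,
): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}
```

In the component, the close handler would schedule a new connection after reconnectDelayMs(attempt) and reset the attempt counter once a connection opens. How backfill is requested after reconnecting depends on the PTY endpoint and is left out of this sketch.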
5. Dashboard integration
The terminal component embeds in two places:
- Agent detail panel: Click "Terminal" tab to see the live agent session
- Agent stream panel: The existing "Watch" button on agent cards opens the terminal view
The terminal coexists with the structured event stream — users can switch between "Events" (the JSON-based message/tool stream) and "Terminal" (the raw PTY view). Both show the same agent session from different angles.
What this enables
- See exactly what agents see — the real kilo CLI output, not a filtered/reformatted version
- Interactive agent sessions — type into the terminal to send follow-up messages, answer permission prompts, or course-correct
- Debugging — when an agent is stuck, see the raw terminal state (error messages, prompts, hung processes)
- Familiar interface — anyone who's used kilo locally knows exactly what they're looking at
xterm.js vs ghostty-web
The desktop app uses ghostty-web (Ghostty's WASM build). For the cloud:
| Aspect | ghostty-web | xterm.js |
|---|---|---|
| Size | Larger WASM binary | ~200KB JS |
| React integration | Manual DOM mounting | Native @xterm/xterm package |
| Ecosystem | Smaller | Mature, widely used |
| Protocol | Same — raw bytes over WS | Same — raw bytes over WS |
Both consume the same WebSocket protocol. xterm.js is the better fit for a React/Next.js app — smaller bundle, native integration, large ecosystem of addons.
Acceptance Criteria
- Container control-server has PTY proxy routes (create, list, resize, destroy, WS connect)
- TownContainerDO proxies PTY WebSocket connections
- Gastown worker exposes PTY WS upgrade endpoint with auth
- React component renders xterm.js terminal connected to agent PTY
- Terminal auto-fits to container dimensions and handles resize
- Terminal is accessible from agent detail panel in the dashboard
- User can type into the terminal to interact with the agent session
- Terminal reconnects gracefully on WebSocket disconnect