ATGCS/OpenClaw-Book

 
 

Deep Dive into OpenClaw — Table of Contents

Generated by: Claude Opus 4.6 (anthropic/claude-opus-4-6)
Tools: OpenClaw + OpenCode
Date: 2026-03-11
GitHub: https://github.com/0xtresser/OpenClaw-Book


Language: Chinese version (中文版本)

  • Motivation and target audience for this book
  • Overview of the OpenClaw project: a personal AI assistant running on your own devices
  • Book structure and reading guide
  • Source code version note (based on v2026.3.9)
  • Prerequisites: TypeScript basics, Node.js runtime concepts, WebSocket protocol, basic OS knowledge

Part I: Global Overview

Chapter 1: Understanding OpenClaw

  • 1.1 What is OpenClaw: The Design Philosophy of a Personal AI Assistant
    • 1.1.1 "Local-first" philosophy vs. traditional SaaS
    • 1.1.2 Single-user design: why not multi-tenant
    • 1.1.3 Unified multi-channel inbox: WhatsApp / Telegram / Slack / Discord / Signal / iMessage / WebChat, etc.
  • 1.2 Core Architecture Overview
    • 1.2.1 Architecture diagram: Message Channels → Gateway Control Plane → Pi Agent Runtime → Tool Execution
    • 1.2.2 Key subsystems at a glance: Gateway, Agent, Channel, Tools, Skills, Memory, Canvas
    • 1.2.3 Full data flow: the complete path of a message from user to AI response
  • 1.3 Technology Stack Overview
    • 1.3.1 TypeScript + Node.js 22+: language and runtime choices
    • 1.3.2 pnpm Monorepo structure (pnpm-workspace.yaml)
    • 1.3.3 Build toolchain: tsdown / oxlint / oxfmt / vitest
    • 1.3.4 Native apps: Swift (macOS/iOS), Kotlin (Android)
  • 1.4 Project Directory Structure
    • 1.4.1 src/ — Core source code (69 subdirectories)
    • 1.4.2 apps/ — Native client apps (macOS / iOS / Android)
    • 1.4.3 ui/ — Web Console UI (Lit + Vite)
    • 1.4.4 extensions/ — Optional channel extensions (31 extensions)
    • 1.4.5 packages/ — Internal shared packages
    • 1.4.6 skills/ — Built-in skills (52 skills)
    • 1.4.7 docs/ — Official documentation source

Chapter 2: Getting Started & Development Environment

  • 2.1 Environment Setup
    • 2.1.1 Node.js 22+, pnpm installation
    • 2.1.2 Cloning and building from source (pnpm install && pnpm build)
    • 2.1.3 UI build (pnpm ui:build)
  • 2.2 Onboarding Wizard
    • 2.2.1 openclaw onboard command entry point (src/wizard/onboarding.ts)
    • 2.2.2 Wizard flow: Gateway config → Model selection → Channel connection → Skill installation
    • 2.2.3 Interactive prompt implementation: Clack Prompts library
  • 2.3 Starting the Gateway Daemon
    • 2.3.1 Foreground vs. daemon mode (src/daemon/)
    • 2.3.2 launchd / systemd service installation source analysis
  • 2.4 Development Workflow
    • 2.4.1 pnpm gateway:watch — File watching and hot reload
    • 2.4.2 Testing: vitest unit tests / e2e tests / live tests / Docker tests
    • 2.4.3 Code quality: oxlint type-aware linting + oxfmt formatting

Part II: Gateway Control Plane

Chapter 3: Gateway Server Architecture

  • 3.1 Gateway Role and Design Goals
    • 3.1.1 What "Control Plane" means
    • 3.1.2 Single Gateway instance constraint: why only one Gateway per host
    • 3.1.3 Gateway responsibilities: channel management, session management, tool routing, event dispatch
  • 3.2 Server Startup Flow Source Analysis
    • 3.2.1 Entry file src/entry.ts: process title, experimental warning suppression, CLI dispatch
    • 3.2.2 Gateway CLI startup (src/cli/gateway-cli.ts)
    • 3.2.3 Server startup sequence (src/gateway/server-startup.ts)
    • 3.2.4 Server implementation class (src/gateway/server.ts and src/gateway/server.impl.ts)
  • 3.3 WebSocket Server Implementation
    • 3.3.1 WebSocket transport layer: ws library usage and configuration
    • 3.3.2 WebSocket runtime management (src/gateway/server-ws-runtime.ts)
    • 3.3.3 Connection lifecycle: connect handshake → auth → event subscription → disconnect
    • 3.3.4 Frame protocol format: {type:"req", id, method, params} / {type:"res"} / {type:"event"}
  • 3.4 HTTP Layer
    • 3.4.1 HTTP server setup (src/gateway/server-http.ts)
    • 3.4.2 OpenAI-compatible HTTP API (src/gateway/openai-http.ts)
    • 3.4.3 Open Responses API (src/gateway/openresponses-http.ts)
    • 3.4.4 Tool invocation HTTP API (src/gateway/tools-invoke-http.ts)
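
The frame shapes named in 3.3.4 can be sketched as a TypeScript discriminated union. This is only an illustration: `type`, `id`, `method`, and `params` come from the outline, while the `ok`, `payload`, `error`, and `event` fields and the `parseFrame` helper are assumptions, not OpenClaw's actual schema.

```typescript
// Hypothetical frame types; only type/id/method/params appear in the outline.
type ReqFrame = { type: "req"; id: string; method: string; params?: unknown };
type ResFrame = { type: "res"; id: string; ok: boolean; payload?: unknown; error?: string }; // assumed fields
type EventFrame = { type: "event"; event: string; payload?: unknown }; // assumed fields
type Frame = ReqFrame | ResFrame | EventFrame;

// Parse a raw WebSocket text message into a Frame, or return null if it is malformed.
function parseFrame(raw: string): Frame | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const f = data as Record<string, unknown>;
  switch (f.type) {
    case "req":
      return typeof f.id === "string" && typeof f.method === "string"
        ? (f as unknown as ReqFrame)
        : null;
    case "res":
      return typeof f.id === "string" && typeof f.ok === "boolean"
        ? (f as unknown as ResFrame)
        : null;
    case "event":
      return typeof f.event === "string" ? (f as unknown as EventFrame) : null;
    default:
      return null; // unknown frame type
  }
}
```

Because `type` is the discriminant, a caller can switch on `frame.type` and let the compiler narrow to the matching frame shape.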

Chapter 4: Gateway Protocol and Type System

  • 4.1 Protocol Design Philosophy
    • 4.1.1 Why WebSocket over REST
    • 4.1.2 Request-response + server push events hybrid model
    • 4.1.3 Idempotency Key mechanism to prevent duplicate operations
  • 4.2 TypeBox Type Schema
    • 4.2.1 Protocol schema definitions (src/gateway/protocol/schema/)
    • 4.2.2 Generating JSON Schema from TypeBox (scripts/protocol-gen.ts)
    • 4.2.3 Generating Swift models from JSON Schema (scripts/protocol-gen-swift.ts)
    • 4.2.4 Protocol consistency validation (pnpm protocol:check)
  • 4.3 Core Methods and Events
    • 4.3.1 Request method catalog (src/gateway/server-methods-list.ts)
    • 4.3.2 connect — Handshake and authentication
    • 4.3.3 agent / agent.wait — Initiating AI conversations
    • 4.3.4 send — Sending messages to channels
    • 4.3.5 sessions.* — Session management method family
    • 4.3.6 Event types: agent, chat, presence, health, heartbeat, cron
  • 4.4 Authentication and Authorization
    • 4.4.1 Gateway Token authentication (src/gateway/auth.ts)
    • 4.4.2 Device Pairing mechanism (src/gateway/device-auth.ts)
    • 4.4.3 Local connection auto-approval vs. remote connection challenge signing
    • 4.4.4 Origin check to prevent cross-site WebSocket hijacking (src/gateway/origin-check.ts)

Chapter 5: Session Management

  • 5.1 Session Model Design
    • 5.1.1 Session Key structure: agent:<agentId>:<mainKey>
    • 5.1.2 Session scope (DM Scope): main / per-peer / per-channel-peer / per-account-channel-peer
    • 5.1.3 Session persistence: JSON Store + JSONL transcript files
  • 5.2 Session Routing Source Analysis
    • 5.2.1 Route bindings (src/routing/bindings.ts)
    • 5.2.2 Route resolution (src/routing/resolve-route.ts)
    • 5.2.3 Session key generation (src/routing/session-key.ts)
    • 5.2.4 Key mapping rules for DM vs. group vs. Cron vs. Webhook sessions
  • 5.3 Session Lifecycle
    • 5.3.1 Creation and initialization: first message triggers session creation
    • 5.3.2 Reset strategies: Daily Reset, Idle Reset
    • 5.3.3 Per-type/per-channel reset overrides (resetByType / resetByChannel)
    • 5.3.4 Session Patch: runtime modification of session properties (src/gateway/sessions-patch.ts)
  • 5.4 Session Pruning
    • 5.4.1 Tool result pruning: trimming old tool results before LLM calls
    • 5.4.2 Context window guard (src/agents/context-window-guard.ts)
  • 5.5 Inter-Session Communication (Agent-to-Agent)
    • 5.5.1 sessions_list / sessions_history / sessions_send tools
    • 5.5.2 Cross-session message passing and Reply Ping-Pong
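
The session key layout from 5.1.1, `agent:<agentId>:<mainKey>`, can be illustrated with a pair of small helpers. The names `buildSessionKey` and `parseSessionKey` are hypothetical, and the rule that everything after the second colon belongs to `mainKey` is an assumption made for this sketch; the real logic lives in src/routing/session-key.ts and may differ.

```typescript
// Hypothetical helpers for the agent:<agentId>:<mainKey> layout named in 5.1.1.
interface SessionKeyParts {
  agentId: string;
  mainKey: string;
}

function buildSessionKey({ agentId, mainKey }: SessionKeyParts): string {
  // Assumption: agentId must be non-empty and must not contain the separator.
  if (!agentId || agentId.includes(":")) throw new Error("invalid agentId");
  return `agent:${agentId}:${mainKey}`;
}

function parseSessionKey(key: string): SessionKeyParts | null {
  const [prefix, agentId, ...rest] = key.split(":");
  if (prefix !== "agent" || !agentId || rest.length === 0) return null;
  // Assumption: everything after the second colon is part of mainKey.
  return { agentId, mainKey: rest.join(":") };
}
```

For example, `buildSessionKey({ agentId: "main", mainKey: "telegram:dm:42" })` yields `agent:main:telegram:dm:42`, where the `mainKey` value is made up for illustration.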

Chapter 6: Channel Routing and Message Dispatch

  • 6.1 Channel Registry
    • 6.1.1 Channel registration mechanism (src/channels/registry.ts)
    • 6.1.2 Channel capability declarations (src/config/channel-capabilities.ts)
    • 6.1.3 Channel Dock layer (src/channels/dock.ts)
  • 6.2 Inbound Message Processing Pipeline
    • 6.2.1 Message reception and normalization (src/channels/chat-type.ts)
    • 6.2.2 Sender identity resolution (src/channels/sender-identity.ts)
    • 6.2.3 Allowlist matching (src/channels/allowlists/)
    • 6.2.4 Mention Gating (src/channels/mention-gating.ts)
    • 6.2.5 Command Gating (src/channels/command-gating.ts)
  • 6.3 Outbound Message Processing
    • 6.3.1 Message chunking algorithm and channel limits
    • 6.3.2 ACK Reactions (src/channels/ack-reactions.ts)
    • 6.3.3 Typing Indicators (src/channels/typing.ts)
  • 6.4 Multi-Agent Routing
    • 6.4.1 Multi-agent configuration: agents.list[] and bindings
    • 6.4.2 Channel/account/peer-based routing rules
    • 6.4.3 Sub-agent registry (src/agents/subagent-registry.ts)

Part III: AI Agent Runtime

Chapter 7: Pi Agent Runtime Core

  • 7.1 What is Pi Agent
    • 7.1.1 @mariozechner/pi-agent-core / pi-ai / pi-coding-agent dependency analysis
    • 7.1.2 RPC mode vs. embedded mode
    • 7.1.3 Agent entry file hierarchy: src/agents/pi-embedded.ts → pi-embedded-runner.ts
  • 7.2 Agent Loop End-to-End Analysis
    • 7.2.1 Loop entry: Gateway's agent RPC method
    • 7.2.2 Step 1: Parameter validation and session resolution
    • 7.2.3 Step 2: agentCommand — model resolution, skill snapshot loading
    • 7.2.4 Step 3: runEmbeddedPiAgent — queue serialization, auth profile resolution, Pi session construction
    • 7.2.5 Step 4: subscribeEmbeddedPiSession — event bridging (tool → assistant → lifecycle)
    • 7.2.6 Step 5: Result aggregation, usage statistics, session persistence
  • 7.3 Queue and Concurrency Control
    • 7.3.1 Per-session Lane serialization
    • 7.3.2 Global Lane rate limiting
    • 7.3.3 Queue modes: collect / steer / followup
    • 7.3.4 Lane implementation (src/gateway/server-lanes.ts)
  • 7.4 Timeout and Abort Mechanisms
    • 7.4.1 Agent execution timeout (default 600 seconds)
    • 7.4.2 agent.wait timeout (default 30 seconds)
    • 7.4.3 AbortSignal cancellation chain
    • 7.4.4 Chat abort (src/gateway/chat-abort.ts)

Chapter 8: Model Providers and Failover

  • 8.1 Model Selection Mechanism
    • 8.1.1 Primary model → fallback model → provider-internal auth failover: three-tier strategy
    • 8.1.2 Model selection source (src/agents/model-selection.ts)
    • 8.1.3 Model fallback (src/agents/model-fallback.ts)
    • 8.1.4 Model compatibility layer (src/agents/model-compat.ts)
  • 8.2 Auth Profile and Credential Management
    • 8.2.1 Auth Profile mechanism (src/agents/auth-profiles.ts)
    • 8.2.2 OAuth flow: Anthropic Claude Pro/Max, OpenAI ChatGPT/Codex
    • 8.2.3 API Key authentication
    • 8.2.4 Profile rotation and cooldown
    • 8.2.5 GitHub Copilot Token authentication (src/providers/github-copilot-token.ts)
  • 8.3 Model Catalog and Configuration
    • 8.3.1 Model catalog (src/agents/model-catalog.ts)
    • 8.3.2 Model scanning: OpenRouter free model discovery (src/agents/model-scan.ts)
    • 8.3.3 Model config file (models.json) and provider config (src/agents/models-config.ts)
    • 8.3.4 Synthetic models and special providers (Venice, Chutes, Z.AI, etc.)
  • 8.4 Failover Error Handling
    • 8.4.1 Error classification (src/agents/failover-error.ts)
    • 8.4.2 Auth errors, billing errors, context overflow detection and handling
    • 8.4.3 Failover logging and user-visible error messages

Chapter 9: System Prompts and Context Assembly

  • 9.1 Building System Prompts
    • 9.1.1 Base prompt template (src/agents/system-prompt.ts)
    • 9.1.2 Prompt parameters (src/agents/system-prompt-params.ts)
    • 9.1.3 Prompt report (src/agents/system-prompt-report.ts)
  • 9.2 Workspace and Context File Injection
    • 9.2.1 Workspace root resolution (src/agents/workspace.ts)
    • 9.2.2 Bootstrap Files: AGENTS.md / SOUL.md / TOOLS.md (src/agents/bootstrap-files.ts)
    • 9.2.3 Bootstrap Hooks (src/agents/bootstrap-hooks.ts)
    • 9.2.4 Workspace templates (src/agents/workspace-templates.ts)
  • 9.3 Identity System
    • 9.3.1 AI assistant identity (src/agents/identity.ts)
    • 9.3.2 Identity file (src/agents/identity-file.ts)
    • 9.3.3 Identity avatar (src/agents/identity-avatar.ts)
    • 9.3.4 Channel-prefixed identity
  • 9.4 Context Compaction
    • 9.4.1 Auto-compaction trigger mechanism
    • 9.4.2 Compaction flow (src/agents/compaction.ts)
    • 9.4.3 Pre-compaction Memory Flush
    • 9.4.4 Compaction retry and buffer reset
  • 9.5 Context Engine Plugin System
    • 9.5.1 Why Context Engine
    • 9.5.2 Plugin Interface (ContextEngine Interface)
    • 9.5.3 Registration and Resolution Mechanism
    • 9.5.4 LegacyContextEngine Backward Compatibility
    • 9.5.5 Runtime Integration
    • 9.5.6 Developing Custom Context Engines

Chapter 10: Streaming and Block Replies

  • 10.1 Streaming Architecture
    • 10.1.1 Pi Agent Core event stream → OpenClaw event bridge
    • 10.1.2 Stream event types: lifecycle / assistant / tool
    • 10.1.3 Raw stream processing (src/agents/pi-embedded-subscribe.raw-stream.ts)
  • 10.2 Block Streaming
    • 10.2.1 EmbeddedBlockChunker algorithm (src/agents/pi-embedded-block-chunker.ts)
    • 10.2.2 Low watermark / high watermark chunking strategy
    • 10.2.3 Break preferences: paragraph → newline → sentence → whitespace → hard break
    • 10.2.4 Fenced block safe splitting: close + reopen
  • 10.3 Coalescing and Humanized Pacing
    • 10.3.1 Consecutive block coalescing: idle gap, min/max char count
    • 10.3.2 Human Delay: natural / custom modes
  • 10.4 Telegram Draft Streaming
    • 10.4.1 sendMessageDraft implementation
    • 10.4.2 partial mode vs. block mode
  • 10.5 Reply Shaping and Suppression
    • 10.5.1 NO_REPLY silent token filtering
    • 10.5.2 Message tool deduplication
    • 10.5.3 Tool summary inlining
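
The break preference order in 10.2.3 (paragraph → newline → sentence → whitespace → hard break) can be sketched as a function that picks a split point between a low and a high watermark. This is a simplified illustration, not the EmbeddedBlockChunker itself: `pickBreak`, `lastMatchEnd`, `minChars`, and `maxChars` are hypothetical names, and the fenced-block handling from 10.2.4 is left out.

```typescript
// Return the index just past the last match of a global regex in s, or -1 if none.
function lastMatchEnd(s: string, re: RegExp): number {
  let end = -1;
  for (let m = re.exec(s); m !== null; m = re.exec(s)) {
    end = m.index + m[0].length;
    if (m[0].length === 0) re.lastIndex++; // guard against empty matches
  }
  return end;
}

// Hypothetical break-point picker; minChars/maxChars stand in for the low/high watermarks.
function pickBreak(text: string, minChars: number, maxChars: number): number {
  if (text.length <= maxChars) return text.length; // nothing to split yet
  const head = text.slice(0, maxChars);
  // Ordered from most to least preferred: paragraph, newline, sentence end, whitespace.
  const preferences = [/\n\n/g, /\n/g, /[.!?]\s/g, /\s/g];
  for (const re of preferences) {
    const at = lastMatchEnd(head, re);
    if (at >= minChars) return at; // accept only breaks past the low watermark
  }
  return maxChars; // hard break when no softer boundary qualifies
}
```

A caller would emit `text.slice(0, pickBreak(text, min, max))` as one block and keep the remainder buffered for the next pass.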

Part IV: Multi-Channel Messaging System

Chapter 11: Channel Adapter Abstraction Layer

  • 11.1 Channel Adapter Design Pattern
    • 11.1.1 Channel core interface analysis
    • 11.1.2 Channel configuration type system (src/config/types.channels.ts)
    • 11.1.3 Channel-Gateway connection bridge
  • 11.2 Inbound Message Normalization
    • 11.2.1 Format unification: text, image, audio, video, file
    • 11.2.2 Media attachment handling (src/gateway/chat-attachments.ts)
    • 11.2.3 Message sanitization (src/gateway/chat-sanitize.ts)
  • 11.3 Outbound Message Adaptation
    • 11.3.1 Markdown formatting and channel specifics (src/markdown/)
    • 11.3.2 Reply prefix (src/channels/reply-prefix.ts)
    • 11.3.3 Conversation label (src/channels/conversation-label.ts)

Chapter 12: Core Channel Implementations Deep Dive

  • 12.1 WhatsApp Channel (Baileys)
    • 12.1.1 Baileys library overview: unofficial WhatsApp Web protocol implementation
    • 12.1.2 WhatsApp Web login and QR code pairing (src/web/login-qr.ts)
    • 12.1.3 Inbound message listener (src/web/inbound.ts)
    • 12.1.4 Outbound message sending (src/web/outbound.ts)
    • 12.1.5 Auto-reply system (src/web/auto-reply.ts and auto-reply.impl.ts)
    • 12.1.6 Session reconnection (src/web/reconnect.ts)
    • 12.1.7 Media handling (src/web/media.ts)
  • 12.2 Telegram Channel (grammY)
    • 12.2.1 grammY framework and Bot API integration
    • 12.2.2 Telegram Bot configuration and Webhook mode
    • 12.2.3 Draft streaming implementation
    • 12.2.4 Custom commands (src/config/telegram-custom-commands.ts)
  • 12.3 Discord Channel
    • 12.3.1 discord.js / Carbon library usage
    • 12.3.2 Guild management, DM strategy
    • 12.3.3 Native Slash commands and text commands
    • 12.3.4 Per-message line limit (maxLinesPerMessage)
  • 12.4 Slack Channel (Bolt)
    • 12.4.1 Slack Bolt SDK integration
    • 12.4.2 App Token + Bot Token dual-token architecture
    • 12.4.3 Thread-based session management
  • 12.5 Other Core Channels
    • 12.5.1 Signal (signal-cli)
    • 12.5.2 BlueBubbles (recommended iMessage integration)
    • 12.5.3 iMessage Legacy (macOS native imsg)
    • 12.5.4 WebChat (Gateway built-in web chat)

Chapter 13: Channel Extension Mechanism

  • 13.1 Extension Architecture Design
    • 13.1.1 Differences between extensions and core channels
    • 13.1.2 Extension loading mechanism (src/gateway/server-plugins.ts)
    • 13.1.3 Extension directory structure analysis (using extensions/msteams/ as example)
  • 13.2 Extension API Surface
    • 13.2.1 src/extensionAPI.ts — API set accessible to extensions
    • 13.2.2 Plugin SDK (src/plugin-sdk/)
    • 13.2.3 Plugin Hooks interface
  • 13.3 Representative Extension Implementations
    • 13.3.1 Microsoft Teams extension (Bot Framework integration)
    • 13.3.2 Matrix extension (matrix-sdk-crypto-nodejs)
    • 13.3.3 Twitch extension
    • 13.3.4 Nostr extension
    • 13.3.5 Google Chat extension
    • 13.3.6 Zalo / Zalo Personal extension
    • 13.3.7 Feishu/Lark extension
    • 13.3.8 LINE extension
  • 13.4 Developing Custom Extensions
    • 13.4.1 Extension scaffolding setup
    • 13.4.2 Implementing the channel adapter interface
    • 13.4.3 Registering with the plugin system
    • 13.4.4 Testing and debugging

Part V: Tool System and Automation

Chapter 14: Tool System Architecture

  • 14.1 Tool Definition and Classification
    • 14.1.1 Tool types: bash, browser, canvas, cron, sessions, nodes, channel
    • 14.1.2 Tool schema definition (src/agents/pi-tools.schema.ts)
    • 14.1.3 Tool definition adapter (src/agents/pi-tool-definition-adapter.ts)
  • 14.2 Tool Registration and Policy
    • 14.2.1 Tool registration flow (src/agents/pi-tools.ts)
    • 14.2.2 Tool Policy: allow list / deny list (src/agents/tool-policy.ts)
    • 14.2.3 Tool pre-interception (Before Tool Call) (src/agents/pi-tools.before-tool-call.ts)
    • 14.2.4 Tool display name mapping (src/agents/tool-display.json)
  • 14.3 Tool Execution and Result Handling
    • 14.3.1 Tool call ID generation (src/agents/tool-call-id.ts)
    • 14.3.2 Tool result guard (src/agents/session-tool-result-guard.ts)
    • 14.3.3 Tool image processing (src/agents/tool-images.ts)
    • 14.3.4 Tool summary generation (src/agents/tool-summaries.ts)
  • 14.4 Execution Approval Mechanism
    • 14.4.1 Execution approval manager (src/gateway/exec-approval-manager.ts)
    • 14.4.2 User approval workflow

Chapter 15: Bash Tools and Process Management

  • 15.1 Bash Execution Engine
    • 15.1.1 src/agents/bash-tools.exec.ts — Command execution core
    • 15.1.2 PTY terminal emulation (src/agents/bash-tools.exec.pty.ts)
    • 15.1.3 PTY fallback mechanism (src/agents/bash-tools.exec.pty-fallback.ts)
    • 15.1.4 PATH environment and safe bin list (src/agents/pi-tools.safe-bins.ts)
  • 15.2 Process Management
    • 15.2.1 Process tools (src/agents/bash-tools.process.ts)
    • 15.2.2 Background process registry (src/agents/bash-process-registry.ts)
    • 15.2.3 Process send-keys operations
  • 15.3 Shell Tools and Shared Infrastructure
    • 15.3.1 Shell utility set (src/agents/shell-utils.ts)
    • 15.3.2 Workspace run (src/agents/workspace-run.ts)

Chapter 16: Browser Control

  • 16.1 Browser Architecture Overview
    • 16.1.1 Playwright + CDP (Chrome DevTools Protocol) hybrid approach
    • 16.1.2 OpenClaw-managed Chrome/Chromium instances (src/browser/chrome.ts)
    • 16.1.3 Browser profile management (src/browser/profiles.ts)
  • 16.2 CDP Layer Implementation
    • 16.2.1 CDP connection management (src/browser/cdp.ts)
    • 16.2.2 CDP helper functions (src/browser/cdp.helpers.ts)
    • 16.2.3 Target ID management and tab operations (src/browser/target-id.ts)
  • 16.3 Playwright Layer Implementation
    • 16.3.1 Playwright session management (src/browser/pw-session.ts)
    • 16.3.2 AI assistance module (src/browser/pw-ai.ts)
    • 16.3.3 Role Snapshot (src/browser/pw-role-snapshot.ts)
    • 16.3.4 Playwright tool core (src/browser/pw-tools-core.ts)
  • 16.4 Browser Server
    • 16.4.1 Browser HTTP server (src/browser/server.ts)
    • 16.4.2 Server context and tab management (src/browser/server-context.ts)
    • 16.4.3 Client actions layer (src/browser/client-actions.ts)
    • 16.4.4 Browser extension relay (src/browser/extension-relay.ts)
    • 16.4.5 Bridge Server (src/browser/bridge-server.ts)

Chapter 17: Canvas and A2UI

  • 17.1 Canvas Concepts
    • 17.1.1 What is Canvas: agent-driven visual workspace
    • 17.1.2 A2UI (Agent-to-UI): AI Agent directly manipulating user interfaces
  • 17.2 Canvas Host Implementation
    • 17.2.1 Canvas server (src/canvas-host/server.ts)
    • 17.2.2 A2UI core (src/canvas-host/a2ui.ts and a2ui/)
    • 17.2.3 A2UI bundling (scripts/bundle-a2ui.sh)
  • 17.3 Canvas Tools
    • 17.3.1 push / reset / eval / snapshot operations
    • 17.3.2 Canvas surfaces on macOS / iOS / Android

Chapter 18: Cron Scheduling and Automation

  • 18.1 Cron System Design
    • 18.1.1 Cron service (src/cron/service.ts)
    • 18.1.2 Scheduling engine (src/cron/schedule.ts) and Croner library
    • 18.1.3 Cron expression parsing (src/cron/parse.ts)
    • 18.1.4 Cron normalization (src/cron/normalize.ts)
  • 18.2 Cron Job Execution
    • 18.2.1 Isolated Agent execution (src/cron/isolated-agent.ts)
    • 18.2.2 Delivery Plan (src/cron/delivery.ts)
    • 18.2.3 Run log (src/cron/run-log.ts)
    • 18.2.4 Cron storage and migration (src/cron/store.ts)
  • 18.3 Webhooks and Gmail Pub/Sub
    • 18.3.1 Webhook triggers
    • 18.3.2 Gmail Pub/Sub email hooks (src/hooks/gmail.ts)
    • 18.3.3 Hook system overview (src/hooks/hooks.ts)

Chapter 19: Node System

  • 19.1 Node Concepts
    • 19.1.1 What is a Node: remote exposure of device capabilities
    • 19.1.2 Node roles: macOS / iOS / Android / headless
    • 19.1.3 Node commands: canvas.* / camera.* / screen.record / location.get / system.run / system.notify
  • 19.2 Node Registration and Discovery
    • 19.2.1 Node registry (src/gateway/node-registry.ts)
    • 19.2.2 Node event system (src/gateway/server-node-events.ts)
    • 19.2.3 Node subscriptions (src/gateway/server-node-subscriptions.ts)
    • 19.2.4 Node command policy (src/gateway/node-command-policy.ts)
  • 19.3 Node Host Implementation
    • 19.3.1 src/node-host/runner.ts — Node command executor
    • 19.3.2 src/node-host/config.ts — Node configuration

Part VI: Memory, Skills, and Ecosystem

Chapter 20: Memory System

  • 20.1 Memory Model Design
    • 20.1.1 "Pure Markdown as Memory" design philosophy
    • 20.1.2 Memory file layout: MEMORY.md (long-term) + memory/YYYY-MM-DD.md (daily log)
    • 20.1.3 Memory manager (src/memory/manager.ts)
  • 20.2 Vector Memory Search
    • 20.2.1 Embedding engine
    • 20.2.2 Multi-provider support: OpenAI / Gemini / Voyage / Local (node-llama-cpp) (src/memory/embeddings.ts)
    • 20.2.3 SQLite storage (src/memory/sqlite.ts)
    • 20.2.4 sqlite-vec vector acceleration (src/memory/sqlite-vec.ts)
  • 20.3 Hybrid Search (BM25 + Vector)
    • 20.3.1 BM25 full-text search principles
    • 20.3.2 Hybrid retrieval implementation (src/memory/hybrid.ts)
    • 20.3.3 Score fusion strategy: weighted linear combination
  • 20.4 Advanced Memory Features
    • 20.4.1 Batch indexing (src/memory/batch-openai.ts / batch-gemini.ts)
    • 20.4.2 Embedding cache mechanism
    • 20.4.3 Session memory search (experimental)
    • 20.4.4 QMD backend (BM25 + Vector + reranking)
    • 20.4.5 Search manager (src/memory/search-manager.ts)
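
The weighted linear combination named in 20.3.3 can be sketched as follows. The sketch assumes both score lists are already normalized to [0, 1]. The `Hit` shape, the `fuseScores` name, and the default weight of 0.7 are illustrative assumptions rather than values taken from src/memory/hybrid.ts.

```typescript
interface Hit {
  id: string;
  score: number;
}

// Hypothetical fusion: final = w * vectorScore + (1 - w) * bm25Score.
// A document missing from one list contributes 0 for that list.
function fuseScores(vector: Hit[], bm25: Hit[], vectorWeight = 0.7): Hit[] {
  const fused = new Map<string, number>();
  for (const { id, score } of vector) {
    fused.set(id, vectorWeight * score);
  }
  for (const { id, score } of bm25) {
    fused.set(id, (fused.get(id) ?? 0) + (1 - vectorWeight) * score);
  }
  // Highest combined score first.
  return Array.from(fused, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}
```

With equal weights, a chunk that ranks moderately in both lists can outrank one that appears in only a single list, which is the usual motivation for combining BM25 with vector search.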

Chapter 21: Skill System

  • 21.1 Skill Platform Design
    • 21.1.1 What is a Skill: hot-pluggable Agent capability modules
    • 21.1.2 Skill types: Bundled / Managed / Workspace
    • 21.1.3 Skill loading and snapshots (src/agents/skills.ts)
  • 21.2 Skill Structure
    • 21.2.1 SKILL.md file specification
    • 21.2.2 Built-in skill catalog analysis (52 skills)
    • 21.2.3 Representative skill walkthrough: coding-agent, github, discord, canvas, weather, etc.
  • 21.3 Skill Installation and Management
    • 21.3.1 Skill installation flow (src/agents/skills-install.ts)
    • 21.3.2 Skill status management (src/agents/skills-status.ts)
    • 21.3.3 Skill CLI commands (src/cli/skills-cli.ts)
  • 21.4 ClawHub Skill Registry
    • 21.4.1 ClawHub design and functionality
    • 21.4.2 Automatic skill search and installation

Chapter 22: Hook System

  • 22.1 Internal Hooks (Gateway Hooks)
    • 22.1.1 Hook loader (src/hooks/loader.ts)
    • 22.1.2 Hook installation (src/hooks/install.ts)
    • 22.1.3 Built-in hooks (src/hooks/bundled/)
    • 22.1.4 agent:bootstrap hook
    • 22.1.5 Command hooks: /new, /reset, /stop
  • 22.2 Plugin Hooks
    • 22.2.1 Hook mapping (src/gateway/hooks-mapping.ts)
    • 22.2.2 Agent lifecycle hooks: before_agent_start / agent_end
    • 22.2.3 Tool lifecycle hooks: before_tool_call / after_tool_call / tool_result_persist
    • 22.2.4 Message hooks: message_received / message_sending / message_sent
    • 22.2.5 Gateway hooks: gateway_start / gateway_stop

Part VII: Security, Configuration, and Infrastructure

Chapter 23: Configuration System

  • 23.1 Configuration Loading and Parsing
    • 23.1.1 Config file location: ~/.openclaw/openclaw.json (JSON5 format)
    • 23.1.2 Config schema (src/config/schema.ts)
    • 23.1.3 Zod Schema validation (src/config/zod-schema.ts)
    • 23.1.4 Config IO (src/config/io.ts)
    • 23.1.5 Config path resolution (src/config/config-paths.ts)
  • 23.2 Configuration Type System Deep Dive
    • 23.2.1 Core types overview (src/config/types.ts)
    • 23.2.2 Agent config (types.agents.ts / types.agent-defaults.ts)
    • 23.2.3 Channel config (types.channels.ts / types.discord.ts / types.telegram.ts, etc.)
    • 23.2.4 Security config (types.sandbox.ts / types.auth.ts)
    • 23.2.5 Model config (types.models.ts)
    • 23.2.6 Tool/Skill/Hook config (types.tools.ts / types.skills.ts / types.hooks.ts)
  • 23.3 Configuration Hot Reload
    • 23.3.1 Config reload mechanism (src/gateway/config-reload.ts)
    • 23.3.2 Runtime overrides (src/config/runtime-overrides.ts)
    • 23.3.3 Server reload handlers (src/gateway/server-reload-handlers.ts)
  • 23.4 Legacy Configuration Migration
    • 23.4.1 Migration framework (src/config/legacy.ts)
    • 23.4.2 Migration rules (src/config/legacy.rules.ts)
    • 23.4.3 Phased migrations (legacy.migrations.part-1/2/3.ts)
  • 23.5 Environment Variables
    • 23.5.1 Environment variable substitution (src/config/env-substitution.ts)
    • 23.5.2 Environment variable list (src/config/env-vars.ts)

Chapter 24: Security Model

  • 24.1 Security Design Principles
    • 24.1.1 Inbound DMs as untrusted input
    • 24.1.2 Secure by default vs. explicit opt-in
    • 24.1.3 Prompt injection defense
  • 24.2 DM Pairing System
    • 24.2.1 Pairing flow: code generation → user approval → allowlist persistence
    • 24.2.2 Pairing CLI (src/cli/pairing-cli.ts)
    • 24.2.3 DM policy: pairing / open
  • 24.3 Sandbox Mechanism
    • 24.3.1 Sandbox design: Docker isolation for non-main sessions (src/agents/sandbox.ts)
    • 24.3.2 Sandbox configuration parsing (src/agents/sandbox/)
    • 24.3.3 Sandbox path management (src/agents/sandbox-paths.ts)
    • 24.3.4 Docker sandbox images (Dockerfile.sandbox / Dockerfile.sandbox-browser)
    • 24.3.5 Tool allow/deny lists
  • 24.4 Security Auditing
    • 24.4.1 Audit system (src/security/audit.ts)
    • 24.4.2 Filesystem audit (src/security/audit-fs.ts)
    • 24.4.3 Skill security scanner (src/security/skill-scanner.ts)
    • 24.4.4 External content security (src/security/external-content.ts)
    • 24.4.5 openclaw security audit command
  • 24.5 SOUL Security
    • 24.5.1 Malicious SOUL detection (src/hooks/soul-evil.ts)

Chapter 25: CLI Tools

  • 25.1 CLI Architecture
    • 25.1.1 Commander.js command framework (src/cli/program.ts)
    • 25.1.2 Command dispatch (src/cli/run-main.ts)
    • 25.1.3 Command formatting (src/cli/command-format.ts)
  • 25.2 Core Command Analysis
    • 25.2.1 gateway — Start/manage Gateway server
    • 25.2.2 agent — Send messages to AI Agent
    • 25.2.3 send — Send messages via channels
    • 25.2.4 channels — Channel management (login/config)
    • 25.2.5 models — Model management (list/set/scan)
    • 25.2.6 sessions — Session management
    • 25.2.7 skills — Skill management
    • 25.2.8 browser — Browser control
    • 25.2.9 cron — Scheduled task management
    • 25.2.10 nodes — Node management
    • 25.2.11 doctor — Health check and diagnostics
    • 25.2.12 security — Security audit
    • 25.2.13 onboard — Onboarding Wizard
    • 25.2.14 update — Version updates
  • 25.3 Chat Commands (Slash Commands)
    • 25.3.1 Command processing (src/commands/)
    • 25.3.2 /status / /new / /reset / /compact / /think / /verbose / /model

Chapter 26: Infrastructure

  • 26.1 Logging System
    • 26.1.1 Logging framework: tslog (src/logger.ts)
    • 26.1.2 Log levels and output (src/logging.ts / src/logging/)
    • 26.1.3 WebSocket logging (src/gateway/ws-log.ts)
  • 26.2 Media Pipeline
    • 26.2.1 Media fetching (src/media/fetch.ts)
    • 26.2.2 Media parsing (src/media/parse.ts)
    • 26.2.3 Media storage (src/media/store.ts)
    • 26.2.4 Image processing: Sharp library (src/media/image-ops.ts)
    • 26.2.5 Audio processing (src/media/audio.ts)
    • 26.2.6 MIME type detection (src/media/mime.ts)
  • 26.3 Link Understanding and Media Understanding
    • 26.3.1 Link understanding (src/link-understanding/)
    • 26.3.2 Media understanding (src/media-understanding/)
  • 26.4 TTS (Text-to-Speech)
    • 26.4.1 TTS engine (src/tts/)
    • 26.4.2 ElevenLabs / Edge TTS integration
  • 26.5 Polls System
    • 26.5.1 Polls implementation (src/polls.ts)

Part VIII: Client Apps and Web UI

Chapter 27: Web Console UI

Chapter 28: Native Client Apps

  • 28.1 macOS App (OpenClaw.app)
    • 28.1.1 Swift + SwiftUI architecture
    • 28.1.2 Menu Bar control
    • 28.1.3 Voice Wake + Push-to-Talk
    • 28.1.4 Talk Mode overlay
    • 28.1.5 Remote Gateway control
    • 28.1.6 macOS permissions and TCC
  • 28.2 iOS Node App
    • 28.2.1 Bonjour pairing
    • 28.2.2 Canvas surface
    • 28.2.3 Voice Wake / Talk Mode
    • 28.2.4 Camera and screen recording
  • 28.3 Android Node App
    • 28.3.1 Kotlin architecture
    • 28.3.2 Canvas / Camera / Screen recording
    • 28.3.3 Optional SMS support
  • 28.4 Shared Components
    • 28.4.1 apps/shared/OpenClawKit/ — Cross-platform shared code

Part IX: Deployment and Operations

Chapter 29: Deployment

  • 29.1 Local Deployment
    • 29.1.1 npm global installation
    • 29.1.2 Daemon management (launchd / systemd)
    • 29.1.3 openclaw doctor health check
  • 29.2 Docker Deployment
    • 29.2.1 Dockerfile analysis: multi-stage build, non-root user
    • 29.2.2 docker-compose.yml walkthrough
    • 29.2.3 Docker sandbox (Dockerfile.sandbox)
    • 29.2.4 Environment variable configuration
  • 29.3 Remote Access
    • 29.3.1 Tailscale Serve / Funnel configuration (src/gateway/server-tailscale.ts)
    • 29.3.2 SSH tunneling
    • 29.3.3 Service discovery (src/gateway/server-discovery.ts)
    • 29.3.4 Bonjour/mDNS (@homebridge/ciao)
  • 29.4 Nix Declarative Deployment
  • 29.5 VPS Deployment (docs/vps.md)

Chapter 30: Monitoring and Troubleshooting

  • 30.1 Health Checks
    • 30.1.1 Gateway health endpoint
    • 30.1.2 Gateway probe (src/gateway/probe.ts)
    • 30.1.3 Heartbeat mechanism
  • 30.2 Presence Tracking
    • 30.2.1 Online status management
    • 30.2.2 Typing indicators
  • 30.3 Usage Tracking
    • 30.3.1 Token usage statistics (src/agents/usage.ts)
    • 30.3.2 Usage display options: off / tokens / full
  • 30.4 Troubleshooting
    • 30.4.1 openclaw doctor diagnostic flow
    • 30.4.2 Common issues: channel disconnection, model errors, sandbox failures
    • 30.4.3 Log analysis guide

Part X: Advanced Topics and Practice

Chapter 31: Multi-Agent Architecture

  • 31.1 Multi-Agent Design
    • 31.1.1 Agent scope (src/agents/agent-scope.ts)
    • 31.1.2 Agent paths (src/agents/agent-paths.ts)
    • 31.1.3 Agent defaults (src/agents/defaults.ts)
  • 31.2 Sub-Agents
    • 31.2.1 Sub-agent registry (src/agents/subagent-registry.ts)
    • 31.2.2 Sub-agent spawning (sessions_spawn tool)
    • 31.2.3 Sub-agent announcements (src/agents/subagent-announce.ts)
    • 31.2.4 Sub-agent announcement queue (src/agents/subagent-announce-queue.ts)
  • 31.3 Multi-Agent Sandbox Tools
    • 31.3.1 Cross-agent tool isolation
    • 31.3.2 Sandbox agent configuration

Chapter 32: ACP (Agent Client Protocol)

  • 32.1 ACP Protocol Overview
    • 32.1.1 @agentclientprotocol/sdk dependency
    • 32.1.2 src/acp/ — OpenClaw's ACP implementation
  • 32.2 ACP Session Management and Runtime
    • 32.2.1 Session Store (AcpSessionStore)
    • 32.2.2 Session Key Resolution
    • 32.2.3 Session Resume
    • 32.2.4 Session Identity Tracking
    • 32.2.5 Session Metadata Persistence
    • 32.2.6 Runtime Cache and Eviction
  • 32.3 ACP Control Plane and Integration Patterns
    • 32.3.1 Control Plane Architecture
    • 32.3.2 Provenance Tracking
    • 32.3.3 Tool Streaming Enhancement
    • 32.3.4 Persistent Bindings
    • 32.3.5 Runtime Controls and Configuration
    • 32.3.6 Error Handling and Security

Chapter 33: TUI (Terminal User Interface)

  • 33.1 TUI Architecture
    • 33.1.1 @mariozechner/pi-tui integration
    • 33.1.2 src/tui/ — TUI adaptation layer
    • 33.1.3 Terminal rendering (src/terminal/)
  • 33.2 TUI Interactive Mechanisms
    • 33.2.1 Event Processing Pipeline
    • 33.2.2 Gateway Event Handling
    • 33.2.3 Command System Deep Dive
    • 33.2.4 Session Operations
    • 33.2.5 Local Shell Execution
    • 33.2.6 Keyboard Shortcuts and Input Optimization
  • 33.3 TUI Advanced Features
    • 33.3.1 Text Sanitization and Safe Rendering
    • 33.3.2 Markdown Rendering Pipeline
    • 33.3.3 Tool Execution Visualization
    • 33.3.4 Light/Dark Theme Adaptation
    • 33.3.5 Fuzzy Search and Selector
    • 33.3.6 Performance Optimization Strategies

Chapter 34: Hands-On Project: Build Your Own AI Assistant

  • 34.1 Project Planning
    • 34.1.1 Requirements analysis: MVP of a multi-channel AI assistant
    • 34.1.2 Architecture design reference
    • 34.1.3 Technology selection guide
  • 34.2 Core Feature Implementation
    • 34.2.1 Building a WebSocket Gateway skeleton
    • 34.2.2 Implementing message routing and session management
    • 34.2.3 Integrating LLM APIs (OpenAI / Anthropic)
    • 34.2.4 Implementing a basic tool system (bash execution)
  • 34.3 Channel Integration
    • 34.3.1 Implementing a Telegram Bot channel adapter
    • 34.3.2 Implementing a Discord Bot channel adapter
    • 34.3.3 Implementing a WebChat channel
  • 34.4 Advanced Features
    • 34.4.1 Adding a memory system (vector search)
    • 34.4.2 Adding a skill system
    • 34.4.3 Adding Cron scheduling
  • 34.5 Deployment and Launch
    • 34.5.1 Docker containerization
    • 34.5.2 Daemon configuration
    • 34.5.3 Security hardening

Appendices

  • Complete JSON5 configuration example
  • All configuration key quick reference
  • All request methods and parameters
  • All event types and payloads
  • Schema and usage for all built-in tools
  • Key module dependency diagrams
  • Data flow sequence diagram (message → gateway → agent → tool → response → channel)
  • Complete bilingual (Chinese-English) glossary of all technical terms used in this book

Estimated Scope

| Part | Chapters | Estimated Word Count |
|---|---|---|
| Usage Guide | 3 chapters | ~24,000 words |
| Part I: Global Overview | 2 chapters | ~8,000 words |
| Part II: Gateway Control Plane | 4 chapters | ~16,000 words |
| Part III: AI Agent Runtime | 4 chapters | ~16,000 words |
| Part IV: Multi-Channel Messaging | 3 chapters | ~12,000 words |
| Part V: Tool System & Automation | 6 chapters | ~18,000 words |
| Part VI: Memory, Skills & Ecosystem | 3 chapters | ~10,000 words |
| Part VII: Security, Config & Infra | 4 chapters | ~12,000 words |
| Part VIII: Client Apps & Web UI | 2 chapters | ~6,000 words |
| Part IX: Deployment & Operations | 2 chapters | ~6,000 words |
| Part X: Advanced Topics & Practice | 4 chapters | ~12,000 words |
| Appendices | 5 items | ~4,000 words |
| Total | 37 chapters + 5 appendices | ~144,000 words |

About

Deep Dive into OpenClaw (《深入 OpenClaw》): the first book on the internet to introduce OpenClaw, written with OpenClaw + OpenCode + Opus 4.6.
