docs: align command list with shipped adapters by erixyuan · Pull Request #5 · jackwener/OpenCLI

erixyuan · 2026-03-15T05:41:42Z

Summary

- Align README counts with the current shipped adapters (27 commands across 15 sites)
- Remove `github` command references that are not present in the current repo/runtime
- Update the public API examples in SKILL.md to use commands that actually exist

Verification

- Ran `node dist/main.js list --json`
- Confirmed the current runtime exposes 27 commands across 15 sites
- Confirmed there is no `src/clis/github/` adapter directory in the repo
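
The count check above could be scripted roughly as follows. This is a minimal sketch: the schema of the `list --json` output is not shown in this PR, so the `site` and `name` fields, the `CommandEntry` type, and the `summarize` helper are all assumptions made for illustration.

```typescript
// Hypothetical entry shape; the real `list --json` schema may differ.
interface CommandEntry {
  site: string;
  name: string;
}

// Count total commands and distinct sites in a parsed command list.
function summarize(entries: CommandEntry[]): { commands: number; sites: number } {
  const sites = new Set(entries.map((e) => e.site));
  return { commands: entries.length, sites: sites.size };
}

// Made-up sample data, not the real adapter list.
const sample: CommandEntry[] = [
  { site: "siteA", name: "hot" },
  { site: "siteA", name: "search" },
  { site: "siteB", name: "top" },
];
const summary = summarize(sample);
console.log(`${summary.commands} commands across ${summary.sites} sites`);
```

Feeding the parsed JSON from the real command into a helper like this would give the two numbers that the README is expected to match.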

jackwener · 2026-03-15T07:44:36Z

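Thank you for your contribution! This documentation update has already been included in the refactoring and consolidated documentation update for v0.2.0. To acknowledge your work, I have added you as Co-authored-by on the relevant commit. 🎉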

- #1 Fix URL injection in subtitle.ts via JSON.stringify
- #2 Remove debug console.error from production code
- #3 Delete stale test_subtitle.ts
- #4 Add --lang option for multi-language subtitle selection
- #5 Fix duplicate comment numbering (two '// 4.')
- #6 Add clickLabels targeted clicking + --click flag to explore
- #7 Move empty-value penalty into scoreEndpoint() (affects filtering)
- #8 Add cascade request code template to CLI-CREATOR.md
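
Item #1 above names a general technique: when a user-controlled value is spliced into a script string that will later be evaluated, `JSON.stringify` produces a quoted, escaped literal. The sketch below illustrates that idea only; the actual code in subtitle.ts is not shown in this PR, and both function names are hypothetical.

```typescript
// Unsafe: a quote inside `url` can end the string literal and inject code.
function buildFetchScriptUnsafe(url: string): string {
  return `fetch('${url}')`;
}

// Safer: JSON.stringify emits a double-quoted literal with escapes applied.
function buildFetchScript(url: string): string {
  return `fetch(${JSON.stringify(url)})`;
}

const hostile = "https://example.com/');alert(1);('";
console.log(buildFetchScriptUnsafe(hostile)); // alert(1) ends up as its own statement
console.log(buildFetchScript(hostile)); // the whole value stays inside one string literal
```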

Bug fixes:
- #1 /logs?level=error returned 404; use pathname for route matching
- #2 Duplicate initialization: added 'initialized' guard flag

Should fix:
- #4 Added screenshot() to IPage interface
- #5 Graceful shutdown rejects pending requests before exit
- #6 Use process.execPath instead of 'npx tsx' for faster daemon spawn

Cleanup:
- #7 Removed duplicate 'browser' keyword in package.json
- #8 Removed unused normalizeEvaluateSource import from browser.ts
- #9 Changed dynamic import to static import in intercept.ts
- #10 Added explicit throw at end of sendCommand for clarity

61 tests pass (4 test files). Extension: 10.55KB.
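
Bug fix #1 above comes down to matching routes on the URL's pathname rather than on the raw request URL, which still carries the query string. A minimal sketch of that idea, assuming a hypothetical `routeOf` helper (the daemon's real routing code is not shown here):

```typescript
// Extract the route from a request URL, ignoring any query string.
function routeOf(requestUrl: string): string {
  // The base is only needed to parse relative URLs; its value is arbitrary.
  return new URL(requestUrl, "http://localhost").pathname;
}

// Comparing the raw URL to "/logs" would fail here; the pathname matches.
console.log(routeOf("/logs?level=error") === "/logs"); // true
```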

…mpt, actions

Closes all high and medium priority gaps vs Browser Use:

Planning System (#1):
- PlanItem state machine (pending/current/done/skipped)
- LLM can output `plan` field to update/create plans
- Plan auto-advances on successful steps
- Replan nudge after 3 consecutive failures

Self-Evaluation (#3):
- New `evaluationPreviousGoal` field in AgentResponse
- Pre-done verification rules in system prompt (5-step checklist)
- `success` field on DoneAction for explicit failure signaling

Action System (#4):
- New actions: select_dropdown, switch_tab, open_tab, close_tab, search_page
- Auto-detect <select> and redirect to select_dropdown
- Element scroll (scroll within a specific element by index)
- Wait capped at 10s

Loop Detection (#5):
- SHA-256 hashed sliding window (15 steps)
- 3 severity tiers: mild (4x), strong (7x), critical (10x)
- Page fingerprint stall detection (URL + element count + DOM hash)

System Prompt (#6):
- Expanded from 65 to ~170 lines with structured sections
- Action chaining rules (page-changing vs safe)
- Reasoning pattern guidance
- Examples for evaluation, memory, planning

LLM Timeout (#7):
- Configurable `llmTimeout` (default 60s)
- Promise-based timeout wrapper

Message Compaction (#8):
- Builds structured summary of compacted messages
- Extracts URLs visited, goals achieved, past errors
- Maintains Anthropic API user/assistant alternation

AX Tree Enrichment (#9):
- Fetches accessibility role/name via CDP when available
- Enriches ElementInfo with axRole/axName
- Falls back to DOM attributes if CDP unavailable

Sensitive Data Masking (#10):
- Configurable sensitivePatterns map
- Applied to all user messages before LLM

Prompt Caching (#2):
- System prompt uses cache_control: ephemeral
- Last user message uses cache_control: ephemeral
- Token tracking includes cache_read and cache_creation

Screenshot Control (#11):
- Configurable maxScreenshotDim (default 1200px)
- Zero-size element filtering in DOM context
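
The loop detection item above (a SHA-256 hashed sliding window of 15 steps with three severity tiers) could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's implementation: the `LoopDetector` class and its method names are hypothetical, and reading "4x / 7x / 10x" as "the same action hash appears that many times within the window" is an interpretation of the commit message.

```typescript
import { createHash } from "node:crypto";

type Severity = "none" | "mild" | "strong" | "critical";

const WINDOW_SIZE = 15; // sliding window length from the commit message

class LoopDetector {
  private recent: string[] = [];

  // Record one action, keeping only the last WINDOW_SIZE hashes.
  // Note: JSON.stringify is key-order sensitive, which is fine for a sketch.
  record(action: Record<string, unknown>): void {
    const hash = createHash("sha256").update(JSON.stringify(action)).digest("hex");
    this.recent.push(hash);
    if (this.recent.length > WINDOW_SIZE) this.recent.shift();
  }

  // Severity is based on the most frequent hash currently in the window.
  severity(): Severity {
    const counts = new Map<string, number>();
    let max = 0;
    for (const h of this.recent) {
      const n = (counts.get(h) ?? 0) + 1;
      counts.set(h, n);
      if (n > max) max = n;
    }
    if (max >= 10) return "critical";
    if (max >= 7) return "strong";
    if (max >= 4) return "mild";
    return "none";
  }
}

const detector = new LoopDetector();
for (let i = 0; i < 4; i++) detector.record({ type: "click", index: 3 });
console.log(detector.severity()); // "mild"
```

Hashing the action keeps the window compact and makes comparison cheap; the page fingerprint stall check mentioned in the same item would be a separate signal and is not covered here.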

…tion, timeout

- #1 AX tree: remove dead CDP calls (DOM.getDocument + Accessibility.getFullAXTree were called but axLookup never used). Replace with single batched evaluate() that reads ARIA attributes for up to 100 elements in one call.
- #2 Loop detection: detectLoop() now uses only previously recorded state (no domContext param). Fixes off-by-one where current step wasn't yet recorded.
- #3 Message compaction: prevent consecutive user messages by merging summary into preceding user message if roles collide, and skipping duplicate roles at the tail boundary.
- #4 JS injection: all evaluate() calls now use JSON.stringify for user-controlled values (element indices, option text, scroll amounts) instead of template interpolation.
- #5 updatePlan: moved after consecutiveErrors update so plan advancement uses current step's error state, not the previous step's.
- #6 LLM timeout: pass AbortController signal to Anthropic SDK so timed-out requests are actually cancelled instead of continuing in the background.
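
Item #3 above is about keeping user and assistant roles strictly alternating after compaction. One simple way to get that property is to merge adjacent messages that share a role, sketched below. The `Message` type and `mergeConsecutiveRoles` helper are hypothetical, and the sketch covers only the merging half of the fix, not the tail-boundary skipping.

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

// Merge adjacent messages with the same role so roles strictly alternate.
function mergeConsecutiveRoles(messages: Message[]): Message[] {
  const out: Message[] = [];
  for (const msg of messages) {
    const last = out[out.length - 1];
    if (last && last.role === msg.role) {
      // Same role twice in a row: fold the text into the previous message.
      last.content += "\n\n" + msg.content;
    } else {
      out.push({ ...msg }); // copy so the input array is not mutated
    }
  }
  return out;
}

const compacted = mergeConsecutiveRoles([
  { role: "user", content: "Summary of earlier steps" },
  { role: "user", content: "Next instruction" },
  { role: "assistant", content: "Done" },
]);
console.log(compacted.map((m) => m.role).join(",")); // "user,assistant"
```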

…e turns

- Add Rule #7: minimize total tool calls (3-5 per task, not 15-20)
- Strengthen Rule #5: chain aggressively with &&
- Add explicit good/bad chaining examples
- Add click+wait+state chaining pattern
- Add type+verify chaining pattern

Before: 21 turns for complex V2EX reply task
After: 12 turns for same task (-43% turns, -28% cost)

…timization) (#717)

* feat: AutoResearch framework + V2EX test suite (40 tasks)

AutoResearch framework (Karpathy-style autonomous iteration):
- engine.ts: 8-phase loop (review → modify → commit → verify → guard → decide → log)
- config.ts: typed config + CLI parser + metric extraction
- logger.ts: TSV append-only results log
- commands/run.ts: main loop spawning Claude Code per iteration
- commands/plan.ts: interactive config wizard
- commands/fix.ts: auto-detect broken state, iteratively fix
- commands/debug.ts: hypothesis-driven debugging for failing tasks

V2EX test suite (5 layers, 40 tasks):
- L1 Atomic (10): open, state, click, scroll, eval, back, wait
- L2 Single Page (10): hot topics, node list, topic meta, pagination
- L3 Multi-Step (10): click-read, navigate-node, tab-then-topic, pagination
- L4 Write Ops (5): reply typing, favorite detection, form detection
- L5 Complex Chain (5): cross-page collect, multi-node compare, full workflow

Presets: operate-reliability, skill-quality, v2ex-reliability

* test: V2EX test suite 60/60 (fix selectors, add harder tasks)

- Fix v2ex-collect-hot-authors selector (pathname-based member link detection)
- Fix v2ex-wait-text judge (accept "appeared")
- Fix trailing commas in eval step strings
- Add 20 harder tasks: state+click interaction + long chain workflows
- Baseline: 60/60 across all layers

* docs: optimize SKILL.md for efficiency (aggressive chaining, minimize turns)

- Add Rule #7: minimize total tool calls (3-5 per task, not 15-20)
- Strengthen Rule #5: chain aggressively with &&
- Add explicit good/bad chaining examples
- Add click+wait+state chaining pattern
- Add type+verify chaining pattern

Before: 21 turns for complex V2EX reply task
After: 12 turns for same task (-43% turns, -28% cost)
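
The turn reduction quoted above follows directly from the before/after counts. A small sketch of the arithmetic, using a hypothetical `percentChange` helper (the cost figure cannot be rechecked this way, since the underlying cost numbers are not given):

```typescript
// Relative change from `before` to `after`, as a rounded percentage.
function percentChange(before: number, after: number): number {
  return Math.round(((after - before) / before) * 100);
}

// (12 - 21) / 21 = -0.4286, which rounds to -43%.
console.log(percentChange(21, 12)); // -43
```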

docs: align command list with shipped adapters

3484c19

jackwener closed this Mar 15, 2026

jackwener mentioned this pull request Apr 3, 2026

feat: AutoResearch framework + V2EX/Zhihu test suites (194/194) #731

Merged

erixyuan wants to merge 1 commit into jackwener:main from erixyuan:main
