docs: align command list with shipped adapters #5
Closed
erixyuan wants to merge 1 commit into jackwener:main from
Conversation
Owner
Thank you for your contribution! This documentation update is already in …
jackwener added a commit that referenced this pull request on Mar 15, 2026
- #1 Fix URL injection in subtitle.ts via JSON.stringify
- #2 Remove debug console.error from production code
- #3 Delete stale test_subtitle.ts
- #4 Add --lang option for multi-language subtitle selection
- #5 Fix duplicate comment numbering (two '// 4.')
- #6 Add clickLabels targeted clicking + --click flag to explore
- #7 Move empty-value penalty into scoreEndpoint() (affects filtering)
- #8 Add cascade request code template to CLI-CREATOR.md
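Item #1 above names JSON.stringify as the fix for URL injection. A minimal sketch of that idea follows, assuming the URL was previously interpolated into a script string; `buildFetchScript` and `buildFetchScriptUnsafe` are hypothetical names and are not taken from subtitle.ts.

```typescript
// Hypothetical helper: builds a script string to evaluate in the page.
// Not the actual code from subtitle.ts; it only illustrates the technique.
function buildFetchScript(url: string): string {
  // JSON.stringify emits a quoted, escaped string literal, so quotes or
  // backslashes inside `url` cannot close the literal and inject code.
  return `fetch(${JSON.stringify(url)}).then((r) => r.text())`;
}

// Unsafe variant for comparison: raw interpolation inside quotes.
function buildFetchScriptUnsafe(url: string): string {
  return `fetch("${url}").then((r) => r.text())`;
}

const hostileUrl = 'https://example.com/"+alert(1)+"';
console.log(buildFetchScriptUnsafe(hostileUrl)); // the quote escapes the literal
console.log(buildFetchScript(hostileUrl)); // the quote stays inside the literal
```

The safe variant treats the URL purely as data, which is why the same approach also works for other user-controlled values.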
jackwener added a commit that referenced this pull request on Mar 19, 2026
Bug fixes:
- #1 /logs?level=error returned 404: use pathname for route matching
- #2 Duplicate initialization: added 'initialized' guard flag

Should fix:
- #4 Added screenshot() to IPage interface
- #5 Graceful shutdown rejects pending requests before exit
- #6 Use process.execPath instead of 'npx tsx' for faster daemon spawn

Cleanup:
- #7 Removed duplicate 'browser' keyword in package.json
- #8 Removed unused normalizeEvaluateSource import from browser.ts
- #9 Changed dynamic import to static import in intercept.ts
- #10 Added explicit throw at end of sendCommand for clarity

61 tests pass (4 test files). Extension: 10.55KB.
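Bug fix #1 above can be illustrated with a small sketch: matching routes on the parsed pathname so a query string no longer causes a 404. The route table and `dispatch` function are assumptions made for illustration, not the daemon's real code.

```typescript
// Hypothetical route table keyed by pathname only.
const routes = new Map<string, (query: URLSearchParams) => string>([
  ["/logs", (query) => `logs at level ${query.get("level") ?? "all"}`],
]);

function dispatch(rawUrl: string): string {
  // Request URLs are relative, so a placeholder base is needed to parse them.
  const url = new URL(rawUrl, "http://localhost");
  // Looking up the raw string "/logs?level=error" would miss the "/logs" key.
  const handler = routes.get(url.pathname);
  return handler ? handler(url.searchParams) : "404";
}

console.log(dispatch("/logs?level=error")); // logs at level error
```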
jackwener added a commit that referenced this pull request on Mar 30, 2026
…mpt, actions

Closes all high and medium priority gaps vs Browser Use:

Planning System (#1):
- PlanItem state machine (pending/current/done/skipped)
- LLM can output `plan` field to update/create plans
- Plan auto-advances on successful steps
- Replan nudge after 3 consecutive failures

Self-Evaluation (#3):
- New `evaluationPreviousGoal` field in AgentResponse
- Pre-done verification rules in system prompt (5-step checklist)
- `success` field on DoneAction for explicit failure signaling

Action System (#4):
- New actions: select_dropdown, switch_tab, open_tab, close_tab, search_page
- Auto-detect <select> and redirect to select_dropdown
- Element scroll (scroll within a specific element by index)
- Wait capped at 10s

Loop Detection (#5):
- SHA-256 hashed sliding window (15 steps)
- 3 severity tiers: mild (4x), strong (7x), critical (10x)
- Page fingerprint stall detection (URL + element count + DOM hash)

System Prompt (#6):
- Expanded from 65 to ~170 lines with structured sections
- Action chaining rules (page-changing vs safe)
- Reasoning pattern guidance
- Examples for evaluation, memory, planning

LLM Timeout (#7):
- Configurable `llmTimeout` (default 60s)
- Promise-based timeout wrapper

Message Compaction (#8):
- Builds structured summary of compacted messages
- Extracts URLs visited, goals achieved, past errors
- Maintains Anthropic API user/assistant alternation

AX Tree Enrichment (#9):
- Fetches accessibility role/name via CDP when available
- Enriches ElementInfo with axRole/axName
- Falls back to DOM attributes if CDP unavailable

Sensitive Data Masking (#10):
- Configurable sensitivePatterns map
- Applied to all user messages before LLM

Prompt Caching (#2):
- System prompt uses cache_control: ephemeral
- Last user message uses cache_control: ephemeral
- Token tracking includes cache_read and cache_creation

Screenshot Control (#11):
- Configurable maxScreenshotDim (default 1200px)
- Zero-size element filtering in DOM context
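The Loop Detection notes in this commit message (SHA-256 hashes, a 15-step window, tiers at 4x/7x/10x) can be sketched roughly as below. The `LoopDetector` class and the rule "severity is driven by the most repeated hash in the window" are assumptions; the commit message does not spell out the exact counting rule.

```typescript
import { createHash } from "node:crypto";

type Severity = "none" | "mild" | "strong" | "critical";

const WINDOW_SIZE = 15; // sliding window length from the commit message

// Hypothetical detector; names and counting rule are illustrative only.
class LoopDetector {
  private readonly recent: string[] = [];

  record(action: Record<string, unknown>): void {
    // Hash the serialized action so the window stores fixed-size digests.
    // Note: JSON.stringify is sensitive to key order.
    const digest = createHash("sha256")
      .update(JSON.stringify(action))
      .digest("hex");
    this.recent.push(digest);
    if (this.recent.length > WINDOW_SIZE) this.recent.shift();
  }

  severity(): Severity {
    const counts = new Map<string, number>();
    let maxRepeats = 0;
    for (const digest of this.recent) {
      const n = (counts.get(digest) ?? 0) + 1;
      counts.set(digest, n);
      if (n > maxRepeats) maxRepeats = n;
    }
    if (maxRepeats >= 10) return "critical";
    if (maxRepeats >= 7) return "strong";
    if (maxRepeats >= 4) return "mild";
    return "none";
  }
}
```

Because the window is bounded, old repetitions age out once the agent starts doing something different.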
jackwener added a commit that referenced this pull request on Mar 30, 2026
…tion, timeout

- #1 AX tree: remove dead CDP calls (DOM.getDocument + Accessibility.getFullAXTree were called but axLookup never used). Replace with single batched evaluate() that reads ARIA attributes for up to 100 elements in one call.
- #2 Loop detection: detectLoop() now uses only previously recorded state (no domContext param). Fixes off-by-one where current step wasn't yet recorded.
- #3 Message compaction: prevent consecutive user messages by merging summary into preceding user message if roles collide, and skipping duplicate roles at the tail boundary.
- #4 JS injection: all evaluate() calls now use JSON.stringify for user-controlled values (element indices, option text, scroll amounts) instead of template interpolation.
- #5 updatePlan: moved after consecutiveErrors update so plan advancement uses current step's error state, not the previous step's.
- #6 LLM timeout: pass AbortController signal to Anthropic SDK so timed-out requests are actually cancelled instead of continuing in the background.
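Item #3 above concerns keeping user/assistant roles alternating after compaction. The sketch below is a simplified take on that idea: it merges any adjacent messages that share a role. The `Message` type and `enforceAlternation` function are hypothetical, and the real change may handle the tail boundary differently.

```typescript
type Role = "user" | "assistant";

interface Message {
  role: Role;
  content: string;
}

// Merge adjacent same-role messages so roles strictly alternate.
function enforceAlternation(messages: Message[]): Message[] {
  const result: Message[] = [];
  for (const message of messages) {
    const last = result[result.length - 1];
    if (last !== undefined && last.role === message.role) {
      // Roles collide: fold this message into the preceding one.
      last.content += "\n\n" + message.content;
    } else {
      result.push({ ...message }); // copy so the input is not mutated
    }
  }
  return result;
}

const compacted = enforceAlternation([
  { role: "user", content: "Original task" },
  { role: "user", content: "Summary of earlier steps" },
  { role: "assistant", content: "Next action" },
]);
console.log(compacted.map((m) => m.role)); // [ 'user', 'assistant' ]
```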
jackwener added a commit that referenced this pull request on Mar 31, 2026
…mpt, actions

Closes all high and medium priority gaps vs Browser Use:

Planning System (#1):
- PlanItem state machine (pending/current/done/skipped)
- LLM can output `plan` field to update/create plans
- Plan auto-advances on successful steps
- Replan nudge after 3 consecutive failures

Self-Evaluation (#3):
- New `evaluationPreviousGoal` field in AgentResponse
- Pre-done verification rules in system prompt (5-step checklist)
- `success` field on DoneAction for explicit failure signaling

Action System (#4):
- New actions: select_dropdown, switch_tab, open_tab, close_tab, search_page
- Auto-detect <select> and redirect to select_dropdown
- Element scroll (scroll within a specific element by index)
- Wait capped at 10s

Loop Detection (#5):
- SHA-256 hashed sliding window (15 steps)
- 3 severity tiers: mild (4x), strong (7x), critical (10x)
- Page fingerprint stall detection (URL + element count + DOM hash)

System Prompt (#6):
- Expanded from 65 to ~170 lines with structured sections
- Action chaining rules (page-changing vs safe)
- Reasoning pattern guidance
- Examples for evaluation, memory, planning

LLM Timeout (#7):
- Configurable `llmTimeout` (default 60s)
- Promise-based timeout wrapper

Message Compaction (#8):
- Builds structured summary of compacted messages
- Extracts URLs visited, goals achieved, past errors
- Maintains Anthropic API user/assistant alternation

AX Tree Enrichment (#9):
- Fetches accessibility role/name via CDP when available
- Enriches ElementInfo with axRole/axName
- Falls back to DOM attributes if CDP unavailable

Sensitive Data Masking (#10):
- Configurable sensitivePatterns map
- Applied to all user messages before LLM

Prompt Caching (#2):
- System prompt uses cache_control: ephemeral
- Last user message uses cache_control: ephemeral
- Token tracking includes cache_read and cache_creation

Screenshot Control (#11):
- Configurable maxScreenshotDim (default 1200px)
- Zero-size element filtering in DOM context
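The Sensitive Data Masking notes in this commit message mention a configurable `sensitivePatterns` map applied to user messages. A minimal sketch of how such a map could be applied is below; the map's shape (label to regular expression), the placeholder format, and `maskSensitive` are all assumptions.

```typescript
// Assumed shape: placeholder label -> pattern to redact.
const sensitivePatterns: Record<string, RegExp> = {
  EMAIL: /[\w.+-]+@[\w-]+\.[\w.]+/,
  CARD: /\b\d{4}-\d{4}-\d{4}-\d{4}\b/,
};

// Hypothetical helper: replace every match with a labeled placeholder
// before the text is sent to the model.
function maskSensitive(text: string, patterns: Record<string, RegExp>): string {
  let masked = text;
  for (const [label, pattern] of Object.entries(patterns)) {
    // Rebuild with the global flag so all occurrences are replaced.
    const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
    masked = masked.replace(new RegExp(pattern.source, flags), `<${label}>`);
  }
  return masked;
}

console.log(maskSensitive("mail me at a@b.com", sensitivePatterns));
// mail me at <EMAIL>
```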
jackwener added a commit that referenced this pull request on Mar 31, 2026
…tion, timeout

- #1 AX tree: remove dead CDP calls (DOM.getDocument + Accessibility.getFullAXTree were called but axLookup never used). Replace with single batched evaluate() that reads ARIA attributes for up to 100 elements in one call.
- #2 Loop detection: detectLoop() now uses only previously recorded state (no domContext param). Fixes off-by-one where current step wasn't yet recorded.
- #3 Message compaction: prevent consecutive user messages by merging summary into preceding user message if roles collide, and skipping duplicate roles at the tail boundary.
- #4 JS injection: all evaluate() calls now use JSON.stringify for user-controlled values (element indices, option text, scroll amounts) instead of template interpolation.
- #5 updatePlan: moved after consecutiveErrors update so plan advancement uses current step's error state, not the previous step's.
- #6 LLM timeout: pass AbortController signal to Anthropic SDK so timed-out requests are actually cancelled instead of continuing in the background.
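Item #6 above distinguishes a timeout that merely stops waiting from one that cancels the request. The sketch below shows the cancelling form with an AbortController; `withTimeout` and `slowCall` are hypothetical, and `slowCall` stands in for a real SDK request that accepts an abort signal.

```typescript
// Generic timeout wrapper: aborts the underlying work on timeout rather
// than only ignoring its result. `run` must forward the signal.
async function withTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await run(controller.signal);
  } finally {
    clearTimeout(timer); // no stray timer after success or failure
  }
}

// Hypothetical stand-in for an LLM request that honors the abort signal.
function slowCall(signal: AbortSignal, delayMs: number): Promise<string> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve("response"), delayMs);
    signal.addEventListener(
      "abort",
      () => {
        clearTimeout(timer); // the pending work is actually cancelled
        reject(new Error("request aborted"));
      },
      { once: true },
    );
  });
}
```

A plain `Promise.race` against a timer would return early too, but the losing request would keep running in the background, which is the behavior the commit describes removing.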
jackwener added a commit that referenced this pull request on Apr 2, 2026
…e turns

- Add Rule #7: minimize total tool calls (3-5 per task, not 15-20)
- Strengthen Rule #5: chain aggressively with &&
- Add explicit good/bad chaining examples
- Add click+wait+state chaining pattern
- Add type+verify chaining pattern

Before: 21 turns for complex V2EX reply task
After: 12 turns for same task (-43% turns, -28% cost)
jackwener added a commit that referenced this pull request on Apr 3, 2026
…timization) (#717)

* feat: AutoResearch framework + V2EX test suite (40 tasks)

AutoResearch framework (Karpathy-style autonomous iteration):
- engine.ts: 8-phase loop (review → modify → commit → verify → guard → decide → log)
- config.ts: typed config + CLI parser + metric extraction
- logger.ts: TSV append-only results log
- commands/run.ts: main loop spawning Claude Code per iteration
- commands/plan.ts: interactive config wizard
- commands/fix.ts: auto-detect broken state, iteratively fix
- commands/debug.ts: hypothesis-driven debugging for failing tasks

V2EX test suite (5 layers, 40 tasks):
- L1 Atomic (10): open, state, click, scroll, eval, back, wait
- L2 Single Page (10): hot topics, node list, topic meta, pagination
- L3 Multi-Step (10): click-read, navigate-node, tab-then-topic, pagination
- L4 Write Ops (5): reply typing, favorite detection, form detection
- L5 Complex Chain (5): cross-page collect, multi-node compare, full workflow

Presets: operate-reliability, skill-quality, v2ex-reliability

* test: V2EX test suite 60/60 (fix selectors, add harder tasks)

- Fix v2ex-collect-hot-authors selector (pathname-based member link detection)
- Fix v2ex-wait-text judge (accept "appeared")
- Fix trailing commas in eval step strings
- Add 20 harder tasks: state+click interaction + long chain workflows
- Baseline: 60/60 across all layers

* docs: optimize SKILL.md for efficiency (aggressive chaining, minimize turns)

- Add Rule #7: minimize total tool calls (3-5 per task, not 15-20)
- Strengthen Rule #5: chain aggressively with &&
- Add explicit good/bad chaining examples
- Add click+wait+state chaining pattern
- Add type+verify chaining pattern

Before: 21 turns for complex V2EX reply task
After: 12 turns for same task (-43% turns, -28% cost)
just-buer pushed a commit to just-buer/opencli that referenced this pull request on Apr 8, 2026
…timization) (jackwener#717)

* feat: AutoResearch framework + V2EX test suite (40 tasks)

AutoResearch framework (Karpathy-style autonomous iteration):
- engine.ts: 8-phase loop (review → modify → commit → verify → guard → decide → log)
- config.ts: typed config + CLI parser + metric extraction
- logger.ts: TSV append-only results log
- commands/run.ts: main loop spawning Claude Code per iteration
- commands/plan.ts: interactive config wizard
- commands/fix.ts: auto-detect broken state, iteratively fix
- commands/debug.ts: hypothesis-driven debugging for failing tasks

V2EX test suite (5 layers, 40 tasks):
- L1 Atomic (10): open, state, click, scroll, eval, back, wait
- L2 Single Page (10): hot topics, node list, topic meta, pagination
- L3 Multi-Step (10): click-read, navigate-node, tab-then-topic, pagination
- L4 Write Ops (5): reply typing, favorite detection, form detection
- L5 Complex Chain (5): cross-page collect, multi-node compare, full workflow

Presets: operate-reliability, skill-quality, v2ex-reliability

* test: V2EX test suite 60/60 (fix selectors, add harder tasks)

- Fix v2ex-collect-hot-authors selector (pathname-based member link detection)
- Fix v2ex-wait-text judge (accept "appeared")
- Fix trailing commas in eval step strings
- Add 20 harder tasks: state+click interaction + long chain workflows
- Baseline: 60/60 across all layers

* docs: optimize SKILL.md for efficiency (aggressive chaining, minimize turns)

- Add Rule #7: minimize total tool calls (3-5 per task, not 15-20)
- Strengthen Rule #5: chain aggressively with &&
- Add explicit good/bad chaining examples
- Add click+wait+state chaining pattern
- Add type+verify chaining pattern

Before: 21 turns for complex V2EX reply task
After: 12 turns for same task (-43% turns, -28% cost)
Summary
- … (27 commands across 15 sites)
- … `github` command references that are not present in the current repo/runtime
- … `SKILL.md` to use commands that actually exist

Verification
- `node dist/main.js list --json`
- … `src/clis/github/` adapter directory in the repo