feat(autoresearch): improve operate success rate + complex publish chains#753
Merged
New eval-publish.ts tests end-to-end content creation via operate commands:
- 7 tasks: 5 fill-only (safe) + 2 publish (post + delete)
- Twitter: compose fill, reply fill, post+delete, cross-site HN→tweet
- Zhihu: answer fill, article fill (title+body), cross-site HN→answer
- Supports --type fill-only/publish and --platform twitter/zhihu filters
- Cleanup steps auto-delete published content after verification
- fill-only: 5/5 passing
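The --type and --platform filters described above could work roughly as sketched below. This is only an illustration: the `PublishTask` shape and the `parseFilters` and `selectTasks` helpers are hypothetical, since the actual schema of publish-tasks.json and the argument parsing in eval-publish.ts are not shown in this PR.

```typescript
// Hypothetical task shape; the real publish-tasks.json schema is not shown in the PR.
type TaskType = "fill-only" | "publish";
type Platform = "twitter" | "zhihu";

interface PublishTask {
  name: string;
  type: TaskType;
  platform: Platform;
}

interface Filters {
  type?: TaskType;
  platform?: Platform;
}

// Read `--type <value>` and `--platform <value>` from an argv-style array.
// Unknown values are ignored, so the corresponding filter stays unset.
function parseFilters(argv: string[]): Filters {
  const filters: Filters = {};
  for (let i = 0; i < argv.length - 1; i++) {
    const value = argv[i + 1];
    if (argv[i] === "--type" && (value === "fill-only" || value === "publish")) {
      filters.type = value;
    }
    if (argv[i] === "--platform" && (value === "twitter" || value === "zhihu")) {
      filters.platform = value;
    }
  }
  return filters;
}

// Keep only the tasks that match every filter that was set.
function selectTasks(tasks: PublishTask[], filters: Filters): PublishTask[] {
  return tasks.filter(
    (task) =>
      (filters.type === undefined || task.type === filters.type) &&
      (filters.platform === undefined || task.platform === filters.platform),
  );
}
```

With no flags, every task is selected; passing `--type fill-only` would restrict a run to the tasks that never post anything.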
…ains
Iteration round 1 results:
- Browse: 50/59 → 58/59 (+8); fixed 8 broken selectors, 1 remaining (DDG images anti-crawl)
- Publish fill-only: 5/5 → 12/13 → 13/13; added 8 complex tasks, fixed selectors
- Save as CLI: 26/26 (maintained)
Changes:
- browse-tasks.json: fix 8 broken selectors (iana, github, quotes, trending, google, wiki, npm, httpbin)
- publish-tasks.json: add 8 complex multi-step tasks (thread compose, quote RT, search→reply, cross-platform)
- skills/opencli-operate/SKILL.md: add Common Pitfalls section, improve save-as-CLI guidance
- Fix twitter thread compose (use querySelectorAll for 2nd textarea)
- Fix zhihu editor selectors (WriteIndex-titleInput, contenteditable)
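The thread compose fix relies on the fact that `querySelector` returns only the first match, so a second editor in a thread has to be reached through `querySelectorAll` and an index. A minimal sketch of that idea follows; the `nthMatch` helper and the `QueryRoot` type are hypothetical and are not taken from the PR's code.

```typescript
// Minimal structural type so the sketch runs outside a browser.
// A browser `document` is expected to satisfy it.
interface QueryRoot<T> {
  querySelectorAll(selector: string): ArrayLike<T>;
}

// Return the match at `index` (0-based), or null when there are too few matches.
function nthMatch<T>(root: QueryRoot<T>, selector: string, index: number): T | null {
  const matches = root.querySelectorAll(selector);
  if (index < 0 || index >= matches.length) return null;
  return matches[index] ?? null;
}
```

In a page context this might be called as `nthMatch(document, "textarea", 1)` to reach the second editor, where `"textarea"` is a placeholder selector rather than the one the PR uses. Returning null instead of throwing lets the caller report a missing element as a failed step.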
just-buer pushed a commit to just-buer/opencli that referenced this pull request on Apr 8, 2026
…ains (jackwener#753)
* chore(autoresearch): format save-tasks.json
* feat(autoresearch): add Layer 5 Publish testing for twitter/zhihu
  New eval-publish.ts tests end-to-end content creation via operate commands:
  - 7 tasks: 5 fill-only (safe) + 2 publish (post + delete)
  - Twitter: compose fill, reply fill, post+delete, cross-site HN→tweet
  - Zhihu: answer fill, article fill (title+body), cross-site HN→answer
  - Supports --type fill-only/publish and --platform twitter/zhihu filters
  - Cleanup steps auto-delete published content after verification
  - fill-only: 5/5 passing
* feat(autoresearch): improve operate success rate + complex publish chains
  Iteration round 1 results:
  - Browse: 50/59 → 58/59 (+8); fixed 8 broken selectors, 1 remaining (DDG images anti-crawl)
  - Publish fill-only: 5/5 → 12/13 → 13/13; added 8 complex tasks, fixed selectors
  - Save as CLI: 26/26 (maintained)
  Changes:
  - browse-tasks.json: fix 8 broken selectors (iana, github, quotes, trending, google, wiki, npm, httpbin)
  - publish-tasks.json: add 8 complex multi-step tasks (thread compose, quote RT, search→reply, cross-platform)
  - skills/opencli-operate/SKILL.md: add Common Pitfalls section, improve save-as-CLI guidance
  - Fix twitter thread compose (use querySelectorAll for 2nd textarea)
  - Fix zhihu editor selectors (WriteIndex-titleInput, contenteditable)
Summary
- skills/opencli-operate/SKILL.md: Common Pitfalls section, selector fallback guidance, save-as-CLI workflow docs
Test Results
Key changes
- autoresearch/eval-publish.ts: publish test harness (fill-only + publish+cleanup)
- autoresearch/publish-tasks.json: 15 tasks covering twitter/zhihu/cross-platform
- autoresearch/browse-tasks.json: fix 8 broken selectors
- skills/opencli-operate/SKILL.md: Common Pitfalls, selector best practices, wait variants
Test plan
- npx tsx autoresearch/eval-browse.ts → 56-58/59
- npx tsx autoresearch/eval-save.ts → 26/26
- npx tsx autoresearch/eval-publish.ts --type fill-only → 12-13/13
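The expected ranges in the test plan can be read as regression baselines: a run is acceptable if its pass count reaches the lower bound (56 for browse, 26 for save, 12 for publish fill-only). A rough sketch of such a check is below. It assumes each eval script's result is available as a `passed/total` string; the scripts' actual output format is not shown in this PR, and `parseScore` and `meetsBaseline` are hypothetical helpers.

```typescript
interface Score {
  passed: number;
  total: number;
}

// Parse a "passed/total" string such as "58/59".
// Returns null for malformed input or when passed exceeds total.
function parseScore(text: string): Score | null {
  const match = /^(\d+)\/(\d+)$/.exec(text.trim());
  if (!match) return null;
  const passed = Number(match[1]);
  const total = Number(match[2]);
  return passed <= total ? { passed, total } : null;
}

// True when the run reaches the lower bound of the expected range.
function meetsBaseline(score: Score, minPassed: number): boolean {
  return score.passed >= minPassed;
}
```

For the browse suite, for example, a score of `58/59` checked against a minimum of 56 would pass, while `55/59` would be flagged as a regression.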