feat(jd,taobao,cnki): add JD, Taobao, and CNKI adapters #248

Merged
jackwener merged 1 commit into jackwener:main from Muuuun:feat/jd-taobao-cnki-adapters
Apr 8, 2026

@Muuuun (Contributor) commented Mar 22, 2026

Summary

  • Add 12 new adapters for JD (京东), Taobao (淘宝), and CNKI (知网)
  • Both JD and Taobao have complete shopping workflows: search, detail, reviews, add-to-cart, cart

Prerequisites

All commands in this PR are browser commands (Strategy.COOKIE). They require:

  1. Browser Bridge extension installed in Chrome
  2. Run commands via opencli in the terminal (not by opening URLs directly in the browser)
  3. Taobao and JD additionally require login in the automation browser window

JD (京东) — 5 commands

| Command | Description |
| --- | --- |
| jd search <query> | Product search (title, price, shop, SKU) |
| jd detail <sku> | Product detail (ratings, review tags, shop) |
| jd reviews <sku> | User review extraction |
| jd add-cart <sku> | Add to cart via cart.jd.com/gate.action (--dry-run supported) |
| jd cart | View cart contents via JD cart API |

JD uses fully obfuscated CSS classes — extraction uses div[data-sku] attributes and text pattern matching.
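
As a rough illustration of the text-pattern approach described above, the sketch below parses a price out of a product card's visible text instead of relying on class names. The helper name `parseJdCard`, the `JdCard` shape, and the sample text are hypothetical and not taken from the PR; in the page, the SKU and text would come from elements matched by `div[data-sku]`.

```typescript
// Hypothetical sketch: not the adapter's actual code.
// In the browser, each card would come from
//   document.querySelectorAll('div[data-sku]')
// with sku = el.getAttribute('data-sku') and text = el.textContent.
interface JdCard {
  sku: string;
  price: number | null;
}

function parseJdCard(sku: string, text: string): JdCard {
  // Match a yuan sign (half-width or full-width) followed by a decimal number.
  const m = text.match(/[¥￥]\s*(\d+(?:\.\d{1,2})?)/);
  return { sku, price: m ? Number(m[1]) : null };
}

const card = parseJdCard('100012043978', 'Example phone ¥1299.00 Example shop');
console.log(card.price); // 1299
```

Matching on attributes and text keeps the extraction independent of class names, at the cost of depending on how prices are formatted in the page text.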

Taobao (淘宝) — 5 commands

| Command | Description |
| --- | --- |
| taobao search <query> | Product search with sort options (default/sale/price) |
| taobao detail <id> | Product detail (title, price, shop, location) |
| taobao reviews <id> | User review extraction |
| taobao add-cart <id> | Add to cart via button click (--dry-run supported) |
| taobao cart | View cart contents |

Taobao uses obfuscated CSS with semantic prefixes (e.g. title--xxx, priceInt--xxx, realSales--xxx). The adapter matches via [class*="prefix--"] selectors. Item IDs are extracted from data-spm-act-id attributes.
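
A minimal sketch of the prefix-matching idea, assuming class names of the form `prefix--hash`. Both helper names are hypothetical and not from the PR. Note that `[class*="title--"]` is a substring match over the whole class attribute, so it would also match a class such as `subtitle--abc`; the second helper shows a stricter per-token check that could be applied after selection.

```typescript
// Hypothetical helpers: illustrative only.

// Build the attribute selector for a semantic prefix, e.g. "title".
function prefixSelector(prefix: string): string {
  if (!/^[A-Za-z][A-Za-z0-9]*$/.test(prefix)) {
    throw new Error(`unexpected prefix: ${JSON.stringify(prefix)}`);
  }
  return `[class*="${prefix}--"]`;
}

// Stricter check: true only if some class token starts with "prefix--".
function hasPrefixedClass(className: string, prefix: string): boolean {
  return className.split(/\s+/).some((token) => token.startsWith(`${prefix}--`));
}

console.log(prefixSelector('priceInt')); // [class*="priceInt--"]
```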

Note: Taobao requires login in the automation window.

CNKI (知网) — 1 command

| Command | Description |
| --- | --- |
| cnki search <query> | Chinese academic paper search via oversea.cnki.net |

Changes since review

Addressed all feedback from @Astro-Han:

  1. Fix JS injection — added /^\d+$/ validation for sku/id before interpolation into page.evaluate; query length validation for search commands
  2. --dry-run for add-cart — both jd add-cart and taobao add-cart now support --dry-run to preview without modifying the cart
  3. JSONP cleanup — refactored taobao/reviews.ts with a settled guard and cleanup() to reliably remove callback + script element on all paths
  4. Two-step navigation comments — all 5 Taobao adapters now explain why goto(taobao.com) → location.href is needed (session cookie establishment)
  5. Hardcoded region documented: jd/cart.ts API area parameter documented as known limitation (Beijing)
  6. E2E tests added — 12 new test cases in browser-auth.test.ts (5 JD + 5 Taobao + 1 CNKI + 1 CNKI graceful failure)

Test plan

  • npx tsc --noEmit — type check passed
  • npx vitest run src/ — all 306 unit tests passed
  • opencli validate — 86 CLI definitions validated, 0 errors
  • JD: search ✅, detail ✅, reviews ✅, add-cart ✅, cart ✅
  • Taobao: search ✅, detail ✅
  • CNKI: search ✅

🤖 Generated with Claude Code

@Astro-Han (Contributor) left a comment

Impressive scope — complete shopping workflows for both JD and Taobao, including spec selection. A few concerns, some more serious than others:

page.evaluate injection — user input interpolated into JS strings

Several adapters embed kwargs.sku / kwargs.id directly inside page.evaluate template strings without validation:

// jd/detail.ts
{ field: 'SKU', value: '${kwargs.sku}' }

// taobao/detail.ts
location.href = 'https://item.taobao.com/item.htm?id=${kwargs.id}'

If the value contains a single quote or backtick, it breaks the script; a crafted value can inject arbitrary JS in the page's authenticated context. Since sku and id are always numeric, a simple guard like if (!/^\d+$/.test(kwargs.sku)) throw ... before interpolation would close this.

Affected files: jd/detail.ts, jd/add-cart.ts, taobao/detail.ts, taobao/reviews.ts, taobao/add-cart.ts, taobao/search.ts.
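
The guard suggested above could look roughly like the following. This is a sketch under assumptions: `assertNumericId` and `buildDetailScript` are hypothetical names, and how the resulting string is handed to `page.evaluate` is left out.

```typescript
// Hypothetical sketch of validating an id before it is interpolated
// into a script string.
function assertNumericId(value: unknown, name: string): string {
  const s = String(value);
  // Only digits are allowed, so quotes, backticks, and ${...} cannot get through.
  if (!/^\d+$/.test(s)) {
    throw new Error(`${name} must be numeric, got ${JSON.stringify(s)}`);
  }
  return s;
}

function buildDetailScript(id: unknown): string {
  const safeId = assertNumericId(id, 'id');
  return `location.href = 'https://item.taobao.com/item.htm?id=${safeId}'`;
}

console.log(buildDetailScript('654321'));
```

For free-text values such as a search query, a digit-only check does not apply; one common option is to embed the value with `JSON.stringify(query)`, which yields a quoted and escaped string literal.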

add-cart — write operations with no dry-run

Both jd/add-cart and taobao/add-cart modify real shopping carts on execution. taobao/add-cart also auto-selects the first available spec when --spec is omitted, which could surprise users. Consider adding a --dry-run flag that shows what would be added without committing the action.
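
One possible shape for such a flag is sketched below. All names here (`AddCartOptions`, `addCart`, the `commit` callback) are hypothetical; the real adapters may structure this differently. The point is that spec resolution runs in both modes, while the write happens only when `dryRun` is false.

```typescript
// Hypothetical sketch: not the adapter's actual code.
interface AddCartOptions {
  id: string;
  spec?: string;
  dryRun: boolean;
}

interface AddCartResult {
  status: 'dry-run' | 'added';
  id: string;
  spec: string | null;
  specAutoSelected: boolean; // surfaced so the user can see an implicit choice
}

function addCart(
  opts: AddCartOptions,
  availableSpecs: string[],
  commit: (id: string, spec: string | null) => void,
): AddCartResult {
  const specAutoSelected = opts.spec === undefined && availableSpecs.length > 0;
  const spec = opts.spec ?? availableSpecs[0] ?? null;
  // The only side effect is guarded by the flag.
  if (!opts.dryRun) commit(opts.id, spec);
  return { status: opts.dryRun ? 'dry-run' : 'added', id: opts.id, spec, specAutoSelected };
}

const preview = addCart({ id: '123', dryRun: true }, ['red', 'blue'], () => {
  throw new Error('must not be called in dry-run mode');
});
console.log(preview.status, preview.spec); // dry-run red
```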

taobao/reviews.ts — JSONP script injection

The JSONP callback creates a global window[cbName] and injects a <script> tag, but neither the script element nor the callback is reliably cleaned up on all paths (success, error, timeout). The 10s timeout and callback deletion can also race. Minor, but worth a cleanup pass.
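
A settle-once guard is one way to make the cleanup deterministic. The sketch below is DOM-independent and uses hypothetical names; in the browser, `cleanup` would be the place to delete `window[cbName]`, remove the `<script>` element, and clear the timeout. Whichever of success, error, or timeout fires first wins, and later calls are ignored.

```typescript
// Hypothetical sketch of a settle-once guard: illustrative only.
function createSettleGuard<T>(
  cleanup: () => void,
  resolve: (value: T) => void,
  reject: (error: Error) => void,
) {
  let settled = false;
  const finish = (action: () => void): void => {
    if (settled) return; // a late callback or timeout is a no-op
    settled = true;
    cleanup(); // runs exactly once, on every path
    action();
  };
  return {
    succeed: (value: T) => finish(() => resolve(value)),
    fail: (error: Error) => finish(() => reject(error)),
  };
}

// Usage: the JSONP callback would call succeed, while the script's
// onerror handler and the timeout would call fail.
let cleanups = 0;
const guard = createSettleGuard<string>(
  () => { cleanups++; },
  (v) => console.log('resolved:', v),
  (e) => console.log('rejected:', e.message),
);
guard.succeed('payload');
guard.fail(new Error('timeout')); // ignored: already settled
console.log(cleanups); // 1
```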

Taobao two-step navigation

All 5 Taobao adapters do goto('https://www.taobao.com') → wait(2) → evaluate location.href = target. This is presumably to establish session cookies before navigating to item pages. A brief comment explaining the rationale would help future maintainers understand whether this can be simplified.

jd/cart.ts — hardcoded delivery region

The cart API URL includes area=22_1930_50948_52157, which locks prices/availability to a specific region. Worth documenting as a known limitation or making it configurable.
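
If the region were made configurable, a small helper along these lines could keep the current value as the default. This is a sketch under assumptions: the function name is hypothetical, and the "four underscore-separated numeric segments" format is inferred only from the single value quoted above, not from any JD documentation.

```typescript
// Hypothetical sketch: not the adapter's actual code.
// Default taken from the value currently hardcoded in the cart URL.
const DEFAULT_AREA = '22_1930_50948_52157';

function areaQueryParam(area?: string): string {
  const value = area ?? DEFAULT_AREA;
  // Assumed format: four numeric segments joined by underscores.
  if (!/^\d+(_\d+){3}$/.test(value)) {
    throw new Error(`invalid area: ${JSON.stringify(value)}`);
  }
  return `area=${encodeURIComponent(value)}`;
}

console.log(areaQueryParam()); // area=22_1930_50948_52157
```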

Tests

This is the second consecutive PR (after #243) with zero test coverage. 11 new commands — including 2 write operations — with no E2E entries. Per TESTING.md, browser+auth commands should have entries in browser-auth.test.ts (at minimum verifying graceful failure when not logged in). The add-cart commands especially need test coverage to ensure they don't silently "succeed" in unauthenticated sessions.

@W0rry628

Could you submit a separate PR for CNKI? From what I can see, he only responded to your JD and Taobao requests.

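@W0rry628

Hello! I finished testing the CNKI adapter and found that even CNKI's search pages are temporary, so simply opening https://oversea.cnki.net/kns/search?dbcode=CFLS&kw=${query}&korder=SU
does not work at all: it shows a 404 "page not found" error. Unfortunately.
(image attached)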

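@Muuuun (Contributor, Author) commented Mar 27, 2026

@W0rry628 Thanks for testing!

The CNKI adapter has to run in Chrome through opencli plus the Browser Bridge extension; opening the URL directly in the browser address bar will not work. CNKI's search results page is rendered dynamically and needs a full browser JS environment to load.

I just ran opencli cnki search "机器学习" --limit 3 (the query means "machine learning") and it returned results normally.

Usage:

  1. Install the Browser Bridge extension in Chrome
  2. Run opencli cnki search "keyword" in the terminal

The PR description did not previously mention this prerequisite; I have now added it.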

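@W0rry628

@Muuuun
Sorry, after a full round of testing yesterday I found that the problem was on my side, but I did not get around to reporting it.
When the overseas version of CNKI detects a login from a mainland China IP, it automatically redirects to the kns prefix (rather than oversea), so running opencli returns a 404. Results only come back normally after I enable a proxy to reach CNKI.
That raises a new problem, though: the proxy changes my IP, so I cannot use my on-campus login entitlement to download papers, and once I turn the proxy off the original link returns 404 again. It seems to be a deadlock.
I have not yet found a way to link a personal account to the campus account either.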

@jackwener jackwener force-pushed the feat/jd-taobao-cnki-adapters branch from d8168fe to fc5752b Compare April 8, 2026 18:41
@jackwener jackwener merged commit b097495 into jackwener:main Apr 8, 2026