-
Notifications
You must be signed in to change notification settings - Fork 13
Fix#28 #31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix#28 #31
Conversation
Reviewer's GuideThis PR adds Chinese documentation and overhauls the PyPI search implementation in PackageManager: switching from HTML scraping to the PyPI Simple v1 JSON API, adding compatibility handling, match-priority sorting, pagination, and parallel metadata fetching with cancellation support. Sequence diagram for new PyPI search process with cancellation and prioritizationsequenceDiagram
actor User
participant "PackageManager"
participant "PyPI Simple API"
participant "PyPI Metadata API"
User->>PackageManager: searchFromPyPi(keyword, page, cancelToken)
PackageManager->>"PyPI Simple API": GET /simple/ (Accept: vnd.pypi.simple.v1+json)
"PyPI Simple API"-->>PackageManager: JSON index of projects
PackageManager->>PackageManager: Filter and sort projects by match priority
PackageManager->>"PyPI Metadata API": parallel GET /pypi/<name>/json for each result
"PyPI Metadata API"-->>PackageManager: Metadata for each package
PackageManager->>User: Return paginated, prioritized results
User-->>PackageManager: (optional) Cancel search
PackageManager->>"PyPI Simple API": Cancel outstanding requests
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- Pulling the entire /simple index on each search may be too heavy—consider using PyPI’s official JSON search API or caching the simple index to avoid performance bottlenecks.
- You’re firing off parallel axios requests for every package on the page; introduce a concurrency limit (or make pageSize a configurable constant) to avoid overwhelming the network.
- The change from necessaryPackage.includes(name) to indexOf !== -1 reduces readability—consider reverting to includes() for clearer intent.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Pulling the entire /simple index on each search may be too heavy—consider using PyPI’s official JSON search API or caching the simple index to avoid performance bottlenecks.
- You’re firing off parallel axios requests for every package on the page; introduce a concurrency limit (or make pageSize a configurable constant) to avoid overwhelming the network.
- The change from necessaryPackage.includes(name) to indexOf !== -1 reduces readability—consider reverting to includes() for clearer intent.
## Individual Comments
### Comment 1
<location> `src/modules/PackageManager.ts:320-329` </location>
<code_context>
+ // indexResp.data 可能是 { projects: [...] } 或数组 - 两种格式都做兼容处理
</code_context>
<issue_to_address>
**suggestion:** The compatibility logic for indexResp.data is broad and may mask unexpected API changes.
Consider adding a warning or log when indexResp.data does not match the expected format to help detect upstream API changes early.
</issue_to_address>
### Comment 2
<location> `src/modules/PackageManager.ts:377-386` </location>
<code_context>
+ await Promise.all(pageItems.map(async (name) => {
</code_context>
<issue_to_address>
**suggestion (performance):** Unthrottled parallel requests for package metadata may hit rate limits or degrade performance.
Implement a concurrency limit or throttling to reduce the risk of rate limiting and performance issues, particularly if pageSize increases.
Suggested implementation:
```typescript
import pLimit from 'p-limit';
const pageSize = 20; // 每页大小,保持合理以防并发请求过多
const totalPages = Math.max(1, Math.ceil(matched.length / pageSize));
const currentPage = Math.max(1, Math.min(page, totalPages));
const start = (currentPage - 1) * pageSize;
const pageItems = matched.slice(start, start + pageSize);
// 为当前页的包并发获取元数据(支持取消),限制最大并发数以防止被限流
const list: PackagePickItem[] = [];
const limit = pLimit(5); // 最大并发数可根据实际情况调整
await Promise.all(pageItems.map((name) =>
limit(async () => {
// 如果外部触发了取消(CancellationToken),axiosCancelToken.token.reason 会被设置
if (axiosCancelToken.token.reason) { return; }
try {
const metaResp = await axios.get(`https://pypi.org/pypi/${encodeURIComponent(name)}/json`, {
cancelToken: axiosCancelToken.token,
});
const info = metaResp.data && metaResp.data.info ? metaResp.data.info : undefined;
const version = info?.version || '';
const summary = info?.summary || '';
list.push({
} catch (err) {
// 错误处理逻辑(可选)
}
})
));
```
1. You may need to install `p-limit` if it's not already present: `npm install p-limit`.
2. If your project uses ES modules, ensure the import syntax matches your configuration.
3. If you want to customize the concurrency, adjust the value passed to `pLimit`.
</issue_to_address>
### Comment 3
<location> `src/modules/PackageManager.ts:379` </location>
<code_context>
+ const list: PackagePickItem[] = [];
+ await Promise.all(pageItems.map(async (name) => {
+ // 如果外部触发了取消(CancellationToken),axiosCancelToken.token.reason 会被设置
+ if (axiosCancelToken.token.reason) { return; }
+ try {
+ const metaResp = await axios.get(`https://pypi.org/pypi/${encodeURIComponent(name)}/json`, {
</code_context>
<issue_to_address>
**issue (bug_risk):** Checking axiosCancelToken.token.reason for cancellation may not be robust across axios versions.
This approach may break in future axios releases. Use axios's documented cancellation methods, such as catching axios.Cancel, for consistent behavior.
</issue_to_address>
### Comment 4
<location> `src/modules/PackageManager.ts:336-150` </location>
<code_context>
+ const info = metaResp.data && metaResp.data.info ? metaResp.data.info : undefined;
+ const version = info?.version || '';
+ const summary = info?.summary || '';
+ list.push({
+ name,
+ version,
+ alwaysShow: true,
+ label: name,
+ description: `${version}`,
+ detail: summary,
+ });
+ } catch (err) {
+ // 如果单个包的元数据请求出错,则退回到最小信息,避免整个搜索失败
</code_context>
<issue_to_address>
**suggestion:** Fallback to minimal info on metadata fetch failure is good, but error handling could be more granular.
Please handle cancellation errors separately from other errors to avoid adding incomplete items to the list when a request is cancelled.
</issue_to_address>
### Comment 5
<location> `src/modules/PackageManager.ts:328-333` </location>
<code_context>
} else if (indexResp.data && typeof indexResp.data === 'object') {
// 回退处理:尝试从 entries 字段中提取 name
if (Array.isArray(indexResp.data.entries)) {
projects = indexResp.data.entries.map((p: any) => (p && p.name) || '').filter(Boolean);
}
}
</code_context>
<issue_to_address>
**suggestion (code-quality):** Merge nested if conditions ([`merge-nested-ifs`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/TypeScript/Default-Rules/merge-nested-ifs))
```suggestion
} else if (indexResp.data && typeof indexResp.data === 'object' && Array.isArray(indexResp.data.entries)) {
projects = indexResp.data.entries.map((p: any) => (p && p.name) || '').filter(Boolean);
}
```
<br/><details><summary>Explanation</summary>Reading deeply nested conditional code is confusing, since you have to keep track of which
conditions relate to which levels. We therefore strive to reduce nesting where
possible, and the situation where two `if` conditions can be combined using
`and` is an easy win.
</details>
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
src/modules/PackageManager.ts
Outdated
| // indexResp.data 可能是 { projects: [...] } 或数组 - 两种格式都做兼容处理 | ||
| let projects: string[] = []; | ||
| try { | ||
| if (Array.isArray(indexResp.data)) { | ||
| // 有些服务器可能直接返回包名字符串数组或对象数组 | ||
| projects = indexResp.data.map((p: any) => (typeof p === 'string' ? p : p.name)).filter(Boolean); | ||
| } else if (Array.isArray(indexResp.data.projects)) { | ||
| projects = indexResp.data.projects.map((p: any) => (typeof p === 'string' ? p : p.name)).filter(Boolean); | ||
| } else if (indexResp.data && typeof indexResp.data === 'object') { | ||
| // 回退处理:尝试从 entries 字段中提取 name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: The compatibility logic for indexResp.data is broad and may mask unexpected API changes.
Consider adding a warning or log when indexResp.data does not match the expected format to help detect upstream API changes early.
| await Promise.all(pageItems.map(async (name) => { | ||
| // 如果外部触发了取消(CancellationToken),axiosCancelToken.token.reason 会被设置 | ||
| if (axiosCancelToken.token.reason) { return; } | ||
| try { | ||
| const metaResp = await axios.get(`https://pypi.org/pypi/${encodeURIComponent(name)}/json`, { | ||
| cancelToken: axiosCancelToken.token, | ||
| }); | ||
| const info = metaResp.data && metaResp.data.info ? metaResp.data.info : undefined; | ||
| const version = info?.version || ''; | ||
| const summary = info?.summary || ''; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (performance): Unthrottled parallel requests for package metadata may hit rate limits or degrade performance.
Implement a concurrency limit or throttling to reduce the risk of rate limiting and performance issues, particularly if pageSize increases.
Suggested implementation:
import pLimit from 'p-limit';
const pageSize = 20; // 每页大小,保持合理以防并发请求过多
const totalPages = Math.max(1, Math.ceil(matched.length / pageSize));
const currentPage = Math.max(1, Math.min(page, totalPages));
const start = (currentPage - 1) * pageSize;
const pageItems = matched.slice(start, start + pageSize);
// 为当前页的包并发获取元数据(支持取消),限制最大并发数以防止被限流
const list: PackagePickItem[] = [];
const limit = pLimit(5); // 最大并发数可根据实际情况调整
await Promise.all(pageItems.map((name) =>
limit(async () => {
// 如果外部触发了取消(CancellationToken),axiosCancelToken.token.reason 会被设置
if (axiosCancelToken.token.reason) { return; }
try {
const metaResp = await axios.get(`https://pypi.org/pypi/${encodeURIComponent(name)}/json`, {
cancelToken: axiosCancelToken.token,
});
const info = metaResp.data && metaResp.data.info ? metaResp.data.info : undefined;
const version = info?.version || '';
const summary = info?.summary || '';
list.push({
} catch (err) {
// 错误处理逻辑(可选)
}
})
));- You may need to install
p-limitif it's not already present:npm install p-limit. - If your project uses ES modules, ensure the import syntax matches your configuration.
- If you want to customize the concurrency, adjust the value passed to
pLimit.
| const list: PackagePickItem[] = []; | ||
| await Promise.all(pageItems.map(async (name) => { | ||
| // 如果外部触发了取消(CancellationToken),axiosCancelToken.token.reason 会被设置 | ||
| if (axiosCancelToken.token.reason) { return; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): Checking axiosCancelToken.token.reason for cancellation may not be robust across axios versions.
This approach may break in future axios releases. Use axios's documented cancellation methods, such as catching axios.Cancel, for consistent behavior.
| this.output.appendLine(data); | ||
| errMsg += data; | ||
| } | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Fallback to minimal info on metadata fetch failure is good, but error handling could be more granular.
Please handle cancellation errors separately from other errors to avoid adding incomplete items to the list when a request is cancelled.

加入中文readme,采用pypi_simple的api解决无法搜索的问题,同时优化搜索结果顺序--匹配度优先。
Add Chinese READMEs, use the pypi_simple API to solve the search failure issue, and optimize the order of search results with priority given to matching degree.
Summary by Sourcery
Use PyPI Simple v1 JSON API for more reliable package search with pagination, cancellation, and ordered results; add Chinese documentation.
New Features:
Enhancements:
Documentation: