fix: add an independent timeout guard for knowledge base rerank #1259
Open
GggggitHub wants to merge 1 commit into
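Fixes the knowledge_search tool blocking until the whole tool-call round times out when the remote rerank service responds too slowly.
Root cause: knowledge_search reuses the outer tool-execution context for the rerank call. After embedding and retrieval have consumed part of the time, a slow rerank uses up the remaining budget, which eventually triggers context deadline exceeded and makes the subsequent LLM fallback fail immediately as well.
Changes:
- internal/agent/tools/knowledge_search.go: add a 25s child context to rerankWithModel to bound the duration of a single upstream rerank call.
- After a rerank timeout the existing degradation path is kept: the outer logic logs the error and continues with the fallback or returns the retrieval results, so the 60s total tool timeout is not used up.
Verification:
- go test ./internal/agent ./internal/agent/tools ./internal/models/rerank
- WeKnora-app -> http://10.67.1.74:18080/v1/rerank: the 0.6B rerank test returns normally, in about 0.69s for short documents and about 2.68s for long documents.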
Author
Where are the maintainers? How do I get my PR merged?
Collaborator
Langfuse shows embedding taking 5s+ and rerank taking 50s+? That is abnormal in itself. Embedding usually takes about 200ms and rerank about 800ms.
Pull Request
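Description
Fixes the knowledge_search tool blocking until the whole tool-call round times out when the remote rerank service responds too slowly (53s).
Root cause: knowledge_search reuses the outer tool-execution context for the rerank call. After embedding and retrieval have consumed part of the time, a slow rerank uses up the remaining budget, which eventually triggers context deadline exceeded and makes the subsequent LLM fallback fail immediately as well.
Changes:
internal/agent/tools/knowledge_search.go: add a 25s child context to rerankWithModel to bound the duration of a single upstream rerank call.
Type of Change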
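Scope
Testing
Test Steps
Run the related backend tests:
go test ./internal/agent ./internal/agent/tools ./internal/models/rerank
From the WeKnora-app container, access the remote rerank service and confirm that the endpoint returns normally:
curl http://10.67.1.74:18080/v1/rerank
Measure the actual response time against the local 0.6B rerank service: about 0.69s for short documents and about 2.68s for long documents, well short of the 60s total tool timeout.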
Checklist
Related Issue
Fixes #
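Screenshots/Recordings
None. This PR only touches backend rerank call timeout control and makes no frontend UI changes.
Database Migration
Configuration Changes
None. This PR only adds independent timeout control for the rerank call and introduces no new configuration options.
Deployment Notes
A normal backend deployment is sufficient. This change requires no additional database migration or configuration changes.
Additional Information
This fix does not change the ranking logic when rerank succeeds. It only releases the call earlier when the remote rerank service responds too slowly, so that a slow rerank cannot drag down the whole knowledge_search tool execution.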