@hllshiro
Owner

Pull Request Description

Is your feature request related to a problem? Please describe.
A clear and concise description of the problem.

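  • Adds a built-in knowledge base implementation that lets users create, manage, and search knowledge locally, bringing offline RAG (Retrieval-Augmented Generation) capability to DeepChat.
  • Optimizes some Ollama-related logic.
  • Makes FilePresenter's getContent public so it can be used to extract the raw text content of a file.
  • Updates a large amount of i18n-related content.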
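
As a rough illustration of the offline RAG flow described above (embed the query, rank stored chunks by similarity, pass the best matches to the model as context), here is a minimal sketch. Everything in it is hypothetical and is not DeepChat's actual code: the `Chunk` type, `toyEmbed`, `retrieve`, and `buildPrompt` are names invented for this example, and `toyEmbed` is a stand-in for a real local embedding model.

```typescript
// Hypothetical types and helpers for illustration only; not DeepChat's API.
interface Chunk {
  text: string;
  vector: number[];
}

// Toy embedding: counts of each vowel. A real setup would call a local
// embedding model (for example through Ollama) instead.
function toyEmbed(text: string): number[] {
  const letters = ["a", "e", "i", "o", "u"];
  const lower = text.toLowerCase();
  return letters.map((ch) => lower.split(ch).length - 1);
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Rank all chunks against the query and keep the topK best matches.
function retrieve(query: string, chunks: Chunk[], topK: number): Chunk[] {
  const queryVector = toyEmbed(query);
  return chunks
    .map((chunk) => ({ chunk, score: cosine(queryVector, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((scored) => scored.chunk);
}

// Prepend the retrieved chunks to the question as context for the model.
function buildPrompt(question: string, context: Chunk[]): string {
  const contextText = context.map((c, i) => `[${i + 1}] ${c.text}`).join("\n");
  return `Context:\n${contextText}\n\nQuestion: ${question}`;
}
```

In the PR itself the ranking step is delegated to a local vector database (DuckDB with the vss extension) rather than an in-memory sort; the sketch only shows the shape of the data flow.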

UI/UX changes for Desktop Application
If this PR introduces UI/UX changes, please describe them in detail.
[Four screenshots attached in the original PR]

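Platform Compatibility Notes
If this PR has specific platform compatibility considerations (Windows, macOS, Linux), please describe them here.

  • Are there any platform-specific behaviors or code adjustments? Yes: duckdb has platform compatibility issues.
  • Have you tested on all relevant platforms? No, only Windows has been tested.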

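New Dependencies

  • @duckdb/node-api: used as the local vector database.
  • @langchain/core & @langchain/textsplitters: used for text processing and chunking.
  • dayjs: used for date and time formatting.
  • scripts/postinstall.js: adds an install script for the duckdb extension. When dependencies are installed, it copies the vss extension into the runtime directory; at build time the extension is packaged as an extra resource, which avoids downloading it again at runtime.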
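
To make the chunking step concrete, the following is a minimal character-based splitter with overlap. It is a sketch only: it does not use @langchain/textsplitters, and `splitWithOverlap`, `chunkSize`, and `overlap` are hypothetical names chosen for this example rather than identifiers from the PR.

```typescript
// Hypothetical helper: fixed-size character chunks where each chunk repeats
// the last `overlap` characters of the previous one.
function splitWithOverlap(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be in the range [0, chunkSize)");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window has reached the end of the text,
    // so no trailing chunk made up only of overlap is emitted.
    if (start + chunkSize >= text.length) {
      break;
    }
  }
  return chunks;
}
```

Overlap trades storage for recall: a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.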

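Known Issues

  • The logic for the "rerank model" is not implemented yet, so the option is hidden for now.
  • Only one text splitting method is available; more will be added later.
  • If the application is closed while an uploaded file is still being processed, the task cannot be resumed after restart; this will be improved later.
  • The current similarity algorithm may not suit every embedding model, which can produce similarity values far greater than 1; this will be improved later.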
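
The PR does not say which similarity metric is used, so the following is only one possible explanation for scores above 1, offered as an assumption: a raw inner product over embeddings that are not unit length is unbounded, whereas cosine similarity (or an inner product over L2-normalized vectors) stays within [-1, 1]. The helper names below are hypothetical.

```typescript
// Raw inner product: unbounded when the vectors are not unit length.
function innerProduct(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += (a[i] ?? 0) * (b[i] ?? 0);
  }
  return sum;
}

// Scale a vector to unit L2 norm; a zero vector is returned unchanged.
function l2Normalize(v: number[]): number[] {
  const norm = Math.sqrt(innerProduct(v, v));
  return norm === 0 ? v.slice() : v.map((x) => x / norm);
}

// Cosine similarity expressed as an inner product of normalized vectors,
// which keeps the result within [-1, 1] up to floating-point rounding.
function cosineSimilarity(a: number[], b: number[]): number {
  return innerProduct(l2Normalize(a), l2Normalize(b));
}
```

For example, `[3, 4]` and `[6, 8]` point in the same direction: their inner product is 50, while their cosine similarity is 1. If a model returns non-normalized embeddings, normalizing before storage or switching to a cosine metric would keep scores comparable across models.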

Summary by CodeRabbit

  • New Features

    • Introduced a built-in knowledge base system for local storage, retrieval, and management of knowledge files.
    • Added a user interface for configuring, uploading, searching, and managing built-in knowledge bases and files.
    • Enabled similarity search within uploaded knowledge files using local embedding models and vector database.
    • Provided auto-detection of embedding dimensions and normalization options.
    • Integrated support for multiple languages in the user interface and settings.
  • Enhancements

    • Added advanced configuration options for chunking, overlap, and fragment numbers in knowledge base settings.
    • Improved file status tracking and real-time updates during upload and processing.
    • Centralized MIME type icon mapping for file display consistency.
  • Bug Fixes

    • Corrected filtering logic in model selection and improved error handling for file operations.
  • Documentation

    • Added comprehensive documentation for the built-in knowledge base architecture, design, and workflows.
  • Localization

    • Extended localization support for built-in knowledge base features across multiple languages.
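
The summary above mentions auto-detection of embedding dimensions and normalization. One plausible way to do this, sketched here under the assumption that a single probe embedding is requested from the model, is to read the vector length as the dimension count and check whether its L2 norm is close to 1. `inspectEmbedding`, `EmbeddingInfo`, and the tolerance value are hypothetical and are not taken from the PR.

```typescript
// Hypothetical result type for the probe.
interface EmbeddingInfo {
  dimensions: number;
  normalized: boolean;
}

// Inspect one embedding vector returned by a model. `tolerance` is an
// assumed threshold for treating the vector as unit length.
function inspectEmbedding(vector: number[], tolerance: number = 1e-3): EmbeddingInfo {
  if (vector.length === 0) {
    throw new Error("embedding model returned an empty vector");
  }
  const norm = Math.sqrt(vector.reduce((sum, x) => sum + x * x, 0));
  return {
    dimensions: vector.length,
    normalized: Math.abs(norm - 1) <= tolerance,
  };
}
```

In practice the input vector would come from one call to the configured embedding model with a short probe string; the detected dimension can then be used to size the vector column, and the normalization flag to decide whether vectors need normalizing before storage.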

hllshiro and others added 30 commits June 18, 2025 15:46
…hance BuiltinKnowledgeSettings with URL query parameter handling
…rs and loading logic for better user experience
… model listing with additional configuration properties
hllshiro added 28 commits July 11, 2025 09:06
…-knowledge

# Conflicts:
#	package.json
#	src/main/index.ts
#	src/main/presenter/index.ts
#	src/main/presenter/llmProviderPresenter/index.ts
#	src/renderer/src/events.ts
…eferences and enhancing model info structure
@hllshiro hllshiro closed this Jul 22, 2025
@hllshiro hllshiro deleted the pre-merge-feat-builtin-knowledge branch July 25, 2025 02:10
3 participants