Refactor: text2germlin with PCgraph framework by fantasy-lotus · Pull Request #50 · hugegraph/hugegraph-ai

fantasy-lotus · 2025-09-27T09:46:02Z

Summary by CodeRabbit

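New Features
- The text-to-Gremlin flow is now wired into the scheduler; the UI can return the template/raw query and execution results on demand
- New admin login configuration options (enable switch, user/admin tokens)
- RAG/GraphRAG requests and configuration gain fields for rerank method, near-neighbor-first, and custom priority; new Gremlin generation request and output types
Improvements
- text2gremlin returns 400 for an empty query
- Middleware adds an X-Process-Time response header; long durations are shown in seconds
- Keyword extraction filters out single-character noise
- Expanded graph and LLM configuration options and defaults
Style/Chores
- Unified formatting and log output, improving code readability (no behavior change)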

coderabbitai · 2025-09-27T09:46:08Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

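Alongside extensive formatting changes to the existing RAG/demo code, this change introduces a new "Text2Gremlin" flow: it adds the flow and its nodes, extends the state model, hooks into the scheduler, and uses the flow in the API and the Gradio UI. It also extends several configuration and request-model fields. The few behavioral changes include input validation, returning execution results, and keyword-extraction filtering.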

Changes

Cohort / File(s)	Summary
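Text2Gremlin flow and scheduler integration `.../flows/text2gremlin.py`, `.../flows/scheduler.py`, `.../api/rag_api.py`, `.../state/ai_state.py`	Adds Text2GremlinFlow (prepare/build/post-process), registers the text2gremlin pipeline in SchedulerSingleton, and resets state before the reuse path; the API adds a 400 check for empty queries and executes through the scheduler; WkFlowInput/State gain fields for the query, example count, output selection, and execution results.
Nodes required by Text2Gremlin `.../nodes/index_node/gremlin_example_index_query.py`, `.../nodes/llm_node/text2gremlin.py`, `.../nodes/hugegraph_node/gremlin_execute.py`, `.../nodes/hugegraph_node/schema.py`	Adds an example-index query node, a text-to-Gremlin generation node, and a Gremlin execution node; SchemaNode replaces print with logging. Nodes execute selectively and fill in results according to the requested output keys.
Demo UI integration with Text2Gremlin `.../demo/rag_demo/text2gremlin_block.py`, `.../demo/rag_demo/vector_graph_block.py`, `.../demo/rag_demo/rag_block.py`, `.../demo/rag_demo/*.py`	The UI dispatches Text2Gremlin through SchedulerSingleton and adds a selective output aggregation method and error handling; the rest is mostly style/line-wrap adjustments and minor logging changes.
Request/model extensions `.../api/models/rag_requests.py`	Adds rerank_method, near_neighbor_first, and custom_priority_info to RAGRequest and GraphRAGRequest; adds the GremlinOutputType enum and GremlinGenerateRequest; some field/decorator formatting and default-value string style adjustments.
Configuration extensions `.../config/llm_config.py`, `.../config/hugegraph_config.py`, `.../config/admin_config.py`, `.../config/generate.py`, `.../config/models/*.py`, `.../config/prompt_config.py`	LLMConfig adds multi-type and Base URL fields; HugeGraphConfig adds connection/query/vector/rerank attributes; adds AdminConfig (enable_login/user_token/admin_token); the rest is formatting and string-style unification.
Middleware and decorators `.../middleware/middleware.py`, `.../utils/decorators.py`	When the elapsed time exceeds 1000 ms, the middleware switches to seconds and writes the value to the X-Process-Time response header; the with_task_id decorator generates a short task_id per call, logs it, and tries to attach it to a returned closure.
Graph/Vector indices and operations `.../indices/graph_index.py`, `.../indices/vector_index.py`, `.../operators/index_op/`, `.../utils/index_utils.py`, `.../flows/get_graph_index_info.py`	GraphIndex fixes the missing self in init; the rest is call-format and log-format adjustments, a few local imports, and minor log trimming, with no behavior change.
LLM/Embedding/Reranker implementations `.../models/llms/`, `.../models/embeddings/`, `.../models/rerankers/`, `.../operators/llm_op/`	Mainly signature/line-wrap/log-format unification; Cohere/SiliconFlow import Constants locally; KeywordExtract excludes keywords of length ≤ 1 (a functional difference).
API and exceptions `.../api/admin_api.py`, `.../api/exceptions/rag_exceptions.py`	Construction/line-wrap formatting only, no logic changes.
HugeGraph Python Client `hugegraph-python-client/src/pyhugegraph/*/`	Mostly single-line method signatures, decorator/partial call wrapping adjustments, and trimmed example scripts; a few methods add an explicit None return; no core behavior change.
Tests `.../src/tests/*/`, `hugegraph-python-client/src/tests/*/`	Only blank-line/string-quote/call-wrapping formatting, no test logic changes.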

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor U as Frontend/Gradio UI
  participant API as Backend API (text2gremlin_api)
  participant S as SchedulerSingleton
  participant F as Text2GremlinFlow
  participant N1 as SchemaNode
  participant N2 as GremlinExampleIndexQueryNode
  participant N3 as Text2GremlinNode
  participant N4 as GremlinExecuteNode
  participant LLM as LLM
  participant HG as HugeGraph

  U->>API: POST /text2gremlin (query, example_num, prompt, outputs)
  alt Empty query
    API-->>U: 400 Bad Request
  else Normal
    API->>S: schedule_flow("text2gremlin", params)
    S->>F: Build and reset state
    F->>N1: Load/normalize schema
    N1-->>F: schema
    F->>N2: Example matching (query, example_num)
    N2-->>F: match_result
    F->>N3: Generate template/raw Gremlin
    N3->>LLM: Text-to-Gremlin prompt
    LLM-->>N3: Generated result
    N3-->>F: Template/raw Gremlin
    F->>N4: Execute Gremlin per requested keys
    N4->>HG: Execute template/raw Gremlin (with default limit)
    HG-->>N4: Execution result/error
    N4-->>F: Write execution result
    F-->>S: Aggregate output
    S-->>API: Return result dict
    API-->>U: 200 with match/template/raw/execution results
  end

sequenceDiagram
  autonumber
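  actor U as Frontend/Gradio UI
  participant API as Backend API
  participant S as Scheduler
  note over API,S: Error path (scheduling/execution exception)
  U->>API: Request
  API->>S: Schedule execution
  S--xAPI: Raise exception (log error)
  API-->>U: 500 Internal Server Error / UI catches and shows a message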
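
The three outcomes in the diagrams above (400 for an empty query, 200 on success, 500 when scheduling or execution fails) can be sketched in plain Python. This is only an illustration: `handle_text2gremlin`, `fake_flow`, `failing_flow`, and the result keys are hypothetical names, not the PR's actual API handler or scheduler.

```python
from typing import Any, Callable, Dict, Tuple


def handle_text2gremlin(
    query: str, run_flow: Callable[[str], Dict[str, Any]]
) -> Tuple[int, Dict[str, Any]]:
    """Return (status_code, body) for a text-to-Gremlin request (hypothetical helper)."""
    # Empty or whitespace-only queries are rejected before anything is scheduled.
    if not query or not query.strip():
        return 400, {"detail": "query must not be empty"}
    try:
        # run_flow stands in for the scheduler call shown in the diagram.
        return 200, run_flow(query.strip())
    except Exception as exc:  # any scheduling/execution failure maps to 500
        return 500, {"detail": str(exc)}


def fake_flow(query: str) -> Dict[str, Any]:
    # Stand-in for a successful pipeline run; the keys are invented.
    return {"query": query, "raw_gremlin": "g.V().limit(10)"}


def failing_flow(query: str) -> Dict[str, Any]:
    # Stand-in for a pipeline that raises during execution.
    raise RuntimeError("scheduler failed")


empty_status, _ = handle_text2gremlin("   ", fake_flow)
ok_status, ok_body = handle_text2gremlin("list vertices", fake_flow)
err_status, err_body = handle_text2gremlin("list vertices", failing_flow)
```

Passing the flow in as a callable keeps the sketch independent of the real scheduler and makes each branch easy to exercise in isolation.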

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

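Refactor: Refactor hugegraph-ai to using CGraph & port some usecase in web demo #49: also introduces/modifies the scheduler and flows/nodes (including text2gremlin, the execute node, and state), and is directly related at the code level to the Text2Gremlin flow integrated here.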

Poem

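A little rabbit hops across the keys,
Text to Gremlin lights a new path with ease.
One call to the scheduler, the pipeline runs,
Template and raw both report when done.
A breeze stirs the leaves, the logs all smile,
Config grows new branches, steadier in style.
La la, the query starts, traced across the graph. 🐰✨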

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
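Title Check	⚠️ Warning	The title misspells “text2gremlin” as “text2germlin”, so it does not accurately describe the main change.	Please correct the title by changing “text2germlin” to “text2gremlin”, for example “Refactor: text2gremlin with PCgraph framework”, so that it accurately reflects the main change.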
Docstring Coverage	⚠️ Warning	Docstring coverage is 20.86% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

github-actions · 2025-09-27T09:46:11Z

@codecov-ai-reviewer review

Copilot

Pull Request Overview

This PR refactors the text2gremlin functionality to use the PCgraph framework, migrating from a direct implementation to a node-based pipeline architecture. The refactor introduces standardized workflow management and improves modularity by breaking down text2gremlin processing into discrete, reusable nodes.

Key changes:

Introduces new node classes for text2gremlin pipeline components (Text2GremlinNode, GremlinExampleIndexQueryNode, GremlinExecuteNode)
Creates a Text2GremlinFlow class that orchestrates the entire text2gremlin workflow
Updates the scheduler to support the new text2gremlin flow
Modifies existing API and demo code to use the new pipeline-based approach

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

File	Description
ai_state.py	Adds Text2Gremlin-specific fields to workflow input and state classes
text2gremlin.py	New node class for LLM-based Gremlin query generation
gremlin_example_index_query.py	New node class for querying Gremlin examples from index
schema.py	Replaces debug print with proper logging
gremlin_execute.py	New node class for executing generated Gremlin queries
text2gremlin.py (flow)	New flow class orchestrating the complete text2gremlin pipeline
scheduler.py	Integrates text2gremlin flow and adds state reset functionality
text2gremlin_block.py	Updates demo code to use new scheduler-based approach
rag_api.py	Updates API endpoint to use new scheduler-based approach

hugegraph-llm/src/hugegraph_llm/flows/text2gremlin.py

hugegraph-llm/src/hugegraph_llm/nodes/hugegraph_node/gremlin_execute.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

fantasy-lotus · 2025-09-27T10:56:23Z

@codecov-ai-reviewer review
@codecov-ai-reviewer tests
@coderabbitai full review

coderabbitai · 2025-09-27T10:56:46Z

@fantasy-lotus I'll perform a comprehensive full review of all changes in this PR that refactors text2gremlin with the PCgraph framework.

✅ Actions performed

Full review triggered.

coderabbitai

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)

hugegraph-llm/src/hugegraph_llm/operators/llm_op/keyword_extract.py (1)
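91-94: Please restore support for single-character keywords

The newly added len(k.strip()) > 1 filter discards every single-character keyword outright. In Chinese text (e.g. “图”, “数”) and for English keywords that are single-letter abbreviations (e.g. “C”, “R”), such output is legitimate and common, so the filter causes extraction results to go missing. Consider filtering out only invalid results such as empty strings.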
-                if len(k.strip()) > 1
+                if k.strip()
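
To make the difference between the two conditions concrete, here is a minimal sketch. `filter_keywords` and the sample list are hypothetical and are not taken from keyword_extract.py; only the two filter conditions mirror the diff above.

```python
from typing import List


def filter_keywords(keywords: List[str], min_len: int = 1) -> List[str]:
    """Keep keywords whose stripped length is at least min_len (hypothetical helper)."""
    return [k.strip() for k in keywords if len(k.strip()) >= min_len]


sample = ["图", "C", " ", "", "graph"]

# min_len=2 mirrors `len(k.strip()) > 1`: single characters are dropped.
strict = filter_keywords(sample, min_len=2)

# min_len=1 mirrors `if k.strip()`: only empty/whitespace entries are dropped.
lenient = filter_keywords(sample, min_len=1)
```

With the stricter condition only "graph" survives, while the lenient one also keeps "图" and "C".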
hugegraph-llm/src/hugegraph_llm/middleware/middleware.py (1)
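37-45: Fix the mismatch between the logged unit and value

When process_time > 1000, the variable has already been converted to seconds, but the log message still hard-codes “ms”, which can mislead operators about the actual latency. Keep the unit in the log output consistent with the value.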
-        process_time = (time.perf_counter() - start_time) * 1000  # ms
-        unit = "ms"
-        if process_time > 1000:
-            process_time /= 1000
-            unit = "s"
+        process_time_ms = (time.perf_counter() - start_time) * 1000
+        process_time = process_time_ms
+        unit = "ms"
+        if process_time_ms > 1000:
+            process_time = process_time_ms / 1000
+            unit = "s"
 
         response.headers["X-Process-Time"] = f"{process_time:.2f} {unit}"
-        log.info("Request process time: %.2f ms, code=%d", process_time, response.status_code)
+        log.info("Request process time: %.2f %s, code=%d", process_time, unit, response.status_code)
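
The unit-selection logic in the suggestion can be isolated into a small pure function, which makes it easy to check that the value and the unit always travel together. `format_process_time` is a hypothetical helper name used for illustration; it is not part of the middleware in this PR.

```python
from typing import Tuple


def format_process_time(elapsed_seconds: float) -> Tuple[float, str]:
    """Return (value, unit): milliseconds up to 1000 ms, seconds above that."""
    elapsed_ms = elapsed_seconds * 1000
    if elapsed_ms > 1000:
        return elapsed_ms / 1000, "s"
    return elapsed_ms, "ms"


# The same pair feeds both the header and the log line, so they cannot disagree.
slow_value, slow_unit = format_process_time(2.5)
slow_header = f"{slow_value:.2f} {slow_unit}"

fast_value, fast_unit = format_process_time(0.25)
fast_header = f"{fast_value:.2f} {fast_unit}"
```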
hugegraph-llm/src/hugegraph_llm/nodes/hugegraph_node/schema.py (1)

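18-18: Fix the PyCGraph.CStatus import error

Pylint in GitHub Actions reports E0611: No name 'CStatus' in module 'PyCGraph', and the pipeline is currently failing because of it. Please confirm the symbol names PyCGraph actually exposes (or add type stubs / an ignore configuration), and adjust the import here or the dependency version so that both lint and runtime load correctly.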
hugegraph-llm/src/hugegraph_llm/indices/vector_index.py (1)
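138-147: Fix false hits when Faiss returns -1

When Faiss cannot find enough neighbors it returns index -1. The current implementation treats it as a valid index and takes the last element of the list, which contaminates the search results.

Consider skipping the placeholder index as follows: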
-        for dist, i in zip(distances[0], indices[0]):
-            if dist < dis_threshold:  # Smaller distances indicate higher similarity
+        for dist, i in zip(distances[0], indices[0]):
+            if i == -1:
+                log.debug("Faiss returned placeholder index -1, skip this candidate.")
+                continue
+            if dist < dis_threshold:  # Smaller distances indicate higher similarity
                 results.append(deepcopy(self.properties[i]))
                 log.debug("[✓] Add valid distance %s to results.", dist)
             else:
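
The reason -1 is dangerous is that Python resolves `properties[-1]` to the last element instead of raising an error. The sketch below reproduces the guarded loop with plain lists and no Faiss dependency; `collect_hits` and all sample values are hypothetical, and the distance paired with -1 is invented purely to exercise the branch (the real value depends on the metric).

```python
from copy import deepcopy
from typing import Any, List


def collect_hits(
    distances: List[float],
    indices: List[int],
    properties: List[Any],
    dis_threshold: float,
    skip_placeholder: bool = True,
) -> List[Any]:
    """Collect properties whose distance is below the threshold (hypothetical helper)."""
    results = []
    for dist, i in zip(distances, indices):
        if skip_placeholder and i == -1:
            # -1 marks "no neighbor found"; indexing with it would return the last item.
            continue
        if dist < dis_threshold:  # smaller distance means higher similarity
            results.append(deepcopy(properties[i]))
    return results


properties = ["alice", "bob", "carol"]
distances = [0.1, 0.3]
indices = [1, -1]

guarded = collect_hits(distances, indices, properties, dis_threshold=0.5)
unguarded = collect_hits(
    distances, indices, properties, dis_threshold=0.5, skip_placeholder=False
)
```

Without the guard, "carol" leaks into the results even though no second neighbor was found.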
hugegraph-llm/src/hugegraph_llm/state/ai_state.py (1)
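16-16: Also handle the pylint false positive on the PyCGraph import

This line likewise triggers E0611: No name 'GParam' in module 'PyCGraph' and E0611: No name 'CStatus' in module 'PyCGraph', and CI is already failing. Consider adding an ignore marker on the import line, as done elsewhere, so the pipeline can pass, for example:
-from PyCGraph import GParam, CStatus
+from PyCGraph import GParam, CStatus  # pylint: disable=no-name-in-module
Please fix this as well.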

🧹 Nitpick comments (4)

hugegraph-python-client/src/pyhugegraph/api/schema.py (2)
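85-89: Include the requested label name in the log
The current error log prints str(response), but when nothing is found response is usually None, so it carries little information and forces us to keep # pylint: disable=unused-argument. Consider logging the requested name directly: this gives accurate troubleshooting information and lets us drop the pylint ignore.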
-    def getVertexLabel(self, name) -> Optional[VertexLabelData]:  # pylint: disable=unused-argument
+    def getVertexLabel(self, name) -> Optional[VertexLabelData]:
         if response := self._invoke_request():
             return VertexLabelData(response)
-        log.error("VertexLabel not found: %s", str(response))
+        log.error("VertexLabel not found: %s", name)
         return None
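127-131: The error log should refer to the index name
On the failure path the log currently prints str(response), which is usually None; that makes it hard to locate the problem quickly and forces us to keep # pylint: disable=unused-argument. Consider logging the requested name directly and removing the pylint ignore.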
-    def getIndexLabel(self, name) -> Optional[IndexLabelData]:  # pylint: disable=unused-argument
+    def getIndexLabel(self, name) -> Optional[IndexLabelData]:
         if response := self._invoke_request():
             return IndexLabelData(response)
-        log.error("IndexLabel not found: %s", str(response))
+        log.error("IndexLabel not found: %s", name)
         return None
hugegraph-llm/src/hugegraph_llm/operators/llm_op/schema_build.py (1)

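88-123: Unify the type contract of query_examples

The type annotation of build_prompt() is still List[Dict[str, str]], and the docstring states that each example must contain “description + Gremlin”, yet the validation and error message newly added to run() assume a list[str] is passed in. This leaves callers and type checkers unsure which structure to pass; if dicts continue to be passed later in the PCGraph flow, they will be misreported as an input format error. Consider settling on one contract soon: either switch the prompt construction logic entirely to accept list[str] and update the type annotation, documentation, and element validation accordingly; or (more in line with the “description + Gremlin” semantics) validate here that each element is a dict containing the required fields, and change the error message to reflect the structure actually expected.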
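
As an illustration of the second option (dict elements), a validator could look roughly like the sketch below. The key names `description` and `gremlin` are assumptions derived from the “description + Gremlin” wording; the actual field names in schema_build.py are not shown in this review, and `validate_query_examples` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Assumed field names; the real ones may differ.
REQUIRED_KEYS = ("description", "gremlin")


def validate_query_examples(examples: Any) -> List[Dict[str, str]]:
    """Check that examples is a list of dicts carrying the required string fields."""
    if not isinstance(examples, list):
        raise ValueError(
            f"query_examples must be a list, got {type(examples).__name__}"
        )
    for idx, item in enumerate(examples):
        if not isinstance(item, dict):
            raise ValueError(
                f"query_examples[{idx}] must be a dict, got {type(item).__name__}"
            )
        missing = [k for k in REQUIRED_KEYS if not isinstance(item.get(k), str)]
        if missing:
            raise ValueError(
                f"query_examples[{idx}] is missing string fields: {missing}"
            )
    return examples


valid = validate_query_examples(
    [{"description": "count all vertices", "gremlin": "g.V().count()"}]
)

try:
    validate_query_examples(["g.V().count()"])
    rejected_plain_strings = False
except ValueError:
    rejected_plain_strings = True
```

The error messages name the structure that is actually expected, which is the point the comment above raises.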
hugegraph-llm/src/hugegraph_llm/utils/decorators.py (1)
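117-123: The task ID is not written back to the context

The current logic only writes task_id when the return value has a truthy __closure__, but decorated functions usually return a dict or an ordinary object, so this branch almost never fires and downstream code cannot read the task ID we just generated. Consider writing it directly on common return types (e.g. dict) while keeping compatibility with objects that accept attribute assignment.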
         # Store the original return value
         result = func(*args, **kwargs)
         # Add the task_id to the function's context
-        if hasattr(result, "__closure__") and result.__closure__:
-            # If it's a closure, we can add the task_id to its context
-            setattr(result, "task_id", task_id)
+        if isinstance(result, dict):
+            result.setdefault("task_id", task_id)
+        elif hasattr(result, "__dict__"):
+            setattr(result, "task_id", task_id)
+        elif hasattr(result, "__closure__") and result.__closure__:
+            setattr(result, "task_id", task_id)
         return result
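
For context, a self-contained decorator following the suggested branches might look like the sketch below. The 8-character `uuid4` ID is an assumption made for illustration (the review only says a short task_id is generated), and `run_job` and `Report` are hypothetical.

```python
import functools
import uuid


def with_task_id(func):
    """Attach a short task_id to dict results or attribute-bearing objects (sketch)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Assumed ID scheme: first 8 hex characters of a random UUID.
        task_id = uuid.uuid4().hex[:8]
        result = func(*args, **kwargs)
        if isinstance(result, dict):
            # setdefault keeps any task_id the function already set itself.
            result.setdefault("task_id", task_id)
        elif hasattr(result, "__dict__"):
            setattr(result, "task_id", task_id)
        return result

    return wrapper


class Report:
    """Plain object that accepts attribute assignment."""


@with_task_id
def run_job():
    return {"status": "ok"}


@with_task_id
def build_report():
    return Report()


job_result = run_job()
report = build_report()
```

Return values that are neither dicts nor attribute-bearing objects (for example a plain string) pass through unchanged in this sketch.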

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 41aeae5 and 63d3a98.

📒 Files selected for processing (94)

hugegraph-llm/src/hugegraph_llm/api/admin_api.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/api/exceptions/rag_exceptions.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/api/models/rag_requests.py (4 hunks)
hugegraph-llm/src/hugegraph_llm/api/rag_api.py (8 hunks)
hugegraph-llm/src/hugegraph_llm/config/admin_config.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/config/generate.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/config/hugegraph_config.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/config/llm_config.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/config/models/base_config.py (3 hunks)
hugegraph-llm/src/hugegraph_llm/config/models/base_prompt_config.py (4 hunks)
hugegraph-llm/src/hugegraph_llm/config/prompt_config.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/demo/rag_demo/admin_block.py (7 hunks)
hugegraph-llm/src/hugegraph_llm/demo/rag_demo/app.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/demo/rag_demo/configs_block.py (17 hunks)
hugegraph-llm/src/hugegraph_llm/demo/rag_demo/other_block.py (2 hunks)
hugegraph-llm/src/hugegraph_llm/demo/rag_demo/rag_block.py (11 hunks)
hugegraph-llm/src/hugegraph_llm/demo/rag_demo/text2gremlin_block.py (10 hunks)
hugegraph-llm/src/hugegraph_llm/demo/rag_demo/vector_graph_block.py (10 hunks)
hugegraph-llm/src/hugegraph_llm/document/chunk_split.py (2 hunks)
hugegraph-llm/src/hugegraph_llm/flows/get_graph_index_info.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/flows/graph_extract.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/flows/import_graph_data.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/flows/prompt_generate.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/flows/scheduler.py (3 hunks)
hugegraph-llm/src/hugegraph_llm/flows/text2gremlin.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/indices/graph_index.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/indices/vector_index.py (6 hunks)
hugegraph-llm/src/hugegraph_llm/middleware/middleware.py (2 hunks)
hugegraph-llm/src/hugegraph_llm/models/embeddings/base.py (5 hunks)
hugegraph-llm/src/hugegraph_llm/models/embeddings/openai.py (4 hunks)
hugegraph-llm/src/hugegraph_llm/models/llms/base.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/models/llms/init_llm.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/models/llms/litellm.py (2 hunks)
hugegraph-llm/src/hugegraph_llm/models/llms/ollama.py (6 hunks)
hugegraph-llm/src/hugegraph_llm/models/llms/openai.py (2 hunks)
hugegraph-llm/src/hugegraph_llm/models/rerankers/cohere.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/models/rerankers/init_reranker.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/models/rerankers/siliconflow.py (2 hunks)
hugegraph-llm/src/hugegraph_llm/nodes/hugegraph_node/gremlin_execute.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/nodes/hugegraph_node/schema.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/nodes/index_node/gremlin_example_index_query.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/nodes/llm_node/schema_build.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/nodes/llm_node/text2gremlin.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/operators/common_op/check_schema.py (4 hunks)
hugegraph-llm/src/hugegraph_llm/operators/common_op/merge_dedup_rerank.py (2 hunks)
hugegraph-llm/src/hugegraph_llm/operators/document_op/word_extract.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/operators/graph_rag_task.py (2 hunks)
hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/commit_to_hugegraph.py (10 hunks)
hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/graph_rag_query.py (10 hunks)
hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/schema_manager.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/operators/index_op/build_gremlin_example_index.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/operators/index_op/build_semantic_index.py (2 hunks)
hugegraph-llm/src/hugegraph_llm/operators/index_op/build_vector_index.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/operators/index_op/gremlin_example_index_query.py (2 hunks)
hugegraph-llm/src/hugegraph_llm/operators/index_op/semantic_id_query.py (4 hunks)
hugegraph-llm/src/hugegraph_llm/operators/index_op/vector_index_query.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/operators/kg_construction_task.py (2 hunks)
hugegraph-llm/src/hugegraph_llm/operators/llm_op/answer_synthesize.py (5 hunks)
hugegraph-llm/src/hugegraph_llm/operators/llm_op/disambiguate_data.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/operators/llm_op/gremlin_generate.py (3 hunks)
hugegraph-llm/src/hugegraph_llm/operators/llm_op/info_extract.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/operators/llm_op/keyword_extract.py (3 hunks)
hugegraph-llm/src/hugegraph_llm/operators/llm_op/prompt_generate.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/operators/llm_op/property_graph_extract.py (4 hunks)
hugegraph-llm/src/hugegraph_llm/operators/llm_op/schema_build.py (3 hunks)
hugegraph-llm/src/hugegraph_llm/state/ai_state.py (5 hunks)
hugegraph-llm/src/hugegraph_llm/utils/anchor.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/utils/decorators.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/utils/embedding_utils.py (2 hunks)
hugegraph-llm/src/hugegraph_llm/utils/graph_index_utils.py (7 hunks)
hugegraph-llm/src/hugegraph_llm/utils/hugegraph_utils.py (4 hunks)
hugegraph-llm/src/hugegraph_llm/utils/log.py (1 hunks)
hugegraph-llm/src/hugegraph_llm/utils/vector_index_utils.py (3 hunks)
hugegraph-llm/src/tests/config/test_config.py (1 hunks)
hugegraph-llm/src/tests/models/embeddings/test_openai_embedding.py (1 hunks)
hugegraph-llm/src/tests/models/llms/test_ollama_client.py (1 hunks)
hugegraph-llm/src/tests/operators/common_op/test_check_schema.py (2 hunks)
hugegraph-llm/src/tests/operators/common_op/test_nltk_helper.py (1 hunks)
hugegraph-python-client/src/pyhugegraph/api/auth.py (4 hunks)
hugegraph-python-client/src/pyhugegraph/api/graph.py (1 hunks)
hugegraph-python-client/src/pyhugegraph/api/schema.py (3 hunks)
hugegraph-python-client/src/pyhugegraph/api/schema_manage/index_label.py (1 hunks)
hugegraph-python-client/src/pyhugegraph/api/services.py (2 hunks)
hugegraph-python-client/src/pyhugegraph/api/traverser.py (5 hunks)
hugegraph-python-client/src/pyhugegraph/client.py (1 hunks)
hugegraph-python-client/src/pyhugegraph/example/hugegraph_example.py (2 hunks)
hugegraph-python-client/src/pyhugegraph/structure/property_key_data.py (1 hunks)
hugegraph-python-client/src/pyhugegraph/utils/huge_config.py (2 hunks)
hugegraph-python-client/src/pyhugegraph/utils/huge_router.py (2 hunks)
hugegraph-python-client/src/pyhugegraph/utils/log.py (1 hunks)
hugegraph-python-client/src/pyhugegraph/utils/util.py (3 hunks)
hugegraph-python-client/src/tests/api/test_auth.py (2 hunks)
hugegraph-python-client/src/tests/api/test_version.py (1 hunks)
hugegraph-python-client/src/tests/client_utils.py (1 hunks)

🧰 Additional context used

🧠 Learnings (19)

📚 Learning: 2025-09-16T06:40:44.968Z

Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Applies to hugegraph-llm/src/hugegraph_llm/indices/**/*.py : Store vector and graph indexing code under src/hugegraph_llm/indices/

Applied to files:

hugegraph-llm/src/hugegraph_llm/operators/index_op/vector_index_query.py
hugegraph-llm/src/hugegraph_llm/operators/index_op/build_vector_index.py
hugegraph-llm/src/hugegraph_llm/indices/graph_index.py
hugegraph-llm/src/hugegraph_llm/utils/vector_index_utils.py
hugegraph-llm/src/hugegraph_llm/utils/graph_index_utils.py
hugegraph-llm/src/hugegraph_llm/indices/vector_index.py

📚 Learning: 2025-09-16T06:40:44.968Z

Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Applies to hugegraph-llm/src/hugegraph_llm/operators/gremlin_generate_task.py : Maintain the Text2Gremlin pipeline in src/hugegraph_llm/operators/gremlin_generate_task.py

Applied to files:

hugegraph-llm/src/hugegraph_llm/operators/llm_op/gremlin_generate.py
hugegraph-llm/src/hugegraph_llm/flows/scheduler.py
hugegraph-llm/src/hugegraph_llm/api/rag_api.py
hugegraph-llm/src/hugegraph_llm/operators/index_op/build_gremlin_example_index.py
hugegraph-llm/src/hugegraph_llm/operators/index_op/gremlin_example_index_query.py
hugegraph-llm/src/hugegraph_llm/operators/graph_rag_task.py
hugegraph-llm/src/hugegraph_llm/nodes/index_node/gremlin_example_index_query.py
hugegraph-llm/src/hugegraph_llm/demo/rag_demo/text2gremlin_block.py
hugegraph-llm/src/hugegraph_llm/nodes/hugegraph_node/gremlin_execute.py
hugegraph-llm/src/hugegraph_llm/flows/text2gremlin.py
hugegraph-llm/src/hugegraph_llm/nodes/llm_node/text2gremlin.py
hugegraph-llm/src/hugegraph_llm/state/ai_state.py

📚 Learning: 2025-09-16T06:40:44.968Z

Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Applies to hugegraph-llm/src/hugegraph_llm/demo/rag_demo/**/*.py : Implement the Gradio UI application under src/hugegraph_llm/demo/rag_demo/

Applied to files:

hugegraph-llm/src/hugegraph_llm/demo/rag_demo/admin_block.py
hugegraph-llm/src/hugegraph_llm/api/rag_api.py
hugegraph-llm/src/hugegraph_llm/demo/rag_demo/configs_block.py
hugegraph-llm/src/hugegraph_llm/demo/rag_demo/app.py
hugegraph-llm/src/hugegraph_llm/demo/rag_demo/rag_block.py
hugegraph-llm/src/hugegraph_llm/demo/rag_demo/vector_graph_block.py

📚 Learning: 2025-08-18T14:37:41.224Z

Learnt from: Gfreely
PR: hugegraph/hugegraph-ai#32
File: hugegraph-llm/src/hugegraph_llm/operators/document_op/textrank_word_extract.py:139-141
Timestamp: 2025-08-18T14:37:41.224Z
Learning: In hugegraph-llm textrank_word_extract.py, NLTK resources are packaged locally under src/hugegraph_llm/resources/nltk_data/ and the code properly configures nltk.data.path using get_project_root() to point to these local resources before making NLTK calls, preventing LookupError issues.

Applied to files:

hugegraph-llm/src/tests/config/test_config.py

📚 Learning: 2025-09-16T06:40:44.968Z

Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Applies to hugegraph-llm/src/hugegraph_llm/operators/graph_rag_task.py : Maintain the Graph RAG pipeline in src/hugegraph_llm/operators/graph_rag_task.py

Applied to files:

hugegraph-llm/src/hugegraph_llm/api/rag_api.py
hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/graph_rag_query.py
hugegraph-llm/src/hugegraph_llm/operators/graph_rag_task.py
hugegraph-llm/src/hugegraph_llm/demo/rag_demo/vector_graph_block.py

📚 Learning: 2025-09-16T06:40:44.968Z

Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Applies to hugegraph-llm/src/hugegraph_llm/operators/kg_construction_task.py : Maintain the KG Construction pipeline in src/hugegraph_llm/operators/kg_construction_task.py

Applied to files:

hugegraph-llm/src/hugegraph_llm/operators/kg_construction_task.py
hugegraph-llm/src/hugegraph_llm/operators/graph_rag_task.py
hugegraph-llm/src/hugegraph_llm/utils/graph_index_utils.py

📚 Learning: 2025-06-25T09:50:06.213Z

Learnt from: day0n
PR: hugegraph/hugegraph-ai#16
File: hugegraph-llm/src/hugegraph_llm/config/models/base_prompt_config.py:124-137
Timestamp: 2025-06-25T09:50:06.213Z
Learning: Language-specific prompt attributes (answer_prompt_CN, answer_prompt_EN, extract_graph_prompt_CN, extract_graph_prompt_EN, gremlin_generate_prompt_CN, gremlin_generate_prompt_EN, keywords_extract_prompt_CN, keywords_extract_prompt_EN, doc_input_text_CN, doc_input_text_EN) are defined in the PromptConfig class in hugegraph-llm/src/hugegraph_llm/config/prompt_config.py, which inherits from BasePromptConfig, making these attributes accessible in the parent class methods.

Applied to files:

hugegraph-llm/src/hugegraph_llm/operators/llm_op/prompt_generate.py
hugegraph-llm/src/hugegraph_llm/api/models/rag_requests.py
hugegraph-llm/src/hugegraph_llm/demo/rag_demo/text2gremlin_block.py
hugegraph-llm/src/hugegraph_llm/config/prompt_config.py
hugegraph-llm/src/hugegraph_llm/config/models/base_prompt_config.py
hugegraph-llm/src/hugegraph_llm/demo/rag_demo/vector_graph_block.py
hugegraph-llm/src/hugegraph_llm/operators/llm_op/schema_build.py

📚 Learning: 2025-09-16T06:40:44.968Z

Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Applies to hugegraph-llm/src/hugegraph_llm/resources/demo/config_prompt.yaml : Keep prompt configuration in src/hugegraph_llm/resources/demo/config_prompt.yaml

Applied to files:

hugegraph-llm/src/hugegraph_llm/operators/llm_op/prompt_generate.py
hugegraph-llm/src/hugegraph_llm/config/prompt_config.py
hugegraph-llm/src/hugegraph_llm/config/models/base_prompt_config.py
hugegraph-llm/src/hugegraph_llm/operators/llm_op/schema_build.py

📚 Learning: 2025-09-16T06:40:44.968Z

Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Applies to hugegraph-llm/src/hugegraph_llm/config/**/*.py : Keep configuration management code under src/hugegraph_llm/config/

Applied to files:

hugegraph-llm/src/hugegraph_llm/demo/rag_demo/configs_block.py

📚 Learning: 2025-09-16T06:40:44.968Z

Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Applies to hugegraph-llm/src/hugegraph_llm/models/**/*.py : Implement LLM, embedding, and reranker models under src/hugegraph_llm/models/

Applied to files:

hugegraph-llm/src/hugegraph_llm/models/embeddings/base.py
hugegraph-llm/src/hugegraph_llm/models/llms/base.py
hugegraph-llm/src/hugegraph_llm/models/llms/init_llm.py
hugegraph-llm/src/hugegraph_llm/config/llm_config.py

📚 Learning: 2025-08-29T13:11:08.943Z

Learnt from: Gfreely
PR: hugegraph/hugegraph-ai#32
File: hugegraph-llm/src/hugegraph_llm/operators/llm_op/keyword_extract.py:113-154
Timestamp: 2025-08-29T13:11:08.943Z
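Learning: In the KeywordExtract class of hugegraph-llm, user Gfreely wants to keep the extraction methods' interface consistent: all extraction methods (_extract_with_llm, _extract_with_textrank, _extract_with_hybrid) return Dict[str, float], and the main run() method handles sorting and truncation in one place rather than preprocessing inside each extraction method. This is their architectural design preference.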

Applied to files:

hugegraph-llm/src/hugegraph_llm/operators/document_op/word_extract.py
hugegraph-llm/src/hugegraph_llm/operators/llm_op/keyword_extract.py

📚 Learning: 2025-09-16T06:40:44.968Z

Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Applies to hugegraph-llm/**/*.py : Adhere to ruff code style for Python code

Applied to files:

hugegraph-llm/src/hugegraph_llm/utils/hugegraph_utils.py

📚 Learning: 2025-09-16T06:40:44.968Z

Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Applies to hugegraph-llm/src/hugegraph_llm/utils/**/*.py : Place utilities, logging, and decorators under src/hugegraph_llm/utils/

Applied to files:

hugegraph-llm/src/hugegraph_llm/utils/hugegraph_utils.py

📚 Learning: 2025-06-25T09:45:10.751Z

Learnt from: day0n
PR: hugegraph/hugegraph-ai#16
File: hugegraph-llm/src/hugegraph_llm/config/models/base_prompt_config.py:100-116
Timestamp: 2025-06-25T09:45:10.751Z
Learning: In hugegraph-llm BasePromptConfig class, llm_settings is a runtime property that is loaded from config through dependency injection during object initialization, not a static class attribute. Static analysis tools may flag this as missing but it's intentional design.

Applied to files:

hugegraph-llm/src/hugegraph_llm/config/prompt_config.py
hugegraph-llm/src/hugegraph_llm/config/llm_config.py

📚 Learning: 2025-05-27T06:55:13.779Z

Learnt from: cgwer
PR: hugegraph/hugegraph-ai#10
File: hugegraph-python-client/pyproject.toml:0-0
Timestamp: 2025-05-27T06:55:13.779Z
Learning: The hugegraph-python-client is a component within the hugegraph-ai project repository (apache/incubator-hugegraph-ai), not a standalone repository. When reviewing project URLs in pyproject.toml files within this project, they should point to the main hugegraph-ai repository.

Applied to files:

hugegraph-python-client/src/tests/client_utils.py

📚 Learning: 2025-08-18T13:20:30.343Z

Learnt from: Gfreely
PR: hugegraph/hugegraph-ai#32
File: hugegraph-llm/src/hugegraph_llm/operators/llm_op/keyword_extract.py:61-63
Timestamp: 2025-08-18T13:20:30.343Z
Learning: In hugegraph-llm KeywordExtract, NLTKHelper loads both English and Chinese stopwords during initialization, but the stopwords(lang) method still requires the correct language key ("english" or "chinese") to return the appropriate stopword set for filtering.

Applied to files:

hugegraph-llm/src/tests/operators/common_op/test_nltk_helper.py

📚 Learning: 2025-08-18T13:20:30.343Z

Learnt from: Gfreely
PR: hugegraph/hugegraph-ai#32
File: hugegraph-llm/src/hugegraph_llm/operators/llm_op/keyword_extract.py:61-63
Timestamp: 2025-08-18T13:20:30.343Z
Learning: NLTKHelper in hugegraph-llm uses lazy loading for stopwords and calls nltk.corpus.stopwords.words(lang) directly with the provided language parameter. It does not preload both English and Chinese stopwords - each language is loaded on first access. The lang parameter must match NLTK's expected language codes ("english", "chinese") or it will fail.

Applied to files:

hugegraph-llm/src/tests/operators/common_op/test_nltk_helper.py
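
The lazy-loading behavior described in this learning can be sketched roughly as follows. This is a minimal illustration, not the project's NLTKHelper: the class `LazyStopwords` and the injected `loader` callable are hypothetical, and a stub loader stands in for `nltk.corpus.stopwords.words` so the sketch runs without the NLTK corpus installed.

```python
from typing import Callable, Dict, Iterable, List, Set


class LazyStopwords:
    """Hypothetical helper: loads each language's stopwords on first access only."""

    def __init__(self, loader: Callable[[str], Iterable[str]]):
        self._loader = loader
        self._cache: Dict[str, Set[str]] = {}
        self.load_calls: List[str] = []  # records every language the loader was asked for

    def stopwords(self, lang: str) -> Set[str]:
        if lang not in self._cache:
            self.load_calls.append(lang)
            # The loader raises if `lang` is not a key it understands.
            self._cache[lang] = set(self._loader(lang))
        return self._cache[lang]


def stub_loader(lang: str) -> List[str]:
    # Stand-in data; a real loader would read the corpus for `lang`.
    data = {"english": ["the", "a"], "chinese": ["的", "了"]}
    return data[lang]  # KeyError for an unknown language key such as "en"


helper = LazyStopwords(stub_loader)
english = helper.stopwords("english")
helper.stopwords("english")  # second call is served from the cache
```

Only the requested language is loaded, and a key the loader does not recognize fails instead of silently returning another language's set.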

📚 Learning: 2025-09-16T06:40:44.968Z

Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Configure LLM providers via LiteLLM and openai/ollama clients

Applied to files:

hugegraph-llm/src/hugegraph_llm/config/llm_config.py

📚 Learning: 2025-09-16T06:40:44.968Z

Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Configure HugeGraph connection via environment variables

Applied to files:

hugegraph-llm/src/hugegraph_llm/config/hugegraph_config.py

🪛 GitHub Actions: Pylint

hugegraph-llm/src/hugegraph_llm/nodes/llm_node/schema_build.py

[error] 18-18: E0611: No name 'CStatus' in module 'PyCGraph' (no-name-in-module)

[warning] 85-85: R1705: Unnecessary "elif" after "return", remove the leading "el" from "elif" (no-else-return)

hugegraph-llm/src/hugegraph_llm/api/admin_api.py

[error] 34-34: E0702: Raising dict while only classes or instances are allowed (raising-bad-type)

hugegraph-llm/src/hugegraph_llm/models/llms/init_llm.py

[warning] 25-25: W0621: Redefining name 'llm_settings' from outer scope (redefined-outer-name)

hugegraph-llm/src/hugegraph_llm/nodes/hugegraph_node/gremlin_execute.py

[error] 20-20: E0611: No name 'CStatus' in module 'PyCGraph' (no-name-in-module)

hugegraph-llm/src/hugegraph_llm/nodes/hugegraph_node/schema.py

[error] 18-18: E0611: No name 'CStatus' in module 'PyCGraph' (no-name-in-module)

hugegraph-llm/src/hugegraph_llm/nodes/llm_node/text2gremlin.py

[error] 21-21: E0611: No name 'CStatus' in module 'PyCGraph' (no-name-in-module)

hugegraph-llm/src/hugegraph_llm/state/ai_state.py

[error] 16-16: E0611: No name 'GParam' in module 'PyCGraph' (no-name-in-module)

[error] 16-16: E0611: No name 'CStatus' in module 'PyCGraph' (no-name-in-module)

🔇 Additional comments (33)

hugegraph-llm/src/hugegraph_llm/api/exceptions/rag_exceptions.py (1)

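24-26: Formatting change is reasonable

Only wraps the parameters onto separate lines, which improves readability without affecting existing behavior.

hugegraph-llm/src/hugegraph_llm/flows/prompt_generate.py (1)

61-61: Return statement change is reasonable

The logic is unchanged after collapsing to a single line, and the default prompt message still covers the fallback case, so there appears to be no risk.

hugegraph-llm/src/hugegraph_llm/operators/document_op/word_extract.py (1)

62-62: Quote style unification has no behavioral impact

Only the quote style in the logging call was changed; functionality is unchanged.

hugegraph-python-client/src/pyhugegraph/structure/property_key_data.py (1)

65-67: Formatting change preserves the original output

The multi-line f-string matches the previous logic; the semantics of __repr__ are unchanged.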

hugegraph-llm/src/hugegraph_llm/models/llms/openai.py (2)

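45-45: Empty-string default only needs to stay consistent

The quote-style change here does not affect the logic; fine as is.

189-189: Adding the comma to the streaming arguments is fine

This only adds a trailing comma during formatting and has no functional effect; keep it.

hugegraph-llm/src/hugegraph_llm/models/llms/init_llm.py (1)

176-177: Formatting consolidation keeps behavior unchanged

Collapsing this call onto a single line does not change the semantics of the generate call; runtime behavior is the same.

hugegraph-python-client/src/tests/api/test_auth.py (2)

101-101: Formatting change preserves semantics

The single-line call is more readable, and the logic matches the original implementation.

136-136: Assertion simplification is correct

The assertion is compressed onto a single line but still performs the same check; semantics are unchanged.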

hugegraph-llm/src/hugegraph_llm/flows/import_graph_data.py (1)

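38-42: Log truncation formatting change is reasonable

The added parentheses only affect indentation and readability; the original logic and behavior are unchanged.

hugegraph-llm/src/tests/config/test_config.py (1)

25-27: Keeps the import block tidy

Adding a blank line between the local imports improves readability; the change is reasonable.

hugegraph-python-client/src/tests/api/test_version.py (1)

45-48: String quote style unified

Switching the key access to double quotes makes the style consistent with the rest of the file; good change.

hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/schema_manager.py (1)

39-46: Formatting change that preserves existing behavior

The dict comprehension is compressed onto one line; this is purely a formatting change and does not affect the vertexlabels filtering logic.

hugegraph-llm/src/tests/operators/common_op/test_check_schema.py (1)

29-40: Test data reformatting has no behavioral change

The expected data structure is now written on a single line; the assertion still validates the same dict contents, so the test semantics are unchanged.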

hugegraph-python-client/src/pyhugegraph/api/auth.py (4)

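86-88: Formatting change is as expected

Line 87 compresses the function signature onto a single line and keeps the pylint disable comment; it does not affect runtime behavior and simply keeps things consistent.

116-118: Formatting change is as expected

Line 118 only puts the parameter declaration on a single line; the method logic is untouched and matches the existing style.

128-130: Formatting change is as expected

Line 129 is consistent with the changes above; only the signature formatting changes, and the call path and return value are unaffected.

201-203: Formatting change is as expected

Line 202 keeps the pylint comment on the same line as the function signature; the logic is unchanged and can stay as is.

hugegraph-llm/src/hugegraph_llm/models/llms/litellm.py (2)

51-55: Trailing comma helps future changes

Adding a trailing comma at the end of retry_if_exception_type keeps the argument list consistent and reduces diff noise when more exception types are added later; in favor of keeping it.

86-88: Async signature indentation adjusted

The parameter indentation of agenerate now matches the usual style, which is more readable and carries no functional risk.

hugegraph-llm/src/tests/models/llms/test_ollama_client.py (1)

35-37: Formatting change improves readability

Call semantics are unchanged; the new layout makes the argument structure more compact and readable, with no new risk.

hugegraph-llm/src/hugegraph_llm/models/llms/ollama.py (1)

53-56: Formatting change preserves the original meaning

This only compresses how the usage dict is written; it still reads the same token fields from the response and introduces no behavioral change. Confirmed correct.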

hugegraph-llm/src/hugegraph_llm/utils/graph_index_utils.py (1)

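48-61: Original logic preserved 👍

This change only adjusts the layout of KgBuilder and related variables; the parameter order matches the original implementation, so behavior is unaffected.

hugegraph-llm/src/hugegraph_llm/nodes/hugegraph_node/schema.py (1)

65-65: Switching to log output matches the existing pattern

Using log.debug here makes it easier to manage log levels centrally and fits the codebase's unified logging approach better than a bare print.

hugegraph-python-client/src/pyhugegraph/example/hugegraph_example.py (2)

29-35: Schema chain definition is more compact

This chained call adds the birthDate property to Person and adds ifNotExist() to ActedIn, preserving the original primary-key semantics while keeping schema initialization idempotent. Nice.

48-48: Example data style unified

Collapsing the addVertex call for the third movie into the same single-line form as the earlier ones is more readable; the logic is unchanged.

hugegraph-llm/src/hugegraph_llm/operators/common_op/check_schema.py (1)

79-88: Formatting refactor is as expected

The calls to _process_keys and _add_missing_properties are now on a single line; this only changes layout. Default arguments and validation logic keep their original values, so behavior is unchanged. Looks good.

hugegraph-llm/src/hugegraph_llm/utils/log.py (1)

34-35: Formatting only needs to stay consistent

The trailing comma on line 34 keeps the whole argument list stylistically consistent and introduces no functional change; the change is reasonable.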

hugegraph-llm/src/hugegraph_llm/demo/rag_demo/app.py (1)

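168-169: Log statement simplification is good

Line 168 changes log.info to a single-line form; the message content is unchanged and the code is cleaner. Agreed.

hugegraph-llm/src/hugegraph_llm/config/prompt_config.py (1)

26-27: Blank line makes the sections clearer

A blank line was added after line 26, which better separates the comment from the attribute section; readability improves with no side effects.

hugegraph-llm/src/hugegraph_llm/operators/index_op/vector_index_query.py (1)

33-39: Wrapping arguments improves readability

Lines 33-39 split the arguments of get_index_folder_name and get_filename_prefix onto separate lines; this is more readable while preserving the old logic. Approved.

hugegraph-llm/src/hugegraph_llm/flows/get_graph_index_info.py (1)

51-51: Call formatting only needs to be consistent

Line 51 changes get_index_folder_name to a single-line call, consistent with the style in other files; behavior is unaffected.

hugegraph-llm/src/hugegraph_llm/flows/text2gremlin.py (1)

105-105: Comment should be in English

This line still has a Chinese comment, which is inconsistent with the repository convention of using English throughout (already noted in a previous review). Please change it to an English description to avoid style drift.

[Suggested fix]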
-        # 始终返回 5 个标准键，避免前端因过滤异常看不到字段
+        # Always return 5 standard keys to prevent frontend filters from hiding fields unexpectedly

coderabbitai · 2025-09-27T11:01:59Z

hugegraph-llm/src/hugegraph_llm/api/admin_api.py

+            raise generate_response(
+                RAGResponse(
+                    status_code=status.HTTP_403_FORBIDDEN,  # pylint: disable=E0702
+                    message="Invalid admin_token",
+                )
+            )


⚠️ Potential issue | 🔴 Critical

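Incorrectly raises a non-exception object

The code still uses raise generate_response(...) to raise a dict/response object, which fails at runtime and triggers the current Pylint E0702 failure. It should either return the response directly or raise an HTTPException instead. The suggested fix is to return generate_response(...) directly, which stays compatible with the existing return path.

-            raise generate_response(
-                RAGResponse(
-                    status_code=status.HTTP_403_FORBIDDEN,  # pylint: disable=E0702
-                    message="Invalid admin_token",
-                )
-            )
+            return generate_response(
+                RAGResponse(
+                    status_code=status.HTTP_403_FORBIDDEN,
+                    message="Invalid admin_token",
+                )
+            )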

🧰 Tools

🪛 GitHub Actions: Pylint

[error] 34-34: E0702: Raising dict while only classes or instances are allowed (raising-bad-type)

🤖 Prompt for AI Agents

In hugegraph-llm/src/hugegraph_llm/api/admin_api.py around lines 34 to 39, the code incorrectly uses "raise generate_response(...)" which raises a non-exception object and triggers runtime/Pylint E0702; change this to directly "return generate_response(...)" to match the existing response flow, or alternatively raise a proper HTTPException with the same status_code and message if you intend to throw — ensure the chosen approach is applied consistently with surrounding handlers.
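
The failure mode flagged here can be reproduced without any web framework. The sketch below is illustrative only: `build_response`, `ForbiddenError`, and the three `check_token_*` functions are hypothetical stand-ins, not the project's `generate_response` or FastAPI's `HTTPException`. It shows that raising a dict produces a `TypeError`, and contrasts the two fixes the review mentions (return the response, or raise a real exception type).

```python
class ForbiddenError(Exception):
    """Hypothetical exception type carrying an HTTP-style status code."""

    def __init__(self, status_code: int, message: str):
        super().__init__(message)
        self.status_code = status_code
        self.message = message


def build_response(status_code: int, message: str) -> dict:
    # Stand-in for a response helper that returns a plain dict.
    return {"status_code": status_code, "message": message}


def check_token_buggy(token: str, expected: str) -> dict:
    if token != expected:
        # Bug: a dict is not an exception, so this line itself raises TypeError.
        raise build_response(403, "Invalid admin_token")  # pylint: disable=raising-bad-type
    return build_response(200, "ok")


def check_token_return(token: str, expected: str) -> dict:
    if token != expected:
        # Fix 1: hand the error response back through the normal return path.
        return build_response(403, "Invalid admin_token")
    return build_response(200, "ok")


def check_token_raise(token: str, expected: str) -> dict:
    if token != expected:
        # Fix 2: raise an object that actually derives from Exception.
        raise ForbiddenError(403, "Invalid admin_token")
    return build_response(200, "ok")
```

Which fix fits depends on how the surrounding handlers report errors; the review's point is only that the two styles should not be mixed on one line.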

coderabbitai · 2025-09-27T11:02:00Z

hugegraph-llm/src/hugegraph_llm/demo/rag_demo/rag_block.py

+            extract_template=keywords_extract_prompt
+        ).keywords_to_vid().import_schema(huge_settings.graph_name).query_graphdb(
            num_gremlin_generate_example=gremlin_tmpl_num,
            gremlin_prompt=gremlin_prompt,
        )


⚠️ Potential issue | 🟠 Major

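Restore the keyword-recall filtering parameters in the streaming flow

The vector_dis_threshold and topk_per_keyword arguments to keywords_to_vid were removed here, so the streaming answer path no longer applies the same vector distance threshold and top-k filtering as the non-streaming path. The thresholds configured in the UI/backend are ignored, the recall range widens, and results diverge from the non-streaming endpoint, which is a clear regression in practice. Please pass these parameters again so that both paths behave the same.

-        rag.extract_keywords(
-            extract_template=keywords_extract_prompt
-        ).keywords_to_vid().import_schema(huge_settings.graph_name).query_graphdb(
+        rag.extract_keywords(
+            extract_template=keywords_extract_prompt
+        ).keywords_to_vid(
+            vector_dis_threshold=vector_dis_threshold,
+            topk_per_keyword=topk_per_keyword,
+        ).import_schema(huge_settings.graph_name).query_graphdb(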


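To illustrate what dropping these arguments changes, here is a minimal sketch of distance-threshold plus top-k filtering over keyword matches. It is not the project's keywords_to_vid implementation: the function `filter_matches`, the (vertex id, distance) data layout, the inclusive `<=` comparison, and the sample values are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Match = Tuple[str, float]  # (vertex id, vector distance); smaller means closer


def filter_matches(
    matches: List[Match],
    dis_threshold: Optional[float] = None,
    topk: Optional[int] = None,
) -> List[Match]:
    """Keep the closest matches, optionally bounded by distance and count."""
    ranked = sorted(matches, key=lambda m: m[1])
    if dis_threshold is not None:
        # Drop candidates that are farther away than the threshold.
        ranked = [m for m in ranked if m[1] <= dis_threshold]
    if topk is not None:
        # Keep at most `topk` of the remaining candidates.
        ranked = ranked[:topk]
    return ranked


candidates = [("v1", 0.10), ("v2", 0.35), ("v3", 0.80), ("v4", 0.95)]
strict = filter_matches(candidates, dis_threshold=0.5, topk=1)
loose = filter_matches(candidates)  # no arguments: nothing is filtered out
```

With both limits omitted every candidate passes through, which is the kind of widened recall the comment warns about when one code path stops forwarding the configured values.
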
hugegraph-llm/src/hugegraph_llm/nodes/hugegraph_node/gremlin_execute.py

coderabbitai · 2025-09-27T11:02:00Z

hugegraph-llm/src/hugegraph_llm/nodes/llm_node/text2gremlin.py

+from PyCGraph import CStatus
+
+from hugegraph_llm.nodes.base_node import BaseNode
+from hugegraph_llm.operators.llm_op.gremlin_generate import GremlinGenerateSynthesize
+from hugegraph_llm.models.llms.init_llm import LLMs
+from hugegraph_llm.config import prompt as prompt_cfg
+
+
+def _stable_schema_string(state_json: Dict[str, Any]) -> str:
+    if "simple_schema" in state_json and state_json["simple_schema"] is not None:
+        return json.dumps(state_json["simple_schema"], ensure_ascii=False, sort_keys=True)
+    if "schema" in state_json and state_json["schema"] is not None:
+        return json.dumps(state_json["schema"], ensure_ascii=False, sort_keys=True)
+    return ""
+
+
+class Text2GremlinNode(BaseNode):
+    operator: GremlinGenerateSynthesize
+
+    def node_init(self):
+        # Select LLM
+        llm = LLMs().get_text2gql_llm()
+        # Serialize schema deterministically
+        state_json = self.context.to_json()
+        schema_str = _stable_schema_string(state_json)
+        # Prompt fallback
+        gremlin_prompt: Optional[str] = getattr(self.wk_input, "gremlin_prompt", None)
+        if gremlin_prompt is None or not str(gremlin_prompt).strip():
+            gremlin_prompt = prompt_cfg.gremlin_generate_prompt
+        # Keep vertices/properties empty for now
+        self.operator = GremlinGenerateSynthesize(
+            llm=llm,
+            schema=schema_str,
+            vertices=None,
+            gremlin_prompt=gremlin_prompt,
+        )
+        return CStatus()


⚠️ Potential issue | 🔴 Critical

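Fix the PyCGraph status import so the build passes

The PyCGraph module does not currently export CStatus, and the GitHub Actions pylint run fails because of it (E0611: no-name-in-module). If this is not fixed, the node will raise ImportError at load time and the whole Text2Gremlin flow cannot initialize. Please switch to the status class that the PCGraph SDK actually provides (for example, the Status/StatusCode from the SDK docs, returning Status.ok() or the corresponding success status) so that node_init returns a valid status object and the pipeline passes again.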

🧰 Tools

🪛 GitHub Actions: Pylint

[error] 21-21: E0611: No name 'CStatus' in module 'PyCGraph' (no-name-in-module)

🤖 Prompt for AI Agents

In hugegraph-llm/src/hugegraph_llm/nodes/llm_node/text2gremlin.py around lines 21 to 57, the code imports CStatus from PyCGraph which is not exported; update the import to use the actual status type provided by the PCGraph SDK (e.g., import Status or Status/StatusCode per SDK docs), replace the return CStatus() with the SDK's success value (for example return Status.ok() or equivalent success constructor), and ensure any required StatusCode usage is adjusted so node_init returns a valid status object accepted by the pipeline.
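
Separately from the import issue, the hunk above serializes the schema with `json.dumps(..., ensure_ascii=False, sort_keys=True)` and falls back to a default prompt when none is supplied. A small sketch of those two behaviors follows; the schema dicts are made up, and `DEFAULT_PROMPT`, `stable_json`, and `resolve_prompt` are hypothetical names (the constant stands in for `prompt_cfg.gremlin_generate_prompt`).

```python
import json
from typing import Any, Dict, Optional

DEFAULT_PROMPT = "default gremlin prompt"  # placeholder value for illustration


def stable_json(obj: Dict[str, Any]) -> str:
    # sort_keys=True makes the output independent of dict insertion order;
    # ensure_ascii=False keeps non-ASCII text readable instead of \u escapes.
    return json.dumps(obj, ensure_ascii=False, sort_keys=True)


def resolve_prompt(user_prompt: Optional[str]) -> str:
    # Same check as in the hunk: None or whitespace-only means "use the default".
    if user_prompt is None or not str(user_prompt).strip():
        return DEFAULT_PROMPT
    return user_prompt


schema_a = {"vertexlabels": ["person"], "edgelabels": ["knows"]}
schema_b = {"edgelabels": ["knows"], "vertexlabels": ["person"]}
same = stable_json(schema_a) == stable_json(schema_b)
```

A stable string matters when the serialized schema feeds a prompt or a cache key, since two logically equal schemas then produce identical text.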

coderabbitai · 2025-09-27T11:02:00Z

hugegraph-llm/src/hugegraph_llm/operators/llm_op/answer_synthesize.py

+            context_str = (
+                f"{context_head_str}\n" f"{self._context_body}\n" f"{context_tail_str}".strip("\n")
+            )


⚠️ Potential issue | 🟠 Major

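Fix the formatting regression caused by the scope of .strip()

Here .strip("\n") is applied only to the last f-string, so it can no longer remove the leading and trailing newlines introduced by concatenating the first two parts; when context_head_str or context_tail_str is empty, extra blank lines appear, which differs from the previous version. Please restore calling .strip() on the whole concatenated string; the same concatenation in run_streaming, async_generate, and async_streaming_generate needs the same adjustment.

-            context_str = (
-                f"{context_head_str}\n" f"{self._context_body}\n" f"{context_tail_str}".strip("\n")
-            )
+            context_str = (
+                f"{context_head_str}\n"
+                f"{self._context_body}\n"
+                f"{context_tail_str}"
+            ).strip("\n")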


coderabbitai · 2025-09-27T11:02:00Z

hugegraph-python-client/src/tests/client_utils.py

+            url=self.URL,
+            user=self.USERNAME,
+            pwd=self.PASSWORD,
+            graph=self.GRAPH,
+            graphspace=self.GRAPHSPACE,
        )


⚠️ Potential issue | 🟠 Major

Restore timeout argument

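The timeout argument was removed here, so PyHugeClient now falls back to its internal default of (0.5, 15.0). ClientUtils.TIMEOUT was maintained specifically to provide a 10-second tolerance and avoid refused connections caused by connection jitter in local/CI environments; that constant is now unused (which also means a behavior regression), making intermittent failures easy to reproduce. The recommendation is to keep passing the timeout explicitly while also satisfying the tuple[float, float] type constraint, for example:

         self.client = PyHugeClient(
             url=self.URL,
             user=self.USERNAME,
             pwd=self.PASSWORD,
             graph=self.GRAPH,
             graphspace=self.GRAPHSPACE,
+            timeout=(self.TIMEOUT, self.TIMEOUT),
         )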


🤖 Prompt for AI Agents

In hugegraph-python-client/src/tests/client_utils.py around lines 31 to 36, the call creating PyHugeClient removed the timeout argument causing the client to fall back to its internal (0.5, 15.0) timeout; restore explicit timeout by passing ClientUtils.TIMEOUT (or convert it to a tuple[float, float] if needed) into the PyHugeClient constructor so tests use the intended 10s tolerance and avoid intermittent connection failures.
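
One way to satisfy the tuple[float, float] constraint mentioned above is to normalize a scalar timeout into a pair before passing it on. This is only a sketch: the helper name `as_timeout_pair` is hypothetical, and reading the pair as (connect, read) is an assumption based on the common requests-style convention rather than something confirmed from PyHugeClient's source.

```python
from typing import Tuple, Union

TimeoutLike = Union[int, float, Tuple[float, float]]


def as_timeout_pair(timeout: TimeoutLike) -> Tuple[float, float]:
    """Return a (connect, read) pair of positive floats from a scalar or a pair."""
    if isinstance(timeout, (int, float)):
        if timeout <= 0:
            raise ValueError("timeout must be positive")
        # A single number is used for both phases.
        return (float(timeout), float(timeout))
    connect, read = timeout
    if connect <= 0 or read <= 0:
        raise ValueError("timeout values must be positive")
    return (float(connect), float(read))


TIMEOUT = 10  # mirrors the 10-second tolerance described in the comment
pair = as_timeout_pair(TIMEOUT)
```

The resulting pair could then be forwarded as the `timeout=` keyword shown in the suggested diff.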

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Linyu <94553312+weijinglin@users.noreply.github.com>

fantasy-lotus added 5 commits September 27, 2025 17:44

core(flow): add text2gremlin pipeline; reset state on reuse

ec99461

feat(text2gremlin): add flow and nodes; reuse existing operators

9aa3ee2

api: route /text2gremlin via scheduler; validate empty query

cf4e38c

demo: use scheduler for text2gremlin; keep outputs backward-compatible

ae1ea26

fix

e54c5a6

fantasy-lotus requested a review from Copilot September 27, 2025 09:46

github-actions bot added the llm label Sep 27, 2025

Copilot AI reviewed Sep 27, 2025

hugegraph-llm/src/hugegraph_llm/flows/text2gremlin.py (resolved)

hugegraph-llm/src/hugegraph_llm/nodes/hugegraph_node/gremlin_execute.py (resolved)

hugegraph-llm/src/hugegraph_llm/nodes/hugegraph_node/gremlin_execute.py (resolved)

fantasy-lotus and others added 3 commits September 27, 2025 17:49

Update hugegraph-llm/src/hugegraph_llm/flows/text2gremlin.py

439d71a

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

fix linter & license

4e81721

fix lint

63d3a98

github-actions bot added the python-client label Sep 27, 2025

coderabbitai bot reviewed Sep 27, 2025

Update scheduler.py

d991c57

weijinglin merged commit 78011d3 into hugegraph:agenticrag/dev Sep 29, 2025
8 of 10 checks passed

This was referenced Sep 30, 2025

Refactor RAG Workflow: Modularize Flows, Add Streaming, and Improve Node Initialization #51

Merged

Text2Gremlin Data Generation and Model Fine-Tuning System (Vertical Scenarios and General Scenarios) #52

Open

This was referenced Oct 9, 2025

[Refactor] Add Fixed Workflow Execution Engine: Flow, Node, and Scheduler Architecture #54

Closed

feat(llm): vector-db #45

Closed

coderabbitai bot mentioned this pull request Oct 16, 2025

feat(llm): BREAKING CHANGE update keyword extraction method #32

Closed

weijinglin added a commit that referenced this pull request Oct 22, 2025

Refactor: text2germlin with PCgraph framework (#50)

1c19934

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Linyu <94553312+weijinglin@users.noreply.github.com>

coderabbitai bot mentioned this pull request Oct 23, 2025

feat(llm):improve some RAG function UT(tests) #21

Open

weijinglin added a commit that referenced this pull request Oct 23, 2025

refactor: text2germlin with PCGraph framework (#50)

0c9a305

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Linyu <94553312+weijinglin@users.noreply.github.com>

github-actions bot pushed a commit that referenced this pull request Oct 23, 2025

refactor: text2germlin with PCGraph framework (#50)

d36d41d

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Linyu <94553312+weijinglin@users.noreply.github.com>

coderabbitai bot mentioned this pull request Nov 15, 2025

feat: support auto mode for rag usecases #58

Open

coderabbitai bot mentioned this pull request Nov 24, 2025

lint:feat/add ruff mypy #61

Closed

2 participants
