This repository was archived by the owner on Dec 28, 2025. It is now read-only.

Refactor: Refactor hugegraph-ai to using CGraph & port some usecase in web demo#49

Merged
weijinglin merged 9 commits into agenticrag/dev from wjl-agent/dev
Sep 25, 2025

@weijinglin (Collaborator) commented Sep 25, 2025

Usecase

  • 2.1 build_vector_index workflow porting
  • 2.2 graph_extract workflow porting
  • 2.3 import_graph_data workflow porting
    • Implemented import_graph_data workflow based on Node/Operator mechanism
  • 2.4 update_vid_embeddings workflow porting
    • Implemented update_vid_embeddings workflow based on Node/Operator mechanism
  • 2.5 get_graph_index_info workflow porting
  • 2.6 build_schema workflow porting
    • Implemented the build_schema workflow based on the Node/Operator mechanism
  • 2.7 prompt_generate workflow porting
    • Implemented the prompt_generate workflow based on the Node/Operator mechanism

Add necessary doc for ai development

Summary by CodeRabbit

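  • New features
    • Introduces a fixed-flow scheduler that runs each of the following in a single call: build vector index, graph extraction, import graph data, update vertex embeddings, get index info, schema build, and prompt generation. The demo and utility functions now go through the scheduler.
  • Refactoring
    • Workflows move from directly wired operators to a node + flow architecture, improving concurrency safety and reuse and unifying inputs, outputs, and post-processing.
  • Documentation
    • Adds design, requirements, and task documents, including architecture diagrams, interface conventions, and a use-case list.
  • Other improvements
    • Clearer prompts and error messages; improved logging and type annotations.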


coderabbitai bot commented Sep 25, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

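This change adds the fixed-flow framework documentation and requirements; adds and refactors several fixed flows (build vector index, graph extraction, data import, VID embedding update, index info retrieval, schema build, prompt generation); adds a unified Node layer with several node classes; decouples Operators from Node into plain operations; extends Scheduler registration and scheduling; and updates the state structures and several utility/demo entry points.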

Changes

| Cohort / File(s) | Summary |
|---|---|
| Spec and task documents: .vibedev/spec/hugegraph-llm/fixed_flow/design.md, .vibedev/spec/hugegraph-llm/fixed_flow/requirements.md, .vibedev/spec/hugegraph-llm/fixed_flow/tasks.md | Adds the fixed-flow architecture, requirements, and task documents, describing the Scheduler/Flow/Node/Operator decoupling, the concurrency and reuse strategy, data structures, and test scope. |
| Scheduler and entry points: hugegraph_llm/flows/scheduler.py, hugegraph_llm/demo/rag_demo/vector_graph_block.py, hugegraph_llm/utils/graph_index_utils.py | The Scheduler registers additional flows (import_graph_data, update_vid_embeddings, get_graph_index_info, build_schema, prompt_generate); the demo and utils now dispatch through the Scheduler; legacy wrapper functions are kept. |
| Flow implementations and adjustments: hugegraph_llm/flows/build_vector_index.py, hugegraph_llm/flows/graph_extract.py, hugegraph_llm/flows/get_graph_index_info.py, hugegraph_llm/flows/import_graph_data.py, hugegraph_llm/flows/update_vid_embeddings.py, hugegraph_llm/flows/build_schema.py, hugegraph_llm/flows/prompt_generate.py, hugegraph_llm/flows/utils.py | Several flows are added or reworked to use the Node layer (e.g. ChunkSplitNode, SchemaNode, ExtractNode) with a uniform prepare/build_flow/post_deal; adds the prepare_schema utility. |
| Node base class and common utilities: hugegraph_llm/nodes/base_node.py, hugegraph_llm/nodes/util.py | Adds BaseNode with a unified lifecycle (init/node_init/run/operator_schedule) and a context-initialization helper. |
| Document/Index/LLM/HugeGraph nodes: hugegraph_llm/nodes/document_node/chunk_split.py, .../nodes/index_node/build_vector_index.py, .../nodes/index_node/build_semantic_index.py, .../nodes/llm_node/extract_info.py, .../nodes/llm_node/prompt_generate.py, .../nodes/llm_node/schema_build.py, .../nodes/hugegraph_node/fetch_graph_data.py, .../nodes/hugegraph_node/commit_to_hugegraph.py, .../nodes/hugegraph_node/schema.py | Adds several Nodes that wrap the corresponding Operator (or LLM/client), validate and assemble the context in node_init, and delegate execution in operator_schedule. |
| Operators refactoring (decoupled from Node): hugegraph_llm/operators/document_op/chunk_split.py, .../operators/index_op/build_vector_index.py, .../operators/llm_op/info_extract.py, .../operators/llm_op/property_graph_extract.py, .../operators/common_op/check_schema.py, .../operators/hugegraph_op/schema_manager.py, .../operators/hugegraph_op/commit_to_hugegraph.py | Removes the Node-style implementations, reducing operators to plain functions/classes that take and return a dict context; SchemaManager, CheckSchema, and others are rewritten; some formatting and minor logging changes. |
| State structure extensions: hugegraph_llm/state/ai_state.py | WkFlowInput gains data_json, extract_type, query_examples, few_shot_schema, source_text, scenario, example_name; WkFlowState gains generated_extract_prompt and assign_from_json. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client as Caller
  participant S as Scheduler
  participant M as GPipelineManager
  participant F as Flow
  participant P as GPipeline
  participant N as Node
  participant O as Operator

  Client->>S: schedule_flow(task, inputs)
  S->>M: get_manager(task)
  alt First call / no reusable Pipeline
    S->>F: build_flow(inputs)
    F->>P: Create and assemble Pipeline
    P-->>S: Return Pipeline
  else Reuse Pipeline
    S->>P: Prepare inputs and run
  end
  S->>P: run()
  loop Node execution
    P->>N: run()
    N->>O: operator_schedule(state_json)
    O-->>N: Processing result
    N-->>P: CStatus
  end
  P-->>S: wkflow_state
  S-->>Client: Return result
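The build-or-reuse branch in the diagram above can be sketched in plain Python. This is a minimal illustration rather than the project's scheduler: `FakePipeline`, `PipelinePool`, and this `schedule_flow` are hypothetical stand-ins for GPipeline, GPipelineManager, and Scheduler, and the PyCGraph API is deliberately not used.

```python
from collections import deque
from typing import Callable, Deque, Dict, Optional


class FakePipeline:
    """Hypothetical stand-in for a GPipeline: holds inputs and produces a result."""

    def __init__(self) -> None:
        self.inputs: Dict[str, str] = {}
        self.runs = 0

    def run(self) -> str:
        self.runs += 1
        return f"processed {self.inputs.get('text', '')}"


class PipelinePool:
    """Hypothetical stand-in for a pipeline manager: caches idle pipelines."""

    def __init__(self) -> None:
        self._idle: Deque[FakePipeline] = deque()

    def fetch(self) -> Optional[FakePipeline]:
        return self._idle.popleft() if self._idle else None

    def release(self, pipeline: FakePipeline) -> None:
        self._idle.append(pipeline)


def schedule_flow(
    pool: PipelinePool, build: Callable[[], FakePipeline], text: str
) -> str:
    pipeline = pool.fetch()
    if pipeline is None:
        # First call, or no idle pipeline: build a new one.
        pipeline = build()
    # Prepare inputs, run, then hand the pipeline back for reuse.
    pipeline.inputs["text"] = text
    try:
        return pipeline.run()
    finally:
        pool.release(pipeline)


pool = PipelinePool()
first = schedule_flow(pool, FakePipeline, "a")
second = schedule_flow(pool, FakePipeline, "b")
```

The second call reuses the pipeline built by the first, so only its inputs are refreshed before it runs again.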
sequenceDiagram
  autonumber
  actor UI as Web Demo
  participant S as Scheduler
  participant F as PromptGenerateFlow
  participant P as GPipeline
  participant N as PromptGenerateNode
  participant O as PromptGenerate(Operator)
  participant L as LLM

  UI->>S: prompt_generate(source_text, scenario, example_name)
  S->>F: build_flow(...)
  F->>P: Register PromptGenerateNode
  S->>P: run()
  P->>N: run()
  N->>O: operator_schedule(state_json)
  O->>L: Call the LLM to generate the prompt
  L-->>O: Text
  O-->>N: {"generated_extract_prompt": "..."}
  N-->>P: CStatus
  P-->>S: wkflow_state
  S-->>UI: Return prompt

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

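I am a little rabbit in the code field, hopping with a jingle,
Cables weave Flow and Node, water runs through the pipeline.
At the scheduler's command, star tracks fall into line,
Prompts are born, indexes built, schemas in the moonlight.
Click: the new engine turns, and even the carrots smell sweet. 🥕✨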

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 14.86%, which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title clearly summarizes the main change, refactoring hugegraph-ai to use CGraph and porting the web demo use cases, and is closely related to the changes. Although the "Refactor:" prefix is repeated and "usecase" should be plural, overall it is neither vague nor off-topic. |


@github-actions

@codecov-ai-reviewer review


@github-actions github-actions bot left a comment


license-eye has checked 353 files.

| Valid | Invalid | Ignored | Fixed |
|---|---|---|---|
| 285 | 1 | 67 | 0 |

Invalid file list:
  • hugegraph-llm/test_prompt_generate_workflow.py

Use this command to fix any missing license headers:

```bash
docker run -it --rm -v $(pwd):/github/workspace apache/skywalking-eyes header fix
```

@github-actions github-actions bot added the llm label Sep 25, 2025
@weijinglin
Collaborator Author

@coderabbitai review

@coderabbitai

coderabbitai bot commented Sep 25, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 16

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
hugegraph-llm/src/hugegraph_llm/state/ai_state.py (1)

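103-110: Security and robustness: assign_from_json allows arbitrary key injection, which can shadow methods or corrupt instance state

The current implementation sets any key as an instance attribute, which can overwrite method names such as to_json or inject unknown fields, creating stability and security risks. Recommend assigning only to attributes that already exist and are not callable.

Apply the following patch: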

-    # Implement a method that assigns keys from data_json as WkFlowState member variables
-    def assign_from_json(self, data_json: dict):
-        """
-        Assigns each key in the input json object as a member variable of WkFlowState.
-        """
-        for k, v in data_json.items():
-            setattr(self, k, v)
+    # Implement a method that assigns keys from data_json as WkFlowState member variables
+    def assign_from_json(self, data_json: dict):
+        """
+        Assign only to existing non-callable attributes; ignore unknown/private keys.
+        """
+        for k, v in data_json.items():
+            if k.startswith("_"):
+                continue
+            try:
+                attr = getattr(self, k)
+            except AttributeError:
+                continue
+            if callable(attr):
+                continue
+            setattr(self, k, v)
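
The guarded assignment in the patch above can be exercised in isolation. The sketch below uses a hypothetical `DemoState` class (not the project's WkFlowState) with one data field, one private field, and one method, to show which keys such a filter would accept.

```python
from typing import Any, Dict, Optional


class DemoState:
    """Hypothetical state object; not the project's WkFlowState."""

    def __init__(self) -> None:
        self.schema: Optional[str] = None
        self._secret = "internal"

    def to_json(self) -> Dict[str, Any]:
        return {"schema": self.schema}

    def assign_from_json(self, data_json: Dict[str, Any]) -> None:
        for key, value in data_json.items():
            if key.startswith("_"):
                continue  # never touch private attributes
            if not hasattr(self, key):
                continue  # ignore unknown fields
            if callable(getattr(self, key)):
                continue  # never shadow a method such as to_json
            setattr(self, key, value)


state = DemoState()
state.assign_from_json(
    {"schema": "person", "to_json": "oops", "_secret": "x", "unknown": 1}
)
```

Only `schema` is updated here; the method, the private field, and the unknown key are left alone.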
hugegraph-llm/src/hugegraph_llm/nodes/llm_node/extract_info.py (1)

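48-53: The scheduling stage has no exception fallback.

Exceptions raised by run() propagate to the scheduler. Catch them and return a CStatus to keep the flow stable.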

     def operator_schedule(self, data_json):
         if self.extract_type == "triples":
-            return self.info_extract.run(data_json)
+            try:
+                return self.info_extract.run(data_json)
+            except Exception as exc:
+                return CStatus(-1, f"InfoExtract failed: {exc}")
         elif self.extract_type == "property_graph":
-            return self.property_graph_extract.run(data_json)
+            try:
+                return self.property_graph_extract.run(data_json)
+            except Exception as exc:
+                return CStatus(-1, f"PropertyGraphExtract failed: {exc}")
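
The two try/except blocks suggested above repeat the same pattern, so one possible way to express it once is a small wrapper. This is a sketch under assumptions: `Status` is a hypothetical stand-in for CStatus, and `guarded` is an illustrative helper that does not exist in the project.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Union


@dataclass
class Status:
    """Hypothetical stand-in for CStatus: code 0 means OK."""

    code: int = 0
    info: str = ""

    def is_err(self) -> bool:
        return self.code != 0


def guarded(
    name: str,
    op: Callable[[Dict[str, Any]], Dict[str, Any]],
    data: Dict[str, Any],
) -> Union[Dict[str, Any], Status]:
    """Run an operator and convert any exception into an error status."""
    try:
        return op(data)
    except Exception as exc:  # broad on purpose: nothing should escape to the scheduler
        return Status(-1, f"{name} failed: {exc}")


def failing_op(data: Dict[str, Any]) -> Dict[str, Any]:
    raise ValueError("bad chunk")


ok = guarded("Echo", lambda d: {**d, "done": True}, {"chunks": []})
err = guarded("InfoExtract", failing_op, {"chunks": []})
```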
🧹 Nitpick comments (20)
hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/commit_to_hugegraph.py (1)

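43-47: Remove the direct print debug output

Using print on a production path pollutes standard output and bypasses the unified logging system; use the existing log instead.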

-        print(f"get schema {schema}")
+        log.debug("Received schema: %s", schema)
hugegraph-llm/src/hugegraph_llm/nodes/util.py (1)

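19-27: Make the missing-parameter error message more specific to ease troubleshooting

The current "Required workflow parameters not found" message does not say which item is missing. Recommend checking each parameter separately and returning the specific missing key to improve observability.

Apply the following patch: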

 def init_context(obj) -> CStatus:
     try:
-        obj.context = obj.getGParamWithNoEmpty("wkflow_state")
-        obj.wk_input = obj.getGParamWithNoEmpty("wkflow_input")
-        if obj.context is None or obj.wk_input is None:
-            return CStatus(-1, "Required workflow parameters not found")
-        return CStatus()
+        context = obj.getGParamWithNoEmpty("wkflow_state")
+        wk_input = obj.getGParamWithNoEmpty("wkflow_input")
+        if context is None:
+            return CStatus(-1, "Required parameter 'wkflow_state' not found")
+        if wk_input is None:
+            return CStatus(-1, "Required parameter 'wkflow_input' not found")
+        obj.context = context
+        obj.wk_input = wk_input
+        return CStatus()
     except Exception as e:
         return CStatus(-1, f"Failed to initialize context: {str(e)}")
hugegraph-llm/src/hugegraph_llm/state/ai_state.py (1)

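28-35: Add type annotations to the new input fields for readability and better static checking

These fields currently lack type annotations; recommend declaring them as Optional[...].

Apply the following patch: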

-    data_json = None
-    extract_type = None
-    query_examples = None
-    few_shot_schema = None
+    data_json: Optional[dict] = None
+    extract_type: Optional[str] = None
+    query_examples: Optional[List[Any]] = None
+    few_shot_schema: Optional[Any] = None
-    # Fields related to PromptGenerate
-    source_text: str = None  # Original text
-    scenario: str = None  # Scenario description
-    example_name: str = None  # Example name
+    # Fields related to PromptGenerate
+    source_text: Optional[str] = None  # Original text
+    scenario: Optional[str] = None  # Scenario description
+    example_name: Optional[str] = None  # Example name
.vibedev/spec/hugegraph-llm/fixed_flow/design.md (1)

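515-524: Documentation naming is inconsistent with the implementation: SchemaManagerNode vs SchemaNode

The implementation is SchemaNode (nodes/hugegraph_node/schema.py), while the document here refers to SchemaManagerNode/CheckSchemaNode. Recommend standardizing on SchemaNode and noting in the description that it routes internally to SchemaManager or CheckSchema.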

hugegraph-llm/src/hugegraph_llm/nodes/hugegraph_node/schema.py (1)

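64-75: Avoid print; route to the matching operator based on the initialization result

Route to check_schema or schema_manager according to how the node was initialized, and use the unified logger.

Apply the following patch: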

 def operator_schedule(self, data_json):
-    print(f"check data json {data_json}")
-    if self.schema.startswith("{"):
-        try:
-            return self.check_schema.run(data_json)
-        except json.JSONDecodeError as exc:
-            log.error("Invalid JSON format in schema. Please check it again.")
-            raise ValueError("Invalid JSON format in schema.") from exc
-    else:
-        log.info("Get schema '%s' from graphdb.", self.schema)
-        return self.schema_manager.run(data_json)
+    log.debug("SchemaNode.operator_schedule input: %s", data_json)
+    if getattr(self, "check_schema", None) is not None:
+        return self.check_schema.run(data_json)
+    if getattr(self, "schema_manager", None) is not None:
+        return self.schema_manager.run(data_json)
+    return CStatus(-1, "SchemaNode is not initialized")
hugegraph-llm/src/hugegraph_llm/flows/import_graph_data.py (2)

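54-57: Avoid UI side effects (gr.Info) in the core Flow

The Flow layer should stay pure business orchestration. UI notifications belong in the caller (web/demo), so that the Flow does not pick up a runtime UI dependency or become harder to test.

Recommend removing the gr.Info call: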

     def post_deal(self, pipeline=None):
         res = pipeline.getGParamWithNoEmpty("wkflow_state").to_json()
-        gr.Info("Import graph data successfully!")
         return json.dumps(res, ensure_ascii=False, indent=2)

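18-19: Remove the unused UI dependency

If gr.Info is removed as suggested in the previous comment, the gradio import can be dropped as well, reducing the runtime footprint.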

-import gradio as gr
 from PyCGraph import GPipeline
hugegraph-llm/src/hugegraph_llm/flows/scheduler.py (2)

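75-100: Avoid shadowing the flow parameter with a local variable, for readability

The local variable flow overwrites the method parameter flow (a string), which is confusing. Recommend renaming it.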

-        flow: BaseFlow = self.pipeline_pool[flow]["flow"]
+        flow_obj: BaseFlow = self.pipeline_pool[flow]["flow"]
         pipeline: GPipeline = manager.fetch()
         if pipeline is None:
             # call coresponding flow_func to create new workflow
-            pipeline = flow.build_flow(*args, **kwargs)
+            pipeline = flow_obj.build_flow(*args, **kwargs)
             status = pipeline.init()
@@
-            res = flow.post_deal(pipeline)
+            res = flow_obj.post_deal(pipeline)
             manager.add(pipeline)
             return res
         else:
             # fetch pipeline & prepare input for flow
             prepared_input = pipeline.getGParamWithNoEmpty("wkflow_input")
-            flow.prepare(prepared_input, *args, **kwargs)
+            flow_obj.prepare(prepared_input, *args, **kwargs)
             status = pipeline.run()
             if status.isErr():
                 raise RuntimeError(f"Error in flow execution {status.getInfo()}")
-            res = flow.post_deal(pipeline)
+            res = flow_obj.post_deal(pipeline)
             manager.release(pipeline)
             return res

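31-66: Unused max_pipeline parameter

max_pipeline is currently stored but never used. Recommend either using it later to cap the number of active/cached pipelines in the manager, or removing the parameter.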

hugegraph-llm/src/hugegraph_llm/nodes/document_node/chunk_split.py (1)

37-41: Fail fast on empty text input

Returning an error early when texts is empty avoids running downstream operators for nothing.

         if isinstance(texts, str):
             texts = [texts]
+        if not texts:
+            return CStatus(-1, "No texts provided for chunk split")
         self.chunk_split_op = ChunkSplit(texts, split_type, language)
         return CStatus()
hugegraph-llm/src/hugegraph_llm/operators/llm_op/property_graph_extract.py (4)

47-50: Remove the unused split_text helper

It is not referenced in the current module; recommend deleting it to reduce clutter.

-def split_text(text: str) -> List[str]:
-    chunk_splitter = ChunkSplitter(split_type="paragraph", language="zh")
-    chunks = chunk_splitter.split(text)
-    return chunks

65-66: Lower the level of the verbose log to avoid polluting INFO

properties_map can be large; recommend logging it at debug.

-    log.info("properties_map: %s", properties_map)
+    log.debug("properties_map: %s", properties_map)

92-99: Add a quick check for required context keys and simplify initialization

Avoid KeyError, and use setdefault to simplify initialization.

-    def run(self, context: Dict[str, Any]) -> Dict[str, List[Any]]:
-        schema = context["schema"]
-        chunks = context["chunks"]
-        if "vertices" not in context:
-            context["vertices"] = []
-        if "edges" not in context:
-            context["edges"] = []
+    def run(self, context: Dict[str, Any]) -> Dict[str, List[Any]]:
+        if "schema" not in context or "chunks" not in context:
+            raise KeyError("context requires 'schema' and 'chunks'")
+        schema = context["schema"]
+        chunks = context["chunks"]
+        context.setdefault("vertices", [])
+        context.setdefault("edges", [])
         items = []
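
The required-key check and the setdefault initialization from the patch above can be tried on a plain dict. The helper name `prepare_context` is hypothetical, and the sketch reports all missing keys at once, which is a small variation on the suggested patch.

```python
from typing import Any, Dict, List


def prepare_context(context: Dict[str, Any]) -> Dict[str, Any]:
    """Validate required keys, then fill optional list fields in place."""
    missing: List[str] = [k for k in ("schema", "chunks") if k not in context]
    if missing:
        raise KeyError(f"context is missing required keys: {missing}")
    # setdefault keeps an existing value and only inserts when the key is absent.
    context.setdefault("vertices", [])
    context.setdefault("edges", [])
    return context


ctx = prepare_context(
    {"schema": {}, "chunks": ["text"], "edges": [{"label": "knows"}]}
)

try:
    prepare_context({"schema": {}})
    error_message = ""
except KeyError as exc:
    error_message = str(exc)
```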

120-123: Avoid a local variable that shares the name of the prompt module, which hurts readability

Renaming it to prompt_text is clearer.

-        prompt = generate_extract_property_graph_prompt(chunk, schema)
-        if self.example_prompt is not None:
-            prompt = self.example_prompt + prompt
-        return self.llm.generate(prompt=prompt)
+        prompt_text = generate_extract_property_graph_prompt(chunk, schema)
+        if self.example_prompt is not None:
+            prompt_text = self.example_prompt + prompt_text
+        return self.llm.generate(prompt=prompt_text)
hugegraph-llm/src/hugegraph_llm/flows/update_vid_embeddings.py (1)

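27-35: The status returned by prepare() is not handled, which can hide initialization failures

build_flow calls prepare() but does not check the CStatus, so nodes are still registered after a failed initialization and later exceptions become harder to trace.

Recommend checking isErr() here and aborting/raising, or following the BaseFlow convention (if there is one). For example: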

-        self.prepare(prepared_input)
+        sts = self.prepare(prepared_input)
+        if hasattr(sts, "isErr") and sts.isErr():
+            # Raise, or return None for the caller to handle, depending on your convention
+            raise RuntimeError("prepare() failed for UpdateVidEmbeddingsFlows")
hugegraph-llm/src/hugegraph_llm/nodes/base_node.py (1)

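52-57: Unify the operator_schedule return convention to avoid ambiguity

The comment claims it returns a CStatus, but run() treats the result as a data dict. Make it explicit: return a dict to be merged into the context, or return a CStatus to signal failure.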

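-        Interface through which the node schedules its operator; subclasses may override it.
-        Returns a CStatus object indicating whether scheduling succeeded.
+        Interface through which the node schedules its operator; subclasses may override it.
+        Returns:
+          - dict: merged into context
+          - CStatus: signals an error/status (aborts when not OK)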
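
One way a caller could honor the dict-or-CStatus convention proposed above is to branch on the return type. The sketch below is hypothetical: `Status` stands in for CStatus, and `run_node` is an illustration, not BaseNode.run.

```python
from typing import Any, Callable, Dict, Union


class Status:
    """Hypothetical stand-in for CStatus: code 0 means OK."""

    def __init__(self, code: int = 0, info: str = "") -> None:
        self.code = code
        self.info = info

    def is_err(self) -> bool:
        return self.code != 0


Result = Union[Dict[str, Any], Status]


def run_node(
    context: Dict[str, Any],
    operator_schedule: Callable[[Dict[str, Any]], Result],
) -> Status:
    result = operator_schedule(dict(context))
    if isinstance(result, Status):
        # A status carries no data; propagate it unchanged.
        return result
    if isinstance(result, dict):
        # A dict is merged into the shared context.
        context.update(result)
        return Status()
    return Status(-1, f"unexpected return type: {type(result).__name__}")


ctx: Dict[str, Any] = {"chunks": ["a"]}
ok = run_node(ctx, lambda data: {"vertices": [1]})
bad = run_node(ctx, lambda data: Status(-1, "operator failed"))
odd = run_node(ctx, lambda data: None)
```

Rejecting any other return type keeps a forgotten `return` in a subclass from silently passing as success.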
hugegraph-llm/src/hugegraph_llm/flows/prompt_generate.py (1)

25-28: Remove the empty __init__ to simplify the code

A constructor with no state and no logic can be removed.

-class PromptGenerateFlow(BaseFlow):
-    def __init__(self):
-        pass
+class PromptGenerateFlow(BaseFlow):
hugegraph-llm/src/hugegraph_llm/nodes/llm_node/prompt_generate.py (1)

33-45: Improve robustness: validate wk_input and the LLM instance

  • If init_context did not inject wk_input correctly, attribute access raises AttributeError.
  • If get_chat_llm fails and returns None, the later PromptGenerate initialization will fail.

-        llm = get_chat_llm(llm_settings)
-        if not all(
+        llm = get_chat_llm(llm_settings)
+        if llm is None:
+            return CStatus(-1, "LLM initialization failed")
+        if not self.wk_input or not all(
             [
                 self.wk_input.source_text,
                 self.wk_input.scenario,
                 self.wk_input.example_name,
             ]
         ):
             return CStatus(
                 -1,
                 "Missing required parameters: source_text, scenario, or example_name",
             )
hugegraph-llm/src/hugegraph_llm/utils/graph_index_utils.py (1)

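186-196: Exception handling should not print the stack trace to stdout; keep the user-facing message consistent

traceback.print_exc() is noisy in a web setting; recommend recording the detailed stack trace only in the log file and using gr.Warning for the UI.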

-    except Exception as e:  # pylint: disable=W0718
-        log.error(e)
-        traceback.print_exc()
+    except Exception as e:  # pylint: disable=W0718
+        log.exception("import_graph_data failed")
         # Note: can't use gr.Error here
-        gr.Warning(str(e) + " Please check the graph data format/type carefully.")
+        gr.Warning(str(e) + " Please check the graph data format/type carefully.")
         return data
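
The switch from log.error(e) plus traceback.print_exc() to log.exception(...) can be checked with the standard library alone. In this sketch the logger name, the in-memory stream, and the simulated failure are all assumptions made for illustration; the project's own log object and gr.Warning call are not used.

```python
import io
import logging

# Capture log output in memory so the example has no side effects on stdout.
stream = io.StringIO()
log = logging.getLogger("import_graph_data_demo")
log.setLevel(logging.ERROR)
log.propagate = False
log.addHandler(logging.StreamHandler(stream))


def import_graph_data(data: str) -> str:
    try:
        raise ValueError("vertex label missing")  # simulated failure
    except Exception:  # pylint: disable=broad-except
        # log.exception logs at ERROR level and appends the full traceback.
        log.exception("import_graph_data failed")
        return data


returned = import_graph_data("{}")
captured = stream.getvalue()
```

The traceback ends up in the logger's handlers instead of stdout, so a single call replaces both of the removed lines.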
hugegraph-llm/src/hugegraph_llm/nodes/llm_node/schema_build.py (1)

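33-46: Recommend adding robustness checks for wk_input and the LLM

For consistency with PromptGenerateNode, recommend validating llm and wk_input at the start of node_init and returning CStatus(-1, msg) when either is missing.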

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dae3e24 and e8de2d4.

📒 Files selected for processing (33)
  • .vibedev/spec/hugegraph-llm/fixed_flow/design.md (1 hunks)
  • .vibedev/spec/hugegraph-llm/fixed_flow/requirements.md (1 hunks)
  • .vibedev/spec/hugegraph-llm/fixed_flow/tasks.md (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/demo/rag_demo/vector_graph_block.py (2 hunks)
  • hugegraph-llm/src/hugegraph_llm/flows/build_schema.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/flows/build_vector_index.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/flows/get_graph_index_info.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/flows/graph_extract.py (3 hunks)
  • hugegraph-llm/src/hugegraph_llm/flows/import_graph_data.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/flows/prompt_generate.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/flows/scheduler.py (3 hunks)
  • hugegraph-llm/src/hugegraph_llm/flows/update_vid_embeddings.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/flows/utils.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/nodes/base_node.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/nodes/document_node/chunk_split.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/nodes/hugegraph_node/commit_to_hugegraph.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/nodes/hugegraph_node/fetch_graph_data.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/nodes/hugegraph_node/schema.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/nodes/index_node/build_semantic_index.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/nodes/index_node/build_vector_index.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/nodes/llm_node/extract_info.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/nodes/llm_node/prompt_generate.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/nodes/llm_node/schema_build.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/nodes/util.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/operators/common_op/check_schema.py (0 hunks)
  • hugegraph-llm/src/hugegraph_llm/operators/document_op/chunk_split.py (0 hunks)
  • hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/commit_to_hugegraph.py (10 hunks)
  • hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/schema_manager.py (0 hunks)
  • hugegraph-llm/src/hugegraph_llm/operators/index_op/build_vector_index.py (0 hunks)
  • hugegraph-llm/src/hugegraph_llm/operators/llm_op/info_extract.py (0 hunks)
  • hugegraph-llm/src/hugegraph_llm/operators/llm_op/property_graph_extract.py (1 hunks)
  • hugegraph-llm/src/hugegraph_llm/state/ai_state.py (5 hunks)
  • hugegraph-llm/src/hugegraph_llm/utils/graph_index_utils.py (4 hunks)
💤 Files with no reviewable changes (5)
  • hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/schema_manager.py
  • hugegraph-llm/src/hugegraph_llm/operators/index_op/build_vector_index.py
  • hugegraph-llm/src/hugegraph_llm/operators/common_op/check_schema.py
  • hugegraph-llm/src/hugegraph_llm/operators/llm_op/info_extract.py
  • hugegraph-llm/src/hugegraph_llm/operators/document_op/chunk_split.py
🧰 Additional context used
🧠 Learnings (8)
📚 Learning: 2025-09-16T06:40:44.968Z
Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Applies to hugegraph-llm/src/hugegraph_llm/operators/**/*.py : Put core processing pipelines under src/hugegraph_llm/operators/

Applied to files:

  • hugegraph-llm/src/hugegraph_llm/operators/llm_op/property_graph_extract.py
  • hugegraph-llm/src/hugegraph_llm/flows/build_vector_index.py
📚 Learning: 2025-09-16T06:40:44.968Z
Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Applies to hugegraph-llm/src/hugegraph_llm/operators/graph_rag_task.py : Maintain the Graph RAG pipeline in src/hugegraph_llm/operators/graph_rag_task.py

Applied to files:

  • hugegraph-llm/src/hugegraph_llm/operators/llm_op/property_graph_extract.py
  • hugegraph-llm/src/hugegraph_llm/demo/rag_demo/vector_graph_block.py
📚 Learning: 2025-06-25T09:50:06.213Z
Learnt from: day0n
PR: hugegraph/hugegraph-ai#16
File: hugegraph-llm/src/hugegraph_llm/config/models/base_prompt_config.py:124-137
Timestamp: 2025-06-25T09:50:06.213Z
Learning: Language-specific prompt attributes (answer_prompt_CN, answer_prompt_EN, extract_graph_prompt_CN, extract_graph_prompt_EN, gremlin_generate_prompt_CN, gremlin_generate_prompt_EN, keywords_extract_prompt_CN, keywords_extract_prompt_EN, doc_input_text_CN, doc_input_text_EN) are defined in the PromptConfig class in hugegraph-llm/src/hugegraph_llm/config/prompt_config.py, which inherits from BasePromptConfig, making these attributes accessible in the parent class methods.

Applied to files:

  • hugegraph-llm/src/hugegraph_llm/operators/llm_op/property_graph_extract.py
  • hugegraph-llm/src/hugegraph_llm/flows/prompt_generate.py
  • hugegraph-llm/src/hugegraph_llm/demo/rag_demo/vector_graph_block.py
  • hugegraph-llm/src/hugegraph_llm/nodes/llm_node/prompt_generate.py
📚 Learning: 2025-06-25T09:45:10.751Z
Learnt from: day0n
PR: hugegraph/hugegraph-ai#16
File: hugegraph-llm/src/hugegraph_llm/config/models/base_prompt_config.py:100-116
Timestamp: 2025-06-25T09:45:10.751Z
Learning: In hugegraph-llm BasePromptConfig class, llm_settings is a runtime property that is loaded from config through dependency injection during object initialization, not a static class attribute. Static analysis tools may flag this as missing but it's intentional design.

Applied to files:

  • hugegraph-llm/src/hugegraph_llm/operators/llm_op/property_graph_extract.py
  • hugegraph-llm/src/hugegraph_llm/demo/rag_demo/vector_graph_block.py
📚 Learning: 2025-09-16T06:40:44.968Z
Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Applies to hugegraph-llm/src/hugegraph_llm/resources/demo/config_prompt.yaml : Keep prompt configuration in src/hugegraph_llm/resources/demo/config_prompt.yaml

Applied to files:

  • hugegraph-llm/src/hugegraph_llm/operators/llm_op/property_graph_extract.py
  • hugegraph-llm/src/hugegraph_llm/demo/rag_demo/vector_graph_block.py
📚 Learning: 2025-09-16T06:40:44.968Z
Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Applies to hugegraph-llm/src/hugegraph_llm/models/**/*.py : Implement LLM, embedding, and reranker models under src/hugegraph_llm/models/

Applied to files:

  • hugegraph-llm/src/hugegraph_llm/demo/rag_demo/vector_graph_block.py
📚 Learning: 2025-09-16T06:40:44.968Z
Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Applies to hugegraph-llm/src/hugegraph_llm/indices/**/*.py : Store vector and graph indexing code under src/hugegraph_llm/indices/

Applied to files:

  • hugegraph-llm/src/hugegraph_llm/flows/build_vector_index.py
  • hugegraph-llm/src/hugegraph_llm/utils/graph_index_utils.py
📚 Learning: 2025-09-16T06:40:44.968Z
Learnt from: CR
PR: hugegraph/hugegraph-ai#0
File: hugegraph-llm/AGENTS.md:0-0
Timestamp: 2025-09-16T06:40:44.968Z
Learning: Applies to hugegraph-llm/src/hugegraph_llm/operators/kg_construction_task.py : Maintain the KG Construction pipeline in src/hugegraph_llm/operators/kg_construction_task.py

Applied to files:

  • hugegraph-llm/src/hugegraph_llm/flows/build_vector_index.py
🪛 markdownlint-cli2 (0.18.1)
.vibedev/spec/hugegraph-llm/fixed_flow/tasks.md

21-21: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (8)
.vibedev/spec/hugegraph-llm/fixed_flow/tasks.md (1)

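1-36: Clear document structure and sensible task breakdown

The task list lines up with the goals of this refactor, and the checked items are easy to track, which helps later iterations.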

.vibedev/spec/hugegraph-llm/fixed_flow/requirements.md (1)

1-25: Well-organized document

The requirements and acceptance criteria are clear, and the list of completed features is complete.

hugegraph-llm/src/hugegraph_llm/flows/build_vector_index.py (1)

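17-18: Approve the node-layer migration

The imports have moved from operators to nodes, in line with the Node/Operator decoupling, and the dependencies are clear. Confirmed that ChunkSplitNode and BuildVectorIndexNode are correctly defined and importable.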

hugegraph-llm/src/hugegraph_llm/nodes/hugegraph_node/commit_to_hugegraph.py (1)

27-33: No need to null-check context in node_init

BaseNode.init calls init_context, which guarantees that context is injected and non-empty, so no extra check is needed.

Likely an incorrect or invalid review comment.

hugegraph-llm/src/hugegraph_llm/flows/graph_extract.py (1)

39-58: Confirmed that ExtractNode correctly consumes extract_type

In node_init, ExtractNode reads self.wk_input.extract_type and assigns it to self.extract_type; operator_schedule branches on that value, so no parameter is missed.

hugegraph-llm/src/hugegraph_llm/demo/rag_demo/vector_graph_block.py (1)

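71-78: Confirm that the scheduling result type matches the UI output component

generate_prompt_for_ui returns the result of scheduler.schedule_flow, which must be a string to populate gr.Code(language="markdown"). If a dict/object is returned, the display will break.

Please confirm that PromptGenerateFlow.post_deal returns a string (see the implementation in flows/prompt_generate.py). If it does not, recommend converting the result to a string here.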

hugegraph-llm/src/hugegraph_llm/flows/prompt_generate.py (1)

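60-63: Confirm that the state key name matches the operator output

post_deal reads generated_extract_prompt. Please confirm that PromptGenerate.run writes its return value/context under the same key; otherwise the default failure message is returned.

For extra robustness, consider a fallback key or fallback logic (e.g. tolerant reads from different fields in context).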

hugegraph-llm/src/hugegraph_llm/nodes/base_node.py (1)

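16-16: Confirm that the PyCGraph import matches the dependency name

The repository uses from PyCGraph import … in many places, but pyproject.toml declares the dependency as pycgraph (lowercase). Please confirm that the installed package name matches the import name, including case, to avoid CI/deployment errors.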

@weijinglin weijinglin merged commit 41aeae5 into agenticrag/dev Sep 25, 2025
7 of 10 checks passed
@weijinglin weijinglin deleted the wjl-agent/dev branch September 25, 2025 07:09
weijinglin added a commit that referenced this pull request Oct 22, 2025
weijinglin added a commit that referenced this pull request Oct 23, 2025
github-actions bot pushed a commit that referenced this pull request Oct 23, 2025
@coderabbitai coderabbitai bot mentioned this pull request Nov 24, 2025