Add post-run learning workflow, YAML agent spec bridge, and tests #70
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 208f23d461
```python
from typing import Any

from automa_ai.blackboard.store import BlackboardStore
from automa_ai.config.agent_spec import YamlAgentSpec
```
Break import cycle in learning workflow module
Importing YamlAgentSpec at module load time creates a circular dependency: langgraph_chatagent -> learning.workflow -> config.agent_spec -> agent_factory -> langgraph_chatagent. Because GenericLangGraphChatAgent is imported before its class definition is executed, this can raise a partially-initialized-module ImportError and prevent agent modules from importing at all when dependencies are installed. Move this import into run_learning_workflow (or otherwise decouple agent_spec from agent_factory) so loading the chat agent does not recursively import itself.
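A minimal sketch of the suggested fix. To keep the snippet self-contained and runnable, a stdlib module stands in for the deferred dependency and the body is a placeholder; in the real module the deferred line would be `from automa_ai.config.agent_spec import YamlAgentSpec` inside `run_learning_workflow`:

```python
from typing import Any


def run_learning_workflow(payload: dict[str, Any]) -> str:
    # Deferred import: resolved on first call, long after every module in the
    # langgraph_chatagent -> learning.workflow -> config.agent_spec
    # -> agent_factory -> langgraph_chatagent chain has finished initializing,
    # so the partially-initialized-module ImportError cannot occur.
    # (json stands in for automa_ai.config.agent_spec in this sketch.)
    import json

    return json.dumps(payload)
```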
```python
) -> None:
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    out_path = Path(output_dir) / f"{session_id}-{stamp}.lesson.json"
```
Sanitize session_id before building lesson output path
The lesson filename is composed directly from session_id, so a session value containing path separators (for example ../...) can escape output_dir and write files outside the intended artifact directory. This is a path traversal risk whenever session_id comes from request/user input. Normalize or sanitize session_id to a safe basename (or reject unsafe characters) before constructing out_path.
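One possible sanitizer, as a sketch (the function name and the `[A-Za-z0-9._-]` allowlist are illustrative, not part of the PR):

```python
import re

_UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def safe_session_basename(session_id: str) -> str:
    # Replace path separators and any other character outside the allowlist,
    # then strip leading/trailing dots so the result is always a plain
    # filename that stays inside output_dir.
    cleaned = _UNSAFE.sub("_", session_id).strip(".")
    return cleaned or "session"
```

Hashing the id (e.g. with `hashlib.sha256`) is an alternative when readable filenames are not required.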
Pull request overview
Adds an MVP “post-run learning” pipeline (payload → reflection → lessons persisted to disk) plus a YAML-based agent spec bridge that maps onto the existing AgentFactory constructor surface, with unit tests covering payload normalization, YAML mapping, and background scheduling behavior.
Changes:
- Introduces `LearningWorkflowConfig` and wires `learning_config` through `AgentFactory` into `GenericLangGraphChatAgent`.
- Implements `automa_ai.learning.workflow` (payload building, workflow execution, lesson persistence) and exports it via `automa_ai.learning`.
- Adds `YamlAgentSpec` for versioned YAML parsing/mapping to `AgentFactory` kwargs, along with example YAML specs and tests.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `automa_ai/learning/workflow.py` | Implements payload normalization, reflection/lesson invocation, and persistence. |
| `automa_ai/learning/__init__.py` | Exposes the learning workflow API. |
| `automa_ai/config/learning.py` | Adds the learning workflow config model and defaults. |
| `automa_ai/config/agent_spec.py` | Adds YAML agent spec parsing and mapping to `AgentFactory`. |
| `automa_ai/agents/langgraph_chatagent.py` | Adds background learning scheduling after completion and returns the final output item for learning capture. |
| `automa_ai/agents/agent_factory.py` | Threads `learning_config` through to LangGraph chat agents. |
| `automa_ai/config/__init__.py` | Exports `LearningWorkflowConfig`. |
| `tests/learning/test_*` | Adds unit tests for YAML spec mapping, scheduling behavior, and payload normalization. |
| `examples/configs/specs/learning/*.yaml` | Adds example reflection/lesson agent YAML specs. |
| `automa_ai/agents/AGENT_FACTORY_CONFIG_SURFACE.md` | Documents the `AgentFactory` config surface and YAML wrapper rationale. |
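The background scheduling mentioned for `automa_ai/agents/langgraph_chatagent.py` could look roughly like this. Names such as `trigger_learning_workflow` and `_on_learning_task_done` mirror the PR description; the bodies are an illustrative sketch, not the actual implementation:

```python
import asyncio
import logging

logger = logging.getLogger("automa_ai.learning")


def _on_learning_task_done(task: asyncio.Task) -> None:
    # Log background failures instead of letting them disappear silently.
    if not task.cancelled() and task.exception() is not None:
        logger.error("learning workflow failed", exc_info=task.exception())


def trigger_learning_workflow(coro) -> asyncio.Task:
    # Fire-and-forget scheduling: the caller should keep the returned
    # reference so the event loop cannot garbage-collect the task mid-flight.
    task = asyncio.create_task(coro)
    task.add_done_callback(_on_learning_task_done)
    return task
```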
```python
from pathlib import Path
from typing import Any, Literal

import yaml
```
yaml is imported unconditionally, but pyyaml appears to be an optional extra (not a core dependency). This makes importing automa_ai.config.agent_spec (and anything that imports it, like the learning workflow) fail in default installs. Consider either adding pyyaml to core dependencies or moving the import inside from_yaml_* with a clear ImportError instructing how to install the extra.
Suggested change:

```python
try:
    import yaml
except ImportError:  # pragma: no cover
    class _YAMLImportProxy:
        def __getattr__(self, _name: str) -> Any:
            raise ImportError(
                "PyYAML is required to load YAML agent specifications. "
                "Install it with `pip install automa-ai[yaml]` or `pip install pyyaml`."
            )

    yaml = _YAMLImportProxy()  # type: ignore[assignment]
```
```python
    return {
        "kind": "langgraph_messages",
        "message_count": len(messages),
        "last_message": last_content,
    }
```
When final_response is a dict containing messages, _normalize_final_response replaces the whole dict with a compact summary and drops other fields (e.g., status, errors, validator_failures). Since build_review_payload later tries to extract those fields from the normalized output, this can silently lose failure metadata. Consider preserving/merging whitelisted keys from the original dict into the normalized result.
Suggested change:

```python
    summary: dict[str, Any] = {
        "kind": "langgraph_messages",
        "message_count": len(messages),
        "last_message": last_content,
    }
    # Preserve selected metadata fields from the original response so that
    # downstream consumers (e.g., build_review_payload) can still access
    # failure/status information after normalization.
    for meta_key in ("status", "errors", "validator_failures"):
        if meta_key in final_response:
            summary[meta_key] = final_response[meta_key]
    return summary
```
```python
def _extract_response_text(response: Any) -> Any:
    if isinstance(response, str):
        return response
    if isinstance(response, dict):
        messages = response.get("messages")
        if isinstance(messages, list) and messages:
            last = messages[-1]
            content = getattr(last, "content", None)
            if content is not None:
                return content
            if isinstance(last, dict):
                return last.get("content", str(last))
        return response
    return str(response)
```
_extract_response_text may return the original response dict (when messages is missing/empty). That dict can include non-JSON-serializable objects, and it later gets embedded into _json_prompt/persisted to disk. To keep the workflow best-effort, consider always returning a string (or JSON-safe structure) here, with a safe fallback serialization (e.g., json.dumps(..., default=str) or str(...)).
```python
Path(output_dir).mkdir(parents=True, exist_ok=True)
stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
out_path = Path(output_dir) / f"{session_id}-{stamp}.lesson.json"
```
session_id is interpolated directly into the output filename. If session_id can contain path separators (e.g., user-supplied IDs), this can lead to path traversal and writing outside output_dir. Consider sanitizing session_id to a safe filename (e.g., allowlist [A-Za-z0-9._-] and replace others) or hashing it for the persisted artifact name.
```python
    "reflection_output": reflection_output,
    "lesson_output": lesson_output,
}
out_path.write_text(json.dumps(bundle, ensure_ascii=False, indent=2), encoding="utf-8")
```
json.dumps(bundle, ...) can raise TypeError if payload, reflection_output, or lesson_output contain non-JSON-serializable values (e.g., message objects). Since this runs in a background task, it will cause lesson persistence to fail entirely. Consider using a safe JSON dump (e.g., default=str or a pre-normalization step) to guarantee persistence.
Suggested change:

```python
out_path.write_text(
    json.dumps(bundle, ensure_ascii=False, indent=2, default=str),
    encoding="utf-8",
)
```
```python
reflection_agent_spec_path: str = "examples/configs/specs/learning/reflection_agent.yaml"
lesson_agent_spec_path: str = "examples/configs/specs/learning/lesson_agent.yaml"
output_dir: str = "artifacts/learning_lessons"
```
The default spec paths point into examples/…, which may not exist in packaged/distributed installs or when the working directory differs. Consider making these required (no defaults), resolving them relative to the installed package via resources, or documenting that callers must override them when enabling learning.
Suggested change:

```python
reflection_agent_spec_path: str | None = None
lesson_agent_spec_path: str | None = None
output_dir: str | None = None
```
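If the specs were instead shipped as package data, the defaults could be resolved against the installed package rather than the working directory. A sketch of that option (the `automa_ai` resource layout in the comment is an assumption):

```python
from importlib import resources


def packaged_spec_path(package: str, resource: str) -> str:
    # Resolve a file bundled inside an installed package, independent of the
    # process working directory. For automa_ai this might be called as
    # packaged_spec_path("automa_ai", "specs/learning/reflection_agent.yaml"),
    # assuming the YAML files are declared as package data.
    return str(resources.files(package).joinpath(resource))
```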
Motivation

- `AgentFactory` call surface.

Description

- Add a `LearningWorkflowConfig` model in `automa_ai.config.learning`, export it from `automa_ai.config.__init__`, and wire `learning_config` through the `AgentFactory` constructor.
- Extend `GenericLangGraphChatAgent` to accept `learning_config`, change `_emit_final_output` to return the final item, and add `_trigger_learning_workflow`, which schedules `run_learning_workflow` via `asyncio.create_task` and logs background failures in `_on_learning_task_done`.
- Implement `automa_ai.learning.workflow` with `build_review_payload`, `run_learning_workflow`, response normalization, and lesson persistence, and expose them via `automa_ai.learning.__init__`.
- Add `YamlAgentSpec` in `automa_ai.config.agent_spec` that parses versioned YAML, maps fields to existing `AgentFactory` kwargs, and constructs agents without changing Python paths.
- Add `AGENT_FACTORY_CONFIG_SURFACE.md`, example agent specs for `reflection_agent` and `lesson_agent`, and update tool/blackboard/skill wiring to pass `learning_config` through.

Testing

- Add `tests/learning/`: `test_agent_spec.py`, `test_learning_mode.py`, and `test_learning_payload.py` using `pytest`, which validate YAML parsing/mapping, learning scheduling behavior, background failure handling, and payload normalization.