Chimera

Overview

The chimera repository is a powerful, modular platform designed to automate the end-to-end creation of expert-level LinkedIn content campaigns. It leverages advanced AI techniques, including Large Language Models (LLMs), agent-oriented workflows, and integration with external APIs and knowledge bases, to orchestrate the planning, research, writing, and scheduling of LinkedIn posts. The system is highly configurable, supporting both generalist and product-focused content strategies, and is suitable for marketing professionals, content creators, agencies, and organizations aiming to scale and standardize their LinkedIn content output.

Key capabilities include:

Automated creation of editorial strategies tailored to a user’s topic and audience.
Generation of research-backed LinkedIn posts, each complete with references and best-practice formatting.
Production of AI-generated visual prompts for tools like DALL-E to create engaging post imagery.
Construction of content calendars, mapping posts to optimal publishing dates and performance metrics.
Extensible integration with web search, image generation, and vector-based knowledge retrieval.

Role of internal modules:

Orchestration Modules: LinkedInCrew and LinkedInWorkflow serve as the central orchestrators, managing workflow steps, agent/tool assignment, and event-driven data handoff.
Agent/Tool Adapters: Configured via YAML and environment variables, these enable plug-and-play integration with web search, knowledge bases, and image generation services.
Utility and Configuration Modules: Provide robust file I/O, configuration validation, environment management, and fallback logic for missing dependencies.

Technology Stack

Language: Python 3.x
Frameworks:
- llama_index: For agent orchestration, memory management, workflow/pipeline definition.
Libraries:
- requests, BeautifulSoup: Web scraping and HTML parsing.
- PyYAML (yaml): Configuration file parsing.
- python-dotenv (dotenv): Environment variable management.
- Qdrant: Vector search database for knowledge management.
- OpenAI/DALL-E: AI-powered image generation.
- duckduckgo_search/ddgs: Web search API integration.
Tools:
- Custom Utilities: utils.utils, utils.storage_config, utils.memory_config for LLM config, memory abstraction, and output management.
- External Service Adapters: tools.duckduckgo_tool, tools.dalle_tool, tools.qdrant_tool for external research and content generation.

Directory Structure

The project is structured for clarity and modularity, facilitating easy extension and maintenance.

chimera/
├── src/
│   └── chimera/
│       ├── main.py                   # Application entry point and CLI interface
│       ├── crew_llamaindex.py        # Core workflow orchestration and pipeline logic
│       ├── tools/
│       │   ├── duckduckgo_tool.py    # DuckDuckGo web search integration
│       │   ├── dalle_tool.py         # DALL-E image generation and download adapters
│       │   └── qdrant_tool.py        # Qdrant knowledge base integration
│       ├── utils/
│       │   ├── utils.py              # Utility functions (LLM config, file I/O, etc.)
│       │   ├── storage_config.py     # Memory/knowledge base configuration helpers
│       │   └── memory_config.py      # Centralized Qdrant/embedding config and CRUD
│       └── ...
├── config/
│   ├── agents.yaml                   # Agent configuration (roles, goals, tools)
│   ├── tasks.yaml                    # Task configuration (descriptions, expected output)
├── templates/
│   └── linkedin_post_preview.html    # HTML template for post preview rendering
├── output/                           # Generated output (strategy, posts, images, calendars)
│   └── run_*/                        # Timestamped output directories per pipeline run
└── README.md                         # Project documentation (this file)

Getting Started

Prerequisites

Python 3.8 or newer is required.
Install all dependencies (preferably in a virtual environment):
- llama_index
- requests
- beautifulsoup4
- pyyaml
- python-dotenv
- qdrant-client
- duckduckgo-search or ddgs
- openai
Install with pip:
```
pip install llama_index requests beautifulsoup4 pyyaml python-dotenv qdrant-client duckduckgo-search openai
```
External Service Access:
- To enable all features, you need valid API keys for OpenAI (DALL-E), and (optionally) Qdrant and DuckDuckGo (if running in certain environments).
Configuration Files:
- Ensure config/agents.yaml and config/tasks.yaml exist with appropriate agent/task definitions.
- Prepare a .env file (see below for required variables).

Cloning the Repository

Clone the repository from GitHub:

git clone https://github.com/ArchAI-Labs/chimera.git
cd chimera

Environment Setup

Create a .env file in the root directory with entries such as:
```
PROVIDER=ollama
MODEL=llama3
OPENAI_API_KEY=your-openai-key
QDRANT_COLLECTION=linkedin_knowledge
PRODUCT_SITES=https://example.com/product1,https://example.com/product2
```
- PROVIDER: LLM provider (ollama, openai, or anthropic).
- MODEL: Model name for the LLM.
- OPENAI_API_KEY: Required for DALL-E image generation.
- PRODUCT_SITES: (Optional) Comma-separated URLs for product knowledge ingestion.
Install dependencies as described above.

Running the Pipeline

Launch the CLI interface:
```
cd src/chimera
python main.py
```
The system will guide you through input collection (expert type, topic, number of posts, posting frequency), validate your environment, and execute the content generation workflow.
Outputs will be saved in a timestamped output/run_*/ directory with all generated files (strategy, posts, images, calendar).

Module Usage

`crew_llamaindex.py` (Pipeline Orchestration)

Purpose: Drives the entire content creation process, manages agents, tools, and workflow execution.
Integration: Import and instantiate LinkedInCrew with input parameters, then invoke run() or kickoff() to start the pipeline.
Configuration: Relies on both environment variables and YAML config files for flexibility.

`main.py` (CLI/Entrypoint)

Purpose: Provides the user interface for collecting input, validating configuration, and triggering the pipeline.
Usage: Run as a script; it manages environment setup, input collection, and invokes the orchestrator.

`tools/` (External API Adapters)

Purpose: Encapsulate integration with external services like DuckDuckGo, DALL-E, and Qdrant.
Usage: Used internally by agents via the FunctionTool adapter pattern; can be extended for new tools.

`utils/` (Utility and Config Management)

Purpose: Utility functions for environment validation, file I/O, LLM configuration, and memory/knowledge management.
Usage: Imported as dependencies across orchestration and tool modules.

Example: External integration

To use the workflow as a library, import LinkedInCrew and run the pipeline with your parameters:

Ensure .env and YAML configs are set up.
Prepare your input dictionary (e.g., {"topic": "AI in Healthcare", "num_posts": 5, ...}).
Instantiate LinkedInCrew(inputs=your_inputs) and call run().

Functional Analysis

1. Main Responsibilities of the System

Automated Content Campaign Generation:
Orchestrates the full lifecycle of a LinkedIn campaign—from editorial planning, through post creation and visual prompt generation, to scheduling and artifact output.
Agent-Driven Workflow:
Delegates subtasks to specialized agents (e.g., research, writing, design, planning), each configured via YAML and empowered by a suite of tools.
Research Integration:
Enables both web-based and knowledge base-backed research, ensuring content is authoritative and relevant.
Output Management:
Produces ready-to-use Markdown, HTML, and text artifacts for user review and publishing.

2. Problems the System Solves

Reduces manual effort and expertise required to plan, research, and produce LinkedIn campaigns.
Ensures content is up-to-date by integrating web and internal research sources.
Bridges the gap between textual and visual content creation via AI-powered image prompts.
Standardizes campaign output for consistency, quality, and repeatability.

3. Interaction of Modules and Components

User Input → Orchestrator (LinkedInCrew):
All workflow parameters are collected at the start and injected as context.
Config Loading:
YAML configs are loaded for agents and tasks, providing instructions and context for each workflow step.
Tool Initialization:
Tools (web search, file writing, image generation, knowledge base) are wrapped as FunctionTool objects and injected into agent instances.
Agent Execution:
Each workflow step invokes the appropriate agent, passing prompts and context, and handling results as typed events.
Event-Driven Workflow:
Steps are decorated with @step and pass typed event objects between them, ensuring strong interface contracts.
Output Artifacts:
Final outputs are saved to disk, and HTML previews are generated for easy user validation.

4. User-Facing vs. System-Facing Functionalities

User-Facing:
- CLI for input and confirmation.
- Output files (Markdown, HTML previews, images, calendar) for publishing and review.
- Clear logging and user feedback during each step.
System-Facing:
- Tool and agent abstractions for modularity and extensibility.
- Config and environment management for portability.
- Error handling, fallback logic, and health checks for robustness.

Interface/Abstract Classes

Event Base Class:
All workflow step outputs inherit from a common Event, ensuring consistent data handoff and enabling type safety across the workflow.
FunctionTool Wrapper:
Every tool is standardized using the FunctionTool interface, enforcing a consistent callable signature for agent consumption.

Architectural Patterns and Design Principles Applied

Facade Pattern:
LinkedInCrew serves as the central orchestrator, simplifying the management of agents, tools, and workflow for external callers.
Factory Pattern:
LLMs, agents, and tools are instantiated based on configuration and environment, enabling provider-agnostic operation.
Adapter Pattern:
Tools are wrapped as FunctionTool objects, allowing interchangeable use regardless of backend.
Event Sourcing/DTO Pattern:
Workflow step outputs are strongly-typed event classes, ensuring clarity and traceability of data flow.
Null Object Pattern:
Fallback tool implementations ensure the system degrades gracefully if dependencies are missing.
Config-Driven Architecture:
Behavior is determined by YAML files and environment variables, not hardcoded logic.
Pipeline/Workflow Pattern:
Each workflow step is atomic, composable, and event-driven, supporting robust orchestration and easy extension.
Separation of Concerns:
Clear boundaries between configuration, workflow orchestration, agent/tool logic, and utility functions.
Dependency Injection:
Tools, agents, and configs are passed as dependencies, supporting testability and modularity.
Defensive Programming:
Extensive error handling, output validation, and fallback mechanisms are in place.

Code Quality Analysis

SonarQube and Analyst-Derived Metrics:

Bugs:
- No explicit critical bugs identified in the current review; however, aggressive suppression of warnings may hide important issues.
Vulnerabilities:
- No direct vulnerabilities detected, but warning suppression and lack of input validation (e.g., for URLs, file paths) could expose latent security risks.
Code Smells:
- Warning suppression at the global level is a significant code smell, as it may mask deprecations, performance issues, or runtime failures.
- Manual parsing of LLM output using regular expressions is fragile and may break with changes in LLM output format.
- Scattered fallback logic for tool implementations increases the risk of maintenance errors and interface divergence.
Code Coverage:
- No explicit tests are visible in the provided files; key areas such as config loading, event parsing, and fallback logic likely lack automated test coverage.
Duplication:
- Some redundancy in fallback logic for tool initialization and error handling, but overall code duplication appears moderate.

Implications:

Maintainability:
While modular and clear, maintainability could be hampered by warning suppression, fragile parsing, and distributed fallback logic.
Reliability:
Fallbacks and defensive programming increase robustness, but risks remain due to error silencing.
Security:
Absence of strict input validation and warning suppression may allow security issues to go undetected.
Scalability:
The system is architecturally scalable due to modularity, but testing and validation should be strengthened to support large-scale deployments.

Weaknesses and Areas for Improvement

The following concrete TODOs are derived from code analysis and SonarQube/code quality findings:

Further Areas of Investigation

Performance and Scalability:
Investigate the performance of web scraping and large-scale knowledge ingestion, especially for large or slow product sites. Benchmark Qdrant ingestion and retrieval at scale.
Security Hardening:
Review all input surfaces for injection risks, add validation/sanitization, and audit file I/O for path traversal vulnerabilities.
Test Coverage Analysis:
Identify all untested modules and prioritize test development for the most complex or fragile code paths.
Monitoring and Health Checks:
Develop infrastructure for monitoring agent/tool health, workflow execution status, and error rates for production deployments.
Extensibility for New Channels:
Explore integration with other social media platforms, direct publishing APIs, or analytics feedback loops.
User Management and Access Control:
Consider adding multi-user support, authentication, and role-based access control for collaborative environments.
Advanced Analytics:
Investigate adding campaign performance tracking, feedback collection, and self-improving content optimization loops.

Attribution

Generated with the support of ArchAI, an automated documentation system.

Name		Name	Last commit message	Last commit date
Latest commit History 102 Commits
img		img
src/chimera		src/chimera
.env.example		.env.example
.gitignore		.gitignore
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
docker-compose.yml		docker-compose.yml
pyproject.toml		pyproject.toml
requirements-test.txt		requirements-test.txt
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Chimera

Overview

Technology Stack

Directory Structure

Getting Started

Prerequisites

Cloning the Repository

Environment Setup

Running the Pipeline

Module Usage

`crew_llamaindex.py` (Pipeline Orchestration)

`main.py` (CLI/Entrypoint)

`tools/` (External API Adapters)

`utils/` (Utility and Config Management)

Functional Analysis

1. Main Responsibilities of the System

2. Problems the System Solves

3. Interaction of Modules and Components

4. User-Facing vs. System-Facing Functionalities

Interface/Abstract Classes

Architectural Patterns and Design Principles Applied

Code Quality Analysis

Weaknesses and Areas for Improvement

Further Areas of Investigation

Attribution

About

Uh oh!

Releases 1

Packages

Contributors 3

Uh oh!

Languages

License

ArchAI-Labs/chimera

Folders and files

Latest commit

History

Repository files navigation

Chimera

Overview

Technology Stack

Directory Structure

Getting Started

Prerequisites

Cloning the Repository

Environment Setup

Running the Pipeline

Module Usage

crew_llamaindex.py (Pipeline Orchestration)

main.py (CLI/Entrypoint)

tools/ (External API Adapters)

utils/ (Utility and Config Management)

Functional Analysis

1. Main Responsibilities of the System

2. Problems the System Solves

3. Interaction of Modules and Components

4. User-Facing vs. System-Facing Functionalities

Interface/Abstract Classes

Architectural Patterns and Design Principles Applied

Code Quality Analysis

Weaknesses and Areas for Improvement

Further Areas of Investigation

Attribution

About

Resources

License

Code of conduct

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Contributors 3

Uh oh!

Languages

`crew_llamaindex.py` (Pipeline Orchestration)

`main.py` (CLI/Entrypoint)

`tools/` (External API Adapters)

`utils/` (Utility and Config Management)

Packages