LLM Multi-Agent System

Python 3.12 | License: MIT | Code style: black

A production-ready multi-agent orchestration system powered by local LLMs for automated software development workflows.

⚠️ Important: Python 3.12 Required

This project requires Python 3.12.x

  • Install: brew install python@3.12 (macOS)
  • Create venv: python3.12 -m venv venv
  • See: START_HERE.md for complete setup

Python 3.14 and other versions are NOT supported due to LangChain/LangGraph compatibility.

🎯 Overview

This system orchestrates specialized AI agents that collaborate to handle the complete software development lifecycle, from requirements analysis to deployment and documentation. Built with privacy-first principles, it runs 100% locally using llama.cpp, ensuring no data leaves your machine.

Key Features

  • 🤖 5 Specialized AI Agents - Business Analyst, Developer, QA Engineer, DevOps Engineer, Technical Writer
  • 💬 Interactive Chat Display - Watch agents communicate in real-time with a color-coded chat interface
  • 🔄 LangGraph Orchestration - Advanced workflow engine with parallel execution and state persistence
  • ⚡ Parallel Agent Execution - QA and DevOps run simultaneously (30-40% faster)
  • 💾 State Persistence - Resume interrupted workflows from checkpoints
  • 🏠 100% Local Execution - No cloud APIs, complete data privacy, zero costs
  • 📋 Flexible Workflow Engine - Build custom workflows or use predefined templates
  • 🔧 Production-Ready - Comprehensive error handling, logging, and monitoring
  • 📊 Real-time Status Tracking - Monitor agent progress and task completion with visual progress bars
  • 🧪 Fully Tested - Comprehensive test suite included

🚀 Production Enhancements (New!)

  • ⚡ Streaming Responses - Real-time token streaming for immediate feedback (enabled by default)
  • 🔄 Retry Logic - Exponential backoff with jitter for transient failure recovery (see the sketch after this list)
  • 🛡️ Circuit Breaker - Prevents cascade failures with automatic recovery detection
  • 🔌 Connection Pooling - Efficient connection reuse with health monitoring
  • 📝 Structured Logging - JSON-formatted logs with correlation IDs for traceability
  • 📊 Metrics Collection - Built-in performance monitoring and statistics
  • ✅ Config Validation - Comprehensive validation with clear error messages
  • 🎯 Enhanced System Prompts - Professional, production-ready prompts for all agents

See: Production-Ready Guide | Migration Guide
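
For illustration, here is a minimal sketch of the retry pattern these settings drive: exponential backoff capped at a maximum delay, with random jitter to avoid synchronized retries. This is a generic sketch, not the project's actual implementation:

import random
import time

def retry_with_backoff(call, max_retries=3, initial_delay=1.0, max_delay=60.0):
    """Retry `call` on failure, doubling the delay each attempt and adding jitter."""
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted; surface the original error
            # Sleep for the current delay plus up to 25% random jitter, capped.
            time.sleep(min(delay * (1 + random.random() * 0.25), max_delay))
            delay = min(delay * 2, max_delay)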

πŸ—οΈ Architecture

┌──────────────────────────────────────────────────────────────┐
│                 Multi-Agent Orchestrator                     │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │   Workflow   │  │    Agent     │  │     Task     │       │
│  │    Engine    │←→│ Orchestrator │←→│   Manager    │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
└─────────────┬────────────────────────────────────────────────┘
              │
    ┌─────────┴──────────┐
    │                    │
┌───▼────┐  ┌──────▼──────┐  ┌────▼────┐  ┌────▼─────┐
│Business│  │  Developer  │  │   QA    │  │  DevOps  │
│Analyst │  │    Agent    │  │ Engineer│  │ Engineer │
└────────┘  └─────────────┘  └─────────┘  └──────────┘
                    │
         ┌──────────▼───────────┐
         │  Local llama-server  │
         │   (llama.cpp)        │
         └──────────────────────┘
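
Every agent reaches the model through llama-server's OpenAI-compatible endpoint. As a minimal sketch (assuming the official openai Python package and the devstral model name from the sample .env), a direct request against the server looks like this:

from openai import OpenAI

# Point the standard OpenAI client at the local llama-server endpoint.
# The key is a placeholder; llama-server does not validate it.
client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="devstral",
    messages=[{"role": "user", "content": "Say hello from the local model."}],
)
print(response.choices[0].message.content)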

🚀 Quick Start

Prerequisites

  • Python 3.12 (required - install with brew install python@3.12)
  • llama.cpp installed (with llama-server)
  • 16GB+ RAM recommended
  • macOS, Linux, or Windows

Installation

# Clone the repository
git clone https://github.com/akarazhev/llm-multi-agent-system.git
cd llm-multi-agent-system

# Run automated setup
python setup.py

This will:

  1. Check Python version
  2. Verify llama.cpp installation
  3. Create virtual environment
  4. Install dependencies
  5. Set up configuration files

Manual Setup (Alternative)

# Create virtual environment with Python 3.12
python3.12 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies (inside venv, use python/pip)
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with your settings

# Start local LLM server (ensure llama-server is running on port 8080)

# Run the system (inside venv, use python)
python main.py

Configuration

Edit config.yaml:

workspace: "."  # Your workspace path
log_level: "INFO"

# LLM Configuration (Production-Ready)
llm_timeout: 300
llm_max_retries: 3
llm_retry_initial_delay: 1.0
llm_retry_max_delay: 60.0
llm_circuit_breaker_threshold: 5
llm_circuit_breaker_timeout: 60.0
llm_stream_responses: true  # Enable streaming

# Orchestration
max_concurrent_agents: 5

# Monitoring & Logging
enable_structured_logging: true
enable_metrics: true

# Agent configurations
agents:
  developer:
    languages: [python, javascript, typescript]
  qa_engineer:
    test_frameworks: [pytest, jest, playwright]
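
The circuit-breaker settings follow the standard pattern: after llm_circuit_breaker_threshold consecutive failures the breaker opens and rejects calls immediately, then permits a trial call once llm_circuit_breaker_timeout seconds have elapsed. A minimal sketch of that logic (illustrative only, not the project's implementation):

import time

class CircuitBreaker:
    """Open after `threshold` consecutive failures; allow a retry after `timeout` seconds."""

    def __init__(self, threshold=5, timeout=60.0):
        self.threshold = threshold
        self.timeout = timeout
        self.failures = 0
        self.opened_at = None

    def call(self, fn):
        # While open, reject immediately until the recovery timeout elapses.
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.timeout:
                raise RuntimeError("circuit open: rejecting call")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # a success closes the circuit
        return result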

Edit .env:

# LLM Server Configuration
OPENAI_API_BASE=http://127.0.0.1:8080/v1
OPENAI_API_KEY=not-needed
OPENAI_API_MODEL=devstral
OPENAI_TEMPERATURE=0.7
OPENAI_MAX_TOKENS=2048

# Retry & Resilience
LLM_MAX_RETRIES=3
LLM_RETRY_INITIAL_DELAY=1.0
LLM_RETRY_MAX_DELAY=60.0
LLM_CIRCUIT_BREAKER_THRESHOLD=5
LLM_CIRCUIT_BREAKER_TIMEOUT=60.0

# Logging
LOG_LEVEL=INFO
STRUCTURED_LOGGING=true

# See .env.example for full configuration options
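
As a rough sketch of how the two files come together at startup (assuming python-dotenv and PyYAML, which a setup like this typically uses; the project's actual settings.py may differ):

import os

import yaml  # PyYAML
from dotenv import load_dotenv  # python-dotenv

load_dotenv()  # loads OPENAI_API_BASE and friends from .env into the environment

with open("config.yaml") as f:
    config = yaml.safe_load(f)

api_base = os.getenv("OPENAI_API_BASE", "http://127.0.0.1:8080/v1")
timeout = config.get("llm_timeout", 300)
print(f"LLM endpoint: {api_base}, timeout: {timeout}s")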

💡 Usage

Interactive Mode

python main.py

You'll be prompted to:

  1. Enter your requirement
  2. Select a workflow type
  3. Monitor execution
  4. Review results in output/ directory

Interactive Chat Display (New! ✨)

Watch agents communicate in real-time with our new interactive chat interface:

# Run the interactive example
python examples/interactive_chat_workflow.py

Features:

  • 💬 Color-coded agent messages and thoughts
  • 🔄 Visual handoffs between agents
  • 📊 Real-time progress bars
  • ✅ Task completion summaries
  • 📄 File operation tracking
  • 📝 Automatic chat log export

Example Output:

🤔 Business Analyst:
  Analyzing requirements for task management API...
  Identifying user stories and acceptance criteria.

✅ Business Analyst completed task
  Created 8 user stories with 24 acceptance criteria
  📄 Files created: 2
    • requirements.md
    • user_stories.md

🔄 Business Analyst → Developer
  Requirements complete. Passing user stories for design.

Progress: ████████████████░░░░░░░░░░░░░░░░░░░░░░░░ 40%

See Interactive Chat Guide for full details.

LangGraph Orchestration (Recommended)

New! Use LangGraph for advanced features like parallel execution and state persistence:

import asyncio
from src.orchestrator.langgraph_orchestrator import LangGraphOrchestrator

async def main():
    # Initialize orchestrator with interactive chat display
    orchestrator = LangGraphOrchestrator(
        workspace=".",
        enable_chat_display=True  # Watch agents communicate!
    )
    
    # Execute with parallel QA + DevOps (30-40% faster)
    result = await orchestrator.execute_feature_development(
        requirement="Create REST API for user authentication with JWT",
        context={
            "language": "python",
            "framework": "fastapi"
        }
    )
    
    print(f"Workflow completed: {result['status']}")
    print(f"Files created: {len(result['files_created'])}")

asyncio.run(main())

Run it:

# Inside virtual environment (use python)
python examples/langgraph_feature_development.py

# Outside virtual environment (use python3)
python3 examples/langgraph_feature_development.py

Key Benefits:

  • ⚡ 30-40% faster with parallel execution
  • 💬 Interactive chat display (enabled by default)
  • 💾 Resume interrupted workflows
  • 🔀 Smart conditional routing
  • 📊 Workflow visualization

See LangGraph Integration Guide for details.

Inspecting the Final State

The orchestrator returns the result as an event dict keyed by graph node; extract the actual state before reading its fields:

import asyncio
from src.orchestrator import LangGraphOrchestrator

async def main():
    orchestrator = LangGraphOrchestrator(workspace=".")
    
    final_state = await orchestrator.execute_feature_development(
        requirement="Create REST API for user authentication with JWT",
        context={
            "language": "python",
            "framework": "fastapi"
        }
    )
    
    # Extract the actual state from the event dict
    actual_state = list(final_state.values())[0] if final_state else {}
    print(f"Workflow completed: {len(actual_state.get('completed_steps', []))} steps")
    print(f"Status: {actual_state.get('status', 'N/A')}")

asyncio.run(main())

Examples

Run example workflows:

# Inside virtual environment (recommended)
source venv/bin/activate
python examples/langgraph_feature_development.py  # Parallel execution demo
python examples/langgraph_bug_fix.py              # Bug fix workflow
python examples/langgraph_resume_workflow.py      # Resume interrupted workflow
python examples/visualize_workflow.py             # Generate workflow diagrams

# Outside virtual environment (use python3)
python3 examples/langgraph_feature_development.py
python3 examples/simple_workflow.py
python3 examples/custom_workflow.py
python3 examples/ecommerce_catalog.py
python3 examples/agent_status_monitor.py
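
The resume example relies on LangGraph's checkpointing: state is persisted per thread_id after every step, so re-invoking with the same thread resumes where the run stopped. A stripped-down sketch of the mechanism (generic LangGraph usage, not this repo's actual graph):

from typing import TypedDict

from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    completed_steps: list

def analysis(state: State) -> dict:
    return {"completed_steps": state["completed_steps"] + ["analysis"]}

builder = StateGraph(State)
builder.add_node("analysis", analysis)
builder.add_edge(START, "analysis")
builder.add_edge("analysis", END)

# The checkpointer records state after each step, keyed by thread_id.
graph = builder.compile(checkpointer=MemorySaver())
config = {"configurable": {"thread_id": "demo-workflow"}}
graph.invoke({"completed_steps": []}, config)
print(graph.get_state(config).values)  # {'completed_steps': ['analysis']}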

📋 Workflow Types

1. Feature Development

Complete feature implementation from requirements to deployment.

Steps:

  1. Business Analyst - Requirements analysis
  2. Developer - Architecture design
  3. Developer - Implementation
  4. QA Engineer - Testing
  5. DevOps Engineer - Deployment setup
  6. Technical Writer - Documentation

2. Bug Fix

Focused bug resolution with testing and documentation.

Steps:

  1. QA Engineer - Bug analysis and reproduction
  2. Developer - Fix implementation
  3. QA Engineer - Regression testing
  4. Technical Writer - Release notes

3. Infrastructure

Infrastructure design, implementation, and documentation.

Steps:

  1. DevOps Engineer - Infrastructure design
  2. DevOps Engineer - IaC implementation
  3. QA Engineer - Infrastructure testing
  4. Technical Writer - Operations documentation

4. Documentation

Comprehensive documentation creation and review.

Steps:

  1. Business Analyst - Documentation requirements
  2. Technical Writer - Documentation creation
  3. Developer - Technical review

5. Analysis

Feasibility studies and technical analysis.

Steps:

  1. Business Analyst - Requirements gathering
  2. Developer - Technical feasibility
  3. DevOps Engineer - Infrastructure assessment
  4. Business Analyst - Final analysis report
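
Conceptually, each workflow above is an ordered list of agent/task steps. As a purely hypothetical sketch of how the Feature Development template could be expressed as data (the names are illustrative, not the workflow engine's actual schema):

from dataclasses import dataclass

@dataclass
class Step:
    agent: str  # which agent runs this step
    task: str   # what the agent is asked to produce

# Hypothetical encoding of the Feature Development workflow above.
FEATURE_DEVELOPMENT = [
    Step("business_analyst", "Requirements analysis"),
    Step("developer", "Architecture design"),
    Step("developer", "Implementation"),
    Step("qa_engineer", "Testing"),
    Step("devops_engineer", "Deployment setup"),
    Step("technical_writer", "Documentation"),
]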

🤖 Agent Capabilities

Business Analyst

  • Requirements analysis and documentation
  • User story creation
  • Acceptance criteria definition
  • Feasibility assessment

Developer

  • Code implementation (Python, JavaScript, TypeScript)
  • Architecture design
  • Code review
  • Technical documentation
  • Supports: FastAPI, React, Django, Node.js

QA Engineer

  • Test suite creation
  • Test execution
  • Bug reporting
  • Quality metrics
  • Supports: pytest, Jest, Playwright

DevOps Engineer

  • Infrastructure as Code
  • CI/CD pipeline configuration
  • Deployment automation
  • Monitoring setup
  • Supports: Docker, Kubernetes, AWS, GitLab CI

Technical Writer

  • API documentation
  • User guides
  • Release notes
  • Operations manuals
  • Formats: Markdown, Confluence, OpenAPI

🧪 Testing

# Run all tests
python -m pytest tests/ -v

# Run specific test
python -m pytest tests/test_agent.py

# Run with coverage
python -m pytest tests/ --cov=src --cov-report=html

# Run individual test files
python tests/simple_test.py
python tests/test_file_writer.py

Test Coverage:

  • Agent functionality and workflows
  • File writer and format handling
  • Response parsing and extraction
  • Edge cases and error handling
  • Integration tests
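
Because the orchestrator API is async, tests drive coroutines to completion with asyncio. A minimal illustration of the pattern (the coroutine below is a stand-in, not an actual test from the suite):

import asyncio

async def run_step(name: str) -> dict:
    # Stand-in for an agent step; a real test would exercise orchestrator code.
    await asyncio.sleep(0)
    return {"status": "completed", "step": name}

def test_step_completes():
    # Plain pytest function: run the coroutine synchronously with asyncio.run.
    result = asyncio.run(run_step("analysis"))
    assert result["status"] == "completed"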

πŸ“ Project Structure

llm-multi-agent-system/
├── src/
│   ├── agents/              # Agent implementations
│   │   ├── base_agent.py    # Base agent class
│   │   ├── business_analyst.py
│   │   ├── developer.py
│   │   ├── qa_engineer.py
│   │   ├── devops_engineer.py
│   │   └── technical_writer.py
│   ├── orchestrator/        # Orchestration logic
│   │   ├── agent_orchestrator.py
│   │   ├── task_manager.py
│   │   └── workflow_engine.py
│   ├── config/              # Configuration
│   │   └── settings.py
│   └── utils/               # Utilities
│       └── file_writer.py
├── tests/                   # Test suite
├── examples/                # Example scripts
├── docs/                    # Documentation
├── scripts/                 # Utility scripts
├── output/                  # Generated outputs
├── logs/                    # Log files
├── config.yaml              # Main configuration
├── .env.example             # Environment template
├── requirements.txt         # Dependencies
├── setup.py                 # Setup script
└── main.py                  # Entry point

📚 Documentation

Comprehensive guides are available in the docs/ directory.

🔒 Privacy & Security

100% Local Execution

  • All processing happens on your machine
  • No data sent to external services
  • No API keys required (except dummy for local server)
  • No internet connection needed after model download

Security Features

  • Local-only llama-server binding (127.0.0.1)
  • No external network calls
  • Workspace isolation
  • Secure file handling

🎛️ Configuration

Environment Variables (.env)

# Local LLM Server (REQUIRED)
OPENAI_API_BASE=http://127.0.0.1:8080/v1
OPENAI_API_KEY=not-needed
OPENAI_API_MODEL=devstral

# Optional: Workspace override
WORKSPACE=/path/to/workspace

YAML Configuration (config.yaml)

# Workspace settings
workspace: "."
output_directory: "./output"
log_level: "INFO"
log_file: "logs/agent_system.log"

# Execution settings
llm_timeout: 300
max_concurrent_agents: 5
task_retry_attempts: 3
task_timeout: 600

# Agent-specific configurations
agents:
  developer:
    enabled: true
    languages: [python, javascript, typescript]
  qa_engineer:
    enabled: true
    test_frameworks: [pytest, jest, playwright]
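
Config validation, one of the production enhancements, means rejecting bad values with a clear message before any agent runs. A sketch of what that can look like with pydantic (illustrative; the project's actual settings.py may differ):

from pydantic import BaseModel, Field, ValidationError

class ExecutionSettings(BaseModel):
    llm_timeout: int = Field(default=300, gt=0)
    max_concurrent_agents: int = Field(default=5, ge=1, le=32)
    task_retry_attempts: int = Field(default=3, ge=0)

try:
    ExecutionSettings(llm_timeout=-1)
except ValidationError as exc:
    # The error names the offending field and the violated constraint.
    print(exc)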

🛠️ Troubleshooting

Common Issues

"OPENAI_API_BASE not configured"

echo "OPENAI_API_BASE=http://127.0.0.1:8080/v1" >> .env

"Connection refused"

# Ensure your local LLM server is running on port 8080
curl http://127.0.0.1:8080/v1/models  # should list the loaded model

Task timeout

# Increase in config.yaml
llm_timeout: 600
task_timeout: 900

For more solutions, see TROUBLESHOOTING.md.

🌟 Use Cases

  • Rapid Prototyping - Quickly generate full-stack applications
  • Code Generation - Automated implementation of features
  • Documentation - Auto-generate comprehensive docs
  • Testing - Create complete test suites
  • Infrastructure - Generate IaC and deployment configs
  • Analysis - Technical feasibility studies
  • Learning - Study AI-generated implementations

🚦 System Requirements

Minimum

  • Python 3.12 (required)
  • 16GB RAM
  • 8-core CPU
  • 50GB disk space

Recommended

  • Python 3.12 (required)
  • 32GB+ RAM
  • Apple Silicon (M1/M2/M3) or NVIDIA GPU
  • 100GB+ disk space

📊 Performance

  • Initial Response: < 5 seconds
  • Agent Execution: 10-60 seconds per agent
  • Full Workflow: 5-30 minutes (complexity dependent)
  • Model Loading: 30-60 seconds (first run)

🤝 Contributing

Contributions are welcome! Please read CONTRIBUTING.md for guidelines.

Development Setup

# Clone repo
git clone https://github.com/akarazhev/llm-multi-agent-system.git
cd llm-multi-agent-system

# Create virtual environment with Python 3.12
python3.12 -m venv venv
source venv/bin/activate  # Linux/macOS
# venv\Scripts\activate   # Windows

# Install dev dependencies (inside venv, use python/pip)
pip install -r requirements.txt
pip install -r requirements-dev.txt

# Run tests
pytest tests/ -v

# Format code
black src/ tests/

# Lint code
flake8 src/ tests/

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • llama.cpp - Local LLM inference
  • Devstral - Default coding model
  • OpenAI - API compatibility standard
  • FastAPI - Example framework used in generated code

📞 Support

For help, check TROUBLESHOOTING.md for common problems or open an issue on GitHub.

🗺️ Roadmap

  • Web UI for workflow management
  • Additional agent types (Security, Data Engineer)
  • Workflow visualization
  • Integration with popular tools (Jira, Confluence)
  • Multi-language support for prompts
  • Workflow templates marketplace
  • Real-time collaboration features

📈 Version History

See CHANGELOG.md for detailed version history.


Built with ❤️ for developers who value privacy and local control.

For questions, issues, or contributions, please visit our GitHub repository.
