A production-ready multi-agent orchestration system powered by local LLMs for automated software development workflows.
This project requires Python 3.12.x.

- Install: `brew install python@3.12` (macOS)
- Create venv: `python3.12 -m venv venv`
- See: START_HERE.md for complete setup

Python 3.14 and other versions are NOT supported due to LangChain/LangGraph compatibility constraints.
This system orchestrates specialized AI agents that collaborate to handle complete software development lifecycles, from requirements analysis to deployment and documentation. Built with privacy-first principles, it runs 100% locally using llama.cpp, ensuring no data leaves your machine.
- 5 Specialized AI Agents - Business Analyst, Developer, QA Engineer, DevOps Engineer, Technical Writer
- Interactive Chat Display - Watch agents communicate in real time with a color-coded chat interface
- LangGraph Orchestration - Advanced workflow engine with parallel execution and state persistence
- Parallel Agent Execution - QA and DevOps run simultaneously (30-40% faster)
- State Persistence - Resume interrupted workflows from checkpoints
- 100% Local Execution - No cloud APIs, complete data privacy, zero costs
- Flexible Workflow Engine - Build custom workflows or use predefined templates
- Production-Ready - Comprehensive error handling, logging, and monitoring
- Real-time Status Tracking - Monitor agent progress and task completion with visual progress bars
- Fully Tested - Comprehensive test suite included
- Streaming Responses - Real-time token streaming for immediate feedback (enabled by default)
- Retry Logic - Exponential backoff with jitter for transient failure recovery (see the sketch after this list)
- Circuit Breaker - Prevents cascading failures with automatic recovery detection
- Connection Pooling - Efficient connection reuse with health monitoring
- Structured Logging - JSON-formatted logs with correlation IDs for traceability
- Metrics Collection - Built-in performance monitoring and statistics
- Config Validation - Comprehensive validation with clear error messages
- Enhanced System Prompts - Professional, production-ready prompts for all agents
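The retry behavior corresponds to the llm_max_retries, llm_retry_initial_delay, and llm_retry_max_delay settings shown under Configuration below. A minimal sketch of the idea (exponential backoff capped at a maximum delay, with full jitter); the project's actual LLM client layers circuit-breaker state on top and may differ in detail:

```python
import random
import time

def call_with_retry(fn, max_retries=3, initial_delay=1.0, max_delay=60.0):
    """Call fn, retrying transient failures with capped exponential backoff."""
    for attempt in range(max_retries + 1):  # initial try + max_retries retries
        try:
            return fn()
        except ConnectionError:
            if attempt == max_retries:
                raise  # retries exhausted
            base = min(initial_delay * (2 ** attempt), max_delay)
            time.sleep(random.uniform(0, base))  # full jitter
```

Full jitter spreads retries uniformly over the backoff window, which keeps many concurrent agents from hitting the server in lockstep after a shared failure.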
See: Production-Ready Guide | Migration Guide
```
┌────────────────────────────────────────────────────────────────┐
│                    Multi-Agent Orchestrator                    │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐      │
│  │   Workflow   │    │    Agent     │    │     Task     │      │
│  │    Engine    │────│ Orchestrator │────│   Manager    │      │
│  └──────────────┘    └──────────────┘    └──────────────┘      │
└────────────────────────────────┬───────────────────────────────┘
                                 │
         ┌───────────────┬───────┴───────┬───────────────┐
         │               │               │               │
   ┌─────▼─────┐   ┌─────▼─────┐   ┌─────▼─────┐   ┌─────▼─────┐
   │ Business  │   │ Developer │   │    QA     │   │  DevOps   │
   │ Analyst   │   │   Agent   │   │ Engineer  │   │ Engineer  │
   └─────┬─────┘   └─────┬─────┘   └─────┬─────┘   └─────┬─────┘
         └───────────────┴───────┬───────┴───────────────┘
                                 │
                      ┌──────────▼──────────┐
                      │ Local llama-server  │
                      │     (llama.cpp)     │
                      └─────────────────────┘
```
- Python 3.12 (required; install with `brew install python@3.12` on macOS)
- llama.cpp installed (with llama-server)
- 16GB+ RAM recommended
- macOS, Linux, or Windows
```bash
# Clone the repository
git clone https://github.com/yourusername/llm-multi-agent-system.git
cd llm-multi-agent-system

# Run automated setup
python setup.py
```

This will:
- Check Python version
- Verify llama.cpp installation
- Create virtual environment
- Install dependencies
- Set up configuration files
```bash
# Create virtual environment with Python 3.12
python3.12 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies (inside the venv, use python/pip)
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with your settings

# Start your local LLM server (ensure llama-server is running on port 8080)

# Run the system
python main.py
```

Edit `config.yaml`:
workspace: "." # Your workspace path
log_level: "INFO"
# LLM Configuration (Production-Ready)
llm_timeout: 300
llm_max_retries: 3
llm_retry_initial_delay: 1.0
llm_retry_max_delay: 60.0
llm_circuit_breaker_threshold: 5
llm_circuit_breaker_timeout: 60.0
llm_stream_responses: true # Enable streaming
# Orchestration
max_concurrent_agents: 5
# Monitoring & Logging
enable_structured_logging: true
enable_metrics: true
# Agent configurations
agents:
developer:
languages: [python, javascript, typescript]
qa_engineer:
test_frameworks: [pytest, jest, playwright]Edit .env:
```bash
# LLM server configuration
OPENAI_API_BASE=http://127.0.0.1:8080/v1
OPENAI_API_KEY=not-needed
OPENAI_API_MODEL=devstral
OPENAI_TEMPERATURE=0.7
OPENAI_MAX_TOKENS=2048

# Retry & resilience
LLM_MAX_RETRIES=3
LLM_RETRY_INITIAL_DELAY=1.0
LLM_RETRY_MAX_DELAY=60.0
LLM_CIRCUIT_BREAKER_THRESHOLD=5
LLM_CIRCUIT_BREAKER_TIMEOUT=60.0

# Logging
LOG_LEVEL=INFO
STRUCTURED_LOGGING=true

# See .env.example for full configuration options
```

Run the system:

```bash
python main.py
```

You'll be prompted to:
- Enter your requirement
- Select a workflow type
- Monitor execution
- Review results in the output/ directory
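Before running workflows, you can verify that the local llama-server is reachable through its OpenAI-compatible endpoint. A minimal sanity check, assuming the `openai` Python package (v1+) is installed and the model name matches whatever your server loaded:

```python
from openai import OpenAI

# Point the client at the local llama-server; the API key is unused locally.
client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="devstral",
    messages=[{"role": "user", "content": "Reply with OK."}],
    max_tokens=10,
)
print(response.choices[0].message.content)
```

If this prints a response, the agents will be able to reach the server too.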
Watch agents communicate in real time with the new interactive chat interface:
```bash
# Run the interactive example
python examples/interactive_chat_workflow.py
```

Features:
- Color-coded agent messages and thoughts
- Visual handoffs between agents
- Real-time progress bars
- Task completion summaries
- File operation tracking
- Automatic chat log export
Example Output:

```text
Business Analyst:
  Analyzing requirements for task management API...
  Identifying user stories and acceptance criteria.

Business Analyst completed task
  Created 8 user stories with 24 acceptance criteria
  Files created: 2
    • requirements.md
    • user_stories.md

Business Analyst → Developer
  Requirements complete. Passing user stories for design.

Progress: ██████████████░░░░░░░░░░░░░░░░░░░░░ 40%
```
See Interactive Chat Guide for full details.
New! Use LangGraph for advanced features like parallel execution and state persistence:
```python
import asyncio
from src.orchestrator.langgraph_orchestrator import LangGraphOrchestrator

async def main():
    # Initialize the orchestrator with the interactive chat display
    orchestrator = LangGraphOrchestrator(
        workspace=".",
        enable_chat_display=True  # Watch agents communicate!
    )

    # Execute with parallel QA + DevOps (30-40% faster)
    result = await orchestrator.execute_feature_development(
        requirement="Create REST API for user authentication with JWT",
        context={
            "language": "python",
            "framework": "fastapi"
        }
    )

    print(f"Workflow completed: {result['status']}")
    print(f"Files created: {len(result['files_created'])}")

asyncio.run(main())
```

Run it:
```bash
# Inside the virtual environment (use python)
python examples/langgraph_feature_development.py

# Outside the virtual environment (use python3)
python3 examples/langgraph_feature_development.py
```

Key Benefits:
- 30-40% faster with parallel execution
- Interactive chat display (enabled by default)
- Resume interrupted workflows
- Smart conditional routing
- Workflow visualization
See LangGraph Integration Guide for details.
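State persistence is built on LangGraph checkpointing. The standalone sketch below illustrates the underlying mechanism rather than this project's wrapper: a graph compiled with a checkpointer saves state after every node, so invoking it again with the same `thread_id` resumes from the last checkpoint instead of starting over.

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver

class State(TypedDict):
    steps: list[str]

def analyze(state: State) -> State:
    return {"steps": state["steps"] + ["analysis"]}

def implement(state: State) -> State:
    return {"steps": state["steps"] + ["implementation"]}

builder = StateGraph(State)
builder.add_node("analyze", analyze)
builder.add_node("implement", implement)
builder.add_edge(START, "analyze")
builder.add_edge("analyze", "implement")
builder.add_edge("implement", END)

# The checkpointer records state after each node; reusing the thread_id
# resumes any interrupted run from its last checkpoint.
graph = builder.compile(checkpointer=MemorySaver())
config = {"configurable": {"thread_id": "workflow-42"}}
print(graph.invoke({"steps": []}, config))
```

You can also drive the orchestrator directly and inspect the final state it returns: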
```python
import asyncio
from src.orchestrator import LangGraphOrchestrator

async def main():
    orchestrator = LangGraphOrchestrator(workspace=".")
    final_state = await orchestrator.execute_feature_development(
        requirement="Create REST API for user authentication with JWT",
        context={
            "language": "python",
            "framework": "fastapi"
        }
    )

    # Extract the actual state from the event dict
    actual_state = list(final_state.values())[0] if final_state else {}
    print(f"Workflow completed: {len(actual_state.get('completed_steps', []))} steps")
    print(f"Status: {actual_state.get('status', 'N/A')}")

asyncio.run(main())
```

Run example workflows:
```bash
# Inside the virtual environment (recommended)
source venv/bin/activate
python examples/langgraph_feature_development.py  # Parallel execution demo
python examples/langgraph_bug_fix.py              # Bug fix workflow
python examples/langgraph_resume_workflow.py      # Resume interrupted workflow
python examples/visualize_workflow.py             # Generate workflow diagrams

# Outside the virtual environment (use python3)
python3 examples/langgraph_feature_development.py
python3 examples/simple_workflow.py
python3 examples/custom_workflow.py
python3 examples/ecommerce_catalog.py
python3 examples/agent_status_monitor.py
```

Complete feature implementation from requirements to deployment.
Steps:
- Business Analyst - Requirements analysis
- Developer - Architecture design
- Developer - Implementation
- QA Engineer - Testing
- DevOps Engineer - Deployment setup
- Technical Writer - Documentation
Focused bug resolution with testing and documentation.
Steps:
- QA Engineer - Bug analysis and reproduction
- Developer - Fix implementation
- QA Engineer - Regression testing
- Technical Writer - Release notes
Infrastructure design, implementation, and documentation.
Steps:
- DevOps Engineer - Infrastructure design
- DevOps Engineer - IaC implementation
- QA Engineer - Infrastructure testing
- Technical Writer - Operations documentation
Comprehensive documentation creation and review.
Steps:
- Business Analyst - Documentation requirements
- Technical Writer - Documentation creation
- Developer - Technical review
Feasibility studies and technical analysis.
Steps:
- Business Analyst - Requirements gathering
- Developer - Technical feasibility
- DevOps Engineer - Infrastructure assessment
- Business Analyst - Final analysis report
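Beyond these templates, the workflow engine supports custom sequences (see examples/custom_workflow.py). Conceptually, a workflow is an ordered list of agent/task pairs; the sketch below is purely illustrative, with hypothetical field names, since the real schema is defined in src/orchestrator/workflow_engine.py:

```python
# Hypothetical custom workflow definition; field names are illustrative,
# not the engine's actual schema.
custom_workflow = {
    "name": "api_hardening",
    "steps": [
        {"agent": "qa_engineer", "task": "Audit existing endpoints for gaps"},
        {"agent": "developer", "task": "Add input validation and rate limiting"},
        {"agent": "qa_engineer", "task": "Write regression tests for the changes"},
        {"agent": "technical_writer", "task": "Update the API reference"},
    ],
}
```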
- Requirements analysis and documentation
- User story creation
- Acceptance criteria definition
- Feasibility assessment
- Code implementation (Python, JavaScript, TypeScript)
- Architecture design
- Code review
- Technical documentation
- Supports: FastAPI, React, Django, Node.js
- Test suite creation
- Test execution
- Bug reporting
- Quality metrics
- Supports: pytest, Jest, Playwright
- Infrastructure as Code
- CI/CD pipeline configuration
- Deployment automation
- Monitoring setup
- Supports: Docker, Kubernetes, AWS, GitLab CI
- API documentation
- User guides
- Release notes
- Operations manuals
- Formats: Markdown, Confluence, OpenAPI
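All five agents extend the base class in src/agents/base_agent.py, so new agent types (such as the Security agent on the roadmap) slot in alongside them. A hypothetical sketch with invented attribute names, since the base class's real interface is defined in the code:

```python
# Hypothetical sixth agent; class layout and attribute names are
# illustrative, not the project's actual BaseAgent interface.
from src.agents.base_agent import BaseAgent

class SecurityEngineer(BaseAgent):
    """Reviews generated code for common vulnerability classes."""

    name = "security_engineer"
    system_prompt = (
        "You are a security engineer. Review code for injection, "
        "authentication, and secrets-handling issues, and report findings."
    )
```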
```bash
# Run all tests
python -m pytest tests/ -v

# Run a specific test
python -m pytest tests/test_agent.py

# Run with coverage
python -m pytest tests/ --cov=src --cov-report=html

# Run individual test files
python tests/simple_test.py
python tests/test_file_writer.py
```

Test Coverage:
- Agent functionality and workflows
- File writer and format handling
- Response parsing and extraction
- Edge cases and error handling
- Integration tests
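New tests follow standard pytest conventions. A minimal sketch, assuming a hypothetical write_text helper in src/utils/file_writer.py (the module's real functions may be named differently):

```python
# Illustrative only; `write_text` is a hypothetical helper.
from src.utils.file_writer import write_text

def test_write_text_creates_file(tmp_path):
    target = tmp_path / "notes.md"
    write_text(target, "# Notes\n")
    assert target.read_text() == "# Notes\n"
```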
```text
llm-multi-agent-system/
├── src/
│   ├── agents/                  # Agent implementations
│   │   ├── base_agent.py        # Base agent class
│   │   ├── business_analyst.py
│   │   ├── developer.py
│   │   ├── qa_engineer.py
│   │   ├── devops_engineer.py
│   │   └── technical_writer.py
│   ├── orchestrator/            # Orchestration logic
│   │   ├── agent_orchestrator.py
│   │   ├── task_manager.py
│   │   └── workflow_engine.py
│   ├── config/                  # Configuration
│   │   └── settings.py
│   └── utils/                   # Utilities
│       └── file_writer.py
├── tests/                       # Test suite
├── examples/                    # Example scripts
├── docs/                        # Documentation
├── scripts/                     # Utility scripts
├── output/                      # Generated outputs
├── logs/                        # Log files
├── config.yaml                  # Main configuration
├── .env.example                 # Environment template
├── requirements.txt             # Dependencies
├── setup.py                     # Setup script
└── main.py                      # Entry point
```
Comprehensive guides in the docs/ directory:
- START_HERE.md - Complete setup and getting-started guide
- LangGraph Integration - Advanced orchestration with parallel execution
- Quick Start Guide - Get started in 5 minutes
- Architecture - System design and components
- Agent Specifications - Detailed agent capabilities
- Local-Only Mode - Privacy-first local execution
- Deployment Guide - Production deployment
- API Reference - Programmatic usage
- Tech Stack - Technologies used
- Integrations - Third-party integrations
- Testing Guide - Testing documentation
- Troubleshooting - Common issues and solutions
- Contributing - Development guidelines
- Changelog - Version history
- All processing happens on your machine
- No data sent to external services
- No API keys required (only a dummy value for the local server)
- No internet connection needed after model download
- Local-only llama-server binding (127.0.0.1)
- No external network calls
- Workspace isolation
- Secure file handling
Environment variables (.env):

```bash
# Local LLM server (REQUIRED)
OPENAI_API_BASE=http://127.0.0.1:8080/v1
OPENAI_API_KEY=not-needed
OPENAI_API_MODEL=devstral

# Optional: workspace override
WORKSPACE=/path/to/workspace
```

config.yaml:

```yaml
# Workspace settings
workspace: "."
output_directory: "./output"
log_level: "INFO"
log_file: "logs/agent_system.log"
# Execution settings
llm_timeout: 300
max_concurrent_agents: 5
task_retry_attempts: 3
task_timeout: 600
# Agent-specific configurations
agents:
  developer:
    enabled: true
    languages: [python, javascript, typescript]
  qa_engineer:
    enabled: true
    test_frameworks: [pytest, jest, playwright]
```

"OPENAI_API_BASE not configured"

```bash
echo "OPENAI_API_BASE=http://127.0.0.1:8080/v1" >> .env
```

"Connection refused"

```bash
# Ensure your local LLM server is running on port 8080
```

Task timeout

```yaml
# Increase in config.yaml
llm_timeout: 600
task_timeout: 900
```

For more solutions, see TROUBLESHOOTING.md.
- Rapid Prototyping - Quickly generate full-stack applications
- Code Generation - Automated implementation of features
- Documentation - Auto-generate comprehensive docs
- Testing - Create complete test suites
- Infrastructure - Generate IaC and deployment configs
- Analysis - Technical feasibility studies
- Learning - Study AI-generated implementations
Minimum:

- Python 3.12 (required)
- 16GB RAM
- 8-core CPU
- 50GB disk space

Recommended:

- Python 3.12 (required)
- 32GB+ RAM
- Apple Silicon (M1/M2/M3) or NVIDIA GPU
- 100GB+ disk space
- Initial Response: < 5 seconds
- Agent Execution: 10-60 seconds per agent
- Full Workflow: 5-30 minutes (complexity dependent)
- Model Loading: 30-60 seconds (first run)
Contributions are welcome! Please read CONTRIBUTING.md for guidelines.
```bash
# Clone the repo
git clone https://github.com/yourusername/llm-multi-agent-system.git
cd llm-multi-agent-system

# Create virtual environment with Python 3.12
python3.12 -m venv venv
source venv/bin/activate  # Linux/macOS
# venv\Scripts\activate   # Windows

# Install dev dependencies (inside the venv, use python/pip)
pip install -r requirements.txt
pip install -r requirements-dev.txt

# Run tests
pytest tests/ -v

# Format code
black src/ tests/

# Lint code
flake8 src/ tests/
```

This project is licensed under the MIT License - see the LICENSE file for details.
- llama.cpp - Local LLM inference
- Devstral - Default coding model
- OpenAI - API compatibility standard
- FastAPI - Example framework used in generated code
- Issues: GitHub Issues
- Documentation: docs/
- Examples: examples/
- Web UI for workflow management
- Additional agent types (Security, Data Engineer)
- Workflow visualization
- Integration with popular tools (Jira, Confluence)
- Multi-language support for prompts
- Workflow templates marketplace
- Real-time collaboration features
See CHANGELOG.md for detailed version history.
Built with ❤️ for developers who value privacy and local control.
For questions, issues, or contributions, please visit our GitHub repository.