Multi-model AI consensus through structured debates
Get better decisions by orchestrating debates between Claude Opus, GPT-5, Gemini Pro, Qwen, and Grok - all with full access to your codebase through MCP tools.
Problem: Single AI models have biases, knowledge gaps, and limitations.
Solution: Structured debates between multiple state-of-the-art AI models:
- Reduce Bias - Multiple perspectives counteract individual model weaknesses
- Higher Accuracy - Consensus approach reduces errors by 40%+
- Full Tool Access - Models can read files, run commands, analyze your actual code
- Confidence Scoring - Know how reliable the answer is (0-100%)
- Smart Caching - 90% cost reduction on repeated questions
# 1. Add marketplace
/plugin marketplace add KostasNoreika/claude-code-plugin-debate-consensus
# 2. Install plugin
/plugin install debate-consensus
# 3. Configure API key (one-time setup)
cd ~/.claude/plugins/debate-consensus/mcp-servers/debate-consensus
cp .env.example .env
nano .env # Add your OpenRouter API key
Get your free API key: OpenRouter
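After editing `.env`, a quick check that the key was actually saved can avoid a confusing first run. The sketch below parses simple `KEY=VALUE` lines and reports whether a key is set. The variable name `OPENROUTER_API_KEY` is an assumption; use whatever name appears in `.env.example`.

```python
from pathlib import Path


def read_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_api_key(path: Path, name: str = "OPENROUTER_API_KEY") -> bool:
    """True if the file exists and the named variable has a non-empty value.

    The default variable name is an assumption, not confirmed by the plugin docs.
    """
    if not path.is_file():
        return False
    return bool(read_env_file(path).get(name))


if __name__ == "__main__":
    env_path = (
        Path.home()
        / ".claude/plugins/debate-consensus/mcp-servers/debate-consensus/.env"
    )
    print("API key configured:", has_api_key(env_path))
```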
/debate "What's the best database for real-time analytics?"
| Command | Purpose | Speed |
|---|---|---|
| /debate <question> | Full multi-model debate | 30-120s |
| /consensus <topic> | Quick 3-model consensus | 10-30s |
| /verify <statement> | Adversarial verification | 20-60s |
| /debate-rapid <question> | Ultra-fast answer | 3-10s |
| /debate-history [limit] | View past debates | Instant |
NEW: Dynamic Roles & Emergent Flow - opt-in experimental feature!
Standard mode (default):
/debate "What's the best database for analytics?"
- Fixed expert roles (Architecture, Testing, Algorithms, Integration)
- 2-round structured debate
- Fast (30-60s), reliable (90.2% pass rate)
- Cost-effective ($0.05 average)
Expert mode (opt-in, experimental):
/debate "What's the best database for analytics?" --mode expert
- Dynamic expert selection - Gemini analyzes the question and selects optimal experts
- Emergent debate flow - 3-10 rounds of adaptive conversation
- Deeper insights - Models collaborate and build on each other's ideas
- Smart role assignment - 7 expertise types (security, performance, architecture, etc.)
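One way to picture an emergent flow of 3-10 rounds is a loop that keeps going until the models converge or a safety limit trips. The sketch below is illustrative only: `run_round` is a hypothetical callback standing in for one round of model calls, and the cost and time caps are assumptions borrowed from the upper ends of the ranges quoted in this README.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Limits:
    min_rounds: int = 3          # lower bound of the "3-10 rounds" range
    max_rounds: int = 10         # upper bound of the same range
    max_cost_usd: float = 1.00   # assumed cap (top of the quoted cost range)
    max_seconds: float = 120.0   # assumed cap (top of the quoted time range)


def run_emergent_debate(
    run_round: Callable[[int], tuple[float, bool]],
    limits: Optional[Limits] = None,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Run rounds until the debate converges or a limit is hit.

    run_round(round_number) is a hypothetical callback that returns
    (cost_in_usd, converged) for that round.
    """
    limits = limits or Limits()
    started = clock()
    spent = 0.0
    rounds = 0
    reason = "max_rounds"
    while rounds < limits.max_rounds:
        rounds += 1
        cost, converged = run_round(rounds)
        spent += cost
        if spent >= limits.max_cost_usd:
            reason = "cost_limit"
            break
        if clock() - started >= limits.max_seconds:
            reason = "time_limit"
            break
        # Convergence only counts once the minimum number of rounds has run.
        if converged and rounds >= limits.min_rounds:
            reason = "converged"
            break
    return {"rounds": rounds, "cost_usd": round(spent, 4), "stopped_by": reason}
```

Checking the limits after every round means a run can overshoot a cap by at most one round, which keeps the loop simple at the price of a slightly soft ceiling.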
Trade-offs:
- Slower: 30-120s (vs 30-60s)
- More expensive: $0.10-1.00 (vs $0.05) - 5-10x cost
- Higher quality: +10-20% better results (estimated)
- Experimental: Being validated with real usage
Try it when:
- Complex, multi-faceted questions
- Need deep domain expertise
- Quality more important than speed/cost
- Exploring critical architectural decisions
Use standard mode when:
- Quick decisions needed
- Budget-conscious
- Straightforward questions
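The guidance above can be condensed into a small decision helper. This is a sketch under stated assumptions rather than anything the plugin ships: `ModeProfile`, `choose_mode`, and `build_command` are hypothetical names, and the time and cost figures are copied from this README (the standard-mode cost is an average, not a guarantee).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeProfile:
    flag: str            # extra CLI flag appended to /debate ("" for standard)
    max_seconds: int     # worst-case duration quoted in the README
    max_cost_usd: float  # worst-case (or average) cost quoted in the README


STANDARD = ModeProfile(flag="", max_seconds=60, max_cost_usd=0.05)
EXPERT = ModeProfile(flag="--mode expert", max_seconds=120, max_cost_usd=1.00)


def choose_mode(complex_question: bool, budget_usd: float, time_budget_s: int) -> ModeProfile:
    """Pick expert mode only for complex questions whose worst case fits the budgets."""
    fits = budget_usd >= EXPERT.max_cost_usd and time_budget_s >= EXPERT.max_seconds
    return EXPERT if complex_question and fits else STANDARD


def build_command(question: str, mode: ModeProfile) -> str:
    """Format the slash command for the chosen mode."""
    command = f'/debate "{question}"'
    return f"{command} {mode.flag}" if mode.flag else command
```

For example, `build_command("Monolith or microservices?", choose_mode(True, 2.0, 300))` yields the expert-mode command, while a tight budget or a simple question falls back to the standard one.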
Documentation:
- v2.0 POC Guide - How v2.0 works
- Implementation Report - Technical details
- Installation Guide - Complete setup guide
- Usage Examples - Real-world scenarios
- Troubleshooting - Common issues & solutions
- API Reference - Complete API documentation
- Beta Testing - How to participate
- Conversion Plan - Detailed conversion roadmap
- Project Summary - Technical overview
- Test Results - Comprehensive test reports
- Phase Completion - Phase 1-4 completion reports
Current Version: v2.0.0 (v1.0 stable + v2.0 experimental)
Successfully converted from standalone mcp-debate-consensus MCP server.
- Phase 0: Planning & Conversion Plan (100%)
- Phase 1: Foundation - Plugin manifest & commands (100%)
- Phase 2: Advanced Features - Agents, hooks, caching (100%)
- Phase 3: Testing & Documentation (100%)
  - 30+ integration tests (100% pass rate)
  - 1,750+ lines of documentation
  - API reference, installation guide, troubleshooting
- Phase 4: Distribution (100% Complete)
  - Marketplace entry created
  - Security hardened (git history cleaned)
  - Repository polished and published
  - Ready for public release
See completion reports: Phase 1 | Phase 2 | Phase 3 | Phase 4
- Phase 2.1 POC: Foundation (100% Complete)
  - Intelligent Coordinator (Gemini-powered)
  - Dynamic expert selection (7 expertise types)
  - Emergent 3-round debate orchestration
  - Safety controls (cost/time/round limits)
  - POC test infrastructure
- Phase 2.2: Named Communication (Pending validation)
- Phase 2.3: Full Emergent Flow (Pending validation)
- Phase 2.4: Production Hardening (Pending validation)
v2.0 Status: Opt-in experimental feature, being validated with real usage
See v2.0 docs: POC Guide | Implementation Report | Planning Docs
If you find this plugin helpful, consider supporting its development!
Why donate?
- Keep the developer caffeinated
- Fund continued development and new features
- Faster bug fixes and support
- More documentation and examples
- 100% of donations go to improving this plugin
Even $5 helps! It shows you value multi-model AI consensus and want to see this project thrive.
Can't donate? Here's how else you can help:
- Star this repo on GitHub
- Share on Twitter/X
- Write a blog post about it
- Report bugs and suggest features
- Answer questions in Discussions
- Contribute code or documentation
Contributions welcome! This is an open-source project.
- Code: Submit PRs for new features or bug fixes
- Documentation: Improve guides and examples
- Testing: Report bugs and test new features
- Ideas: Suggest improvements in Discussions
See CONTRIBUTING.md for guidelines (coming soon).
MIT License
- Built on MCP by Anthropic
- Powered by OpenRouter
- Based on mcp-debate-consensus
Star this repo if you're excited about multi-model AI debates!