# Chain of Agents (CoA)

A Chain of Agents (CoA) implementation in Python and Swift: a framework for long-context tasks using Large Language Models (LLMs).

This repository implements the Chain-of-Agents framework as described in:

- Chain of Agents: Large language models collaborating on long-context tasks (Google Research Blog)
- Chain of Agents (research paper)
## Overview

The Chain of Agents framework enables efficient processing of long-context tasks by:

- Breaking large inputs into manageable chunks
- Using worker agents to process individual chunks
- Employing a manager agent to synthesize the workers' results
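A minimal sketch of that chunk/worker/manager flow, with a stubbed `call_llm` standing in for a real model call (all names here are illustrative, not the repository's actual API):

```python
def call_llm(prompt: str) -> str:
    """Stub for an LLM call; swap in a real client (e.g. Together AI) here."""
    return f"<response to {len(prompt)} chars of prompt>"

def chunk_text(text: str, chunk_size: int) -> list[str]:
    """Split the input into fixed-size character chunks."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def chain_of_agents(text: str, query: str, chunk_size: int = 2000) -> str:
    # Each worker reads one chunk plus the running summary from the previous worker.
    summary = ""
    for chunk in chunk_text(text, chunk_size):
        summary = call_llm(
            f"Previous summary:\n{summary}\n\nChunk:\n{chunk}\n\nQuestion: {query}\n"
            "Update the summary with anything relevant to the question."
        )
    # The manager turns the final accumulated summary into an answer.
    return call_llm(f"Summary:\n{summary}\n\nAnswer the question: {query}")
```

The key property of the chain is that no single call ever sees the full document: each worker sees one chunk, and the manager sees only the final summary.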
## Features

- PDF document analysis
- Configurable chunk sizes for processing
- Real-time progress tracking
- Streaming responses from both worker and manager agents
- Clean, native macOS interface
- Dual processing modes:
  - Cloud-based processing using Together AI's LLaMA models
  - On-device processing using the MLX framework
- Offline inference with MLX after the initial model download
## Installation

```bash
git clone https://github.com/rudrankriyam/chain-of-agents.git
cd chain-of-agents
pip install -r requirements.txt
```
### Requirements

- Python 3.x
- Xcode 15+ (for the macOS app)
- A Together API key (sign up at Together.ai)
Create a `.env` file in the root directory and add your Together API key:

```bash
echo "TOGETHER_API_KEY=your_api_key_here" >> .env
```
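The `.env` file holds simple `KEY=VALUE` pairs. A minimal sketch of how the Python side might read the key (the function names are illustrative; the repository may use a library such as `python-dotenv` instead):

```python
import os

def parse_env_file(text: str) -> dict:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def load_together_key(env_text: str) -> str:
    """Prefer a real environment variable, fall back to the .env contents."""
    return os.environ.get("TOGETHER_API_KEY") or \
        parse_env_file(env_text).get("TOGETHER_API_KEY", "")
```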
## Usage

- Run the example script:

```bash
./run.sh
```
This will:
- Set up a Python virtual environment
- Install required dependencies
- Process a sample PDF document using the Chain of Agents framework
### macOS App

- Start the API server:

```bash
./run_api.sh
```
This will:
- Activate the Python virtual environment
- Start the API server that the macOS app communicates with
- The server will run on `localhost:8000`
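The app consumes the server's streaming output as Server-Sent Events (see Technical Details below). A hedged sketch of parsing such a stream on the client side; the endpoint path in the comment is hypothetical:

```python
def parse_sse(stream: str) -> list[str]:
    """Extract data payloads from a raw SSE stream (events separated by blank lines)."""
    events = []
    for block in stream.split("\n\n"):
        data_lines = [
            line[len("data:"):].strip()
            for line in block.splitlines()
            if line.startswith("data:")
        ]
        if data_lines:
            events.append("\n".join(data_lines))
    return events

# A real client would stream from the local server, e.g. with `requests`:
# for line in requests.get("http://localhost:8000/...", stream=True).iter_lines(): ...
```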
- Open the Xcode project:

```bash
open ChainOfAgents.xcodeproj
```

- Build and run the app (⌘R)
## Processing Modes

- **Cloud Processing**: uses Together AI's hosted models through the API server
- **On-Device Processing**: uses the MLX framework for local inference
  - Automatically downloads and caches the required model
  - No internet connection required after the initial model download
  - Lower latency, though performance characteristics may differ from the cloud models
The macOS app provides:
- PDF document selection
- Custom query input
- Real-time processing visualization
- Worker agent progress tracking
- Final synthesis display
- Toggle between cloud and on-device processing
## Models

### Cloud Processing

Uses Together AI's hosted models:

- Worker model: `meta-llama/Llama-3.3-70B-Instruct-Turbo-Free`
- Manager model: `meta-llama/Llama-3.3-70B-Instruct-Turbo-Free`

### On-Device Processing

Uses MLX-optimized models:

- Default model: `llama-3.1-8B` (quantized 8-bit version)
- Automatically handles model downloading and caching
- Optimized for Apple Silicon processors
## Technical Details

- Built with SwiftUI for the macOS interface
- Uses the MLX framework for efficient on-device inference
- Implements Server-Sent Events (SSE) for real-time progress updates
- Supports concurrent processing of document chunks
- Automatic memory management for large documents
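The concurrent chunk processing mentioned above can be sketched on the Python side with `asyncio`; the worker call here is a stand-in, not the repository's actual API:

```python
import asyncio

async def process_chunk(index: int, chunk: str) -> str:
    """Stand-in for a worker-agent call; a real version would query an LLM."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"summary[{index}]: {chunk[:20]}"

async def process_document(chunks: list[str]) -> list[str]:
    # Fan the chunks out to concurrent worker tasks; gather preserves chunk order.
    tasks = [process_chunk(i, c) for i, c in enumerate(chunks)]
    return await asyncio.gather(*tasks)
```

`asyncio.gather` keeps results in submission order, so chunk summaries can be concatenated for the manager without re-sorting.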
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License. See the LICENSE file for details.
## Citation

```bibtex
@article{zhang2024chain,
  title={Chain of Agents: Large Language Models Collaborating on Long-Context Tasks},
  author={Zhang, Yusen and Sun, Ruoxi and Chen, Yanfei and Pfister, Tomas and Zhang, Rui and Arık, Sercan Ö.},
  journal={arXiv preprint arXiv:2406.02818},
  year={2024}
}
```