Upload any research paper (PDF or arXiv ID) → AI agents analyze it → Generate a complete, runnable Python codebase → Download as ZIP
Transform arXiv papers into production-ready projects using a multi-agent workflow: Analyst → Architect → Coder.
- What It Does
- Generated Project Example
- Features
- Quick Start
- Usage Flow
- Project Structure
- Performance
- Contributing
- License
- Acknowledgments
Most research papers are never implemented. Reading a paper, understanding its architecture, designing the project structure, and writing the code can take days — even for experienced engineers.
This pipeline does it automatically:
- Reads the paper using RAG (FAISS vector search over paper chunks)
- Analyses it, extracting components, equations, data flow, and inputs/outputs
- Plans a full Python project: files, classes, functions, algorithm steps
- Codes every file — using the most relevant paper sections as context per file
- Packages everything — folder structure, `requirements.txt`, `README.md`, downloadable ZIP
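The ingestion step above can be sketched as a simple overlapping chunker feeding the FAISS index. This is a minimal illustration; the chunk size and overlap used in the real `rag_store.py` are assumptions:

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 200) -> list[str]:
    """Split extracted paper text into overlapping chunks for the vector index."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break  # last chunk reached the end of the text
    return chunks

paper = "Attention is all you need. " * 100
print(len(chunk_text(paper)))
```

Overlap keeps an equation that straddles a chunk boundary retrievable from either side.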
For the "Attention Is All You Need" paper (arXiv:1706.03762):

```
output/projects/transformer-attention/
├── src/
│   ├── encoder.py        # MultiHeadAttention + FeedForward
│   ├── decoder.py        # DecoderLayer with encoder-decoder attention
│   ├── attention.py      # ScaledDotProductAttention
│   ├── model.py          # Transformer
│   └── main.py           # End-to-end demo
├── requirements.txt      # torch, numpy
└── README.md             # How to train/extend
```
| Feature | Description |
|---|---|
| Paper Ingestion | Upload a local PDF or enter an arXiv ID; automatic text extraction and chunking |
| FAISS RAG | Vector index over paper chunks; per-agent, per-file context retrieval |
| Multi-Agent Pipeline | Analyst → Architect → Coder; three specialized agents, each with targeted prompts |
| Streamlit UI | Step-by-step flow: upload → review plan → generate code → download ZIP |
| Plan Review | Inspect every file, class, function, and algorithm step before generating code |
| Project Export | Full folder structure with `src/`, `requirements.txt`, `README.md` per project |
| Syntax Validation | `ast.parse()` on every generated file; broken files flagged in the UI |
| Post-Processing | Auto-fix `main_file` prefix, auto-collect dependencies, phantom-class removal |
| Debug Logs | Raw agent outputs saved per run for prompt tuning and debugging |
```
git clone https://github.com/Ravevx/multi-agent-paper-pipeline.git
cd multi-agent-paper-pipeline

# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

Local LM Studio (default):

```
# Start LM Studio server at http://127.0.0.1:1234/v1
# Set your model in config.py
LMSTUDIO_URL=http://127.0.0.1:1234/v1
LMSTUDIO_MODEL=your-model-name
```

Run the app:

```
streamlit run app.py
```

1. Upload PDF or enter arXiv ID
2. [Analyse] → Build RAG → Analyst → Architect → Plan
3. Review plan (files, classes, functions, steps)
4. [Approve] → Generate code file-by-file
5. Download ZIP or use local copy
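The LM Studio settings from the Quick Start can be read from environment variables with local defaults in `config.py`. A sketch; the exact variable names in the real `config.py` are assumptions:

```python
import os

# Defaults match a local LM Studio server; override via environment variables.
LMSTUDIO_URL = os.environ.get("LMSTUDIO_URL", "http://127.0.0.1:1234/v1")
LMSTUDIO_MODEL = os.environ.get("LMSTUDIO_MODEL", "your-model-name")

print(LMSTUDIO_URL, LMSTUDIO_MODEL)
```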
| Agent | Input | Output | Key Prompt Rule |
|---|---|---|---|
| Analyst | Paper chunks (RAG) | 5-section technical analysis | Start with `## 1.`; no preamble, no hedging |
| Architect | Analysis text | ProjectPlan JSON | Class names must cross-reference; no phantom classes |
| Coder | File spec + RAG context | Python source file | EXACT NAMES box at top; enforces plan class names |
- RAG per file: each file gets its own targeted RAG query built from its class names and logic summary, so `attention.py` retrieves attention-equation chunks, not encoder chunks
- Plan validation: `_validate_plan()` removes phantom classes (classes listed in `main.py` but not defined anywhere), preventing `ImportError` at runtime
- nn.Module enforcement: `fix_nn_module()` detects torch files and adds `(torch.nn.Module)` inheritance where missing
- Architect truncation fallback: if the `main.py` algorithm steps are empty (truncated output), `crew_runner.py` rebuilds them from the actual class/function list
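The plan-validation idea reduces to a set difference over the plan: drop any class that `main.py` references but that no planned file defines. A simplified stand-in for the real `_validate_plan()` (names and data shapes are assumptions):

```python
def remove_phantom_classes(plan: dict[str, list[str]], main_classes: list[str]) -> list[str]:
    """Keep only the main.py classes that some planned file actually defines,
    preventing ImportError in the generated project."""
    defined = {cls for classes in plan.values() for cls in classes}
    return [cls for cls in main_classes if cls in defined]

plan = {
    "attention.py": ["ScaledDotProductAttention"],
    "model.py": ["Transformer"],
}
# "PositionalCache" is a phantom: listed in main.py, defined nowhere.
print(remove_phantom_classes(plan, ["Transformer", "PositionalCache"]))  # ['Transformer']
```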
```
.
├── app.py               # Streamlit app
├── crew_runner.py       # Main orchestration
├── crew_tasks.py        # Agent prompts
├── crew_agents.py       # Agent definitions
├── rag_store.py         # FAISS RAG
├── paper_tools.py       # PDF/arXiv extraction
├── project_planner.py   # Data models
├── config.py            # Settings
├── requirements.txt     # Dependencies
├── output/
│   ├── papers/          # Cached PDFs
│   └── projects/        # Generated codebases
│       └── <project_name>/
│           ├── src/
│           ├── requirements.txt
│           └── README.md
└── screenshots/         # UI screenshots
```
| Paper | Files | Time | Syntax Errors |
|---|---|---|---|
| Attention Is All You Need | 7 | 12 min | 0 |
| BERT | 12 | 18 min | 1 |
| GPT-2 | 9 | 14 min | 0 |
```
┌───────────────────┐
│     RAG Index     │
└────────┬──────────┘
         │
┌────────┴──────────┐
│  Agent (Analyst)  │
│ (Paper Analysis)  │
└────────┬──────────┘
         │
┌────────┴──────────┐
│ Agent (Architect) │
│  (Project Plan)   │◄────┐
└────────┬──────────┘     │
         │                │
┌────────┴──────────┐     │
│ Human-in-the-loop │     │
│  Approve / Edit   │     │
└────────┬──────────┘     │
         │                │
     Approved? ─── No ────┘
         │
        Yes
         │
  ┌──────┴──────┐
  │Agent (Coder)│
  │  Generate   │
  │    Files    │
  └──────┬──────┘
         │
┌────────┴────────────────────┐
│  Generate Project Folder:   │
│  src/, requirements.txt,    │
│  README.md, ZIP Export      │
└─────────────────────────────┘
```
Contributions are very welcome. Good areas to work on:
- Better prompts for specific domains (computer vision, reinforcement learning, graph networks)
- Alternative RAG backends (Chroma, Pinecone, Weaviate)
- Support for LaTeX source papers that are not on arXiv as PDFs
- Training loop generation: extend the Coder to also produce training scripts
- Unit tests for the `crew_runner.py` post-processing functions
MIT License. See the [LICENSE](LICENSE) file.
## Acknowledgments
- CrewAI — multi-agent orchestration
- FAISS — vector search
- Streamlit — amazing UI framework
- Research community — for the papers that power this!