✨ Never Build Slides from Scratch Again ✨
| 📄 Universal File Support | 🎯 RAG-Powered Precision | 🎨 Custom Styling | ⚡ Lightning Speed |
Turns your research papers, reports, and documents into professional slides & posters in minutes.
- 📄 Universal Document Support: Seamlessly process PDF, Word, Excel, PowerPoint, Markdown, and other file formats, including multiple files at once.
- 🎯 Comprehensive Content Extraction: A RAG-powered mechanism ensures every critical insight, figure, and data point is captured with precision.
- 🔗 Source-Linked Accuracy: Maintains direct traceability between generated content and original sources, eliminating information drift.
- 🎨 Custom Styling Freedom: Choose from professional built-in themes or describe your vision in natural language for custom styling.
- ⚡ Lightning-Fast Generation: Instant preview mode enables rapid experimentation and real-time refinements.
- 💾 Seamless Session Management: An advanced checkpoint system preserves all progress, so you can pause, resume, or switch themes instantly without loss.
- ✨ Professional-Grade Visuals: Deliver polished, presentation-ready slides and posters with publication-quality design standards.
```bash
# One command to generate slides from a paper
python -m paper2slides --input paper.pdf --output slides --style doraemon --length medium --fast --parallel 2
```

- [2025.12.09] Added parallel slide generation (`--parallel`) for faster processing
- [2025.12.08] Paper2Slides is now open source!
[Image gallery: example outputs in the doraemon, academic, and custom styles]
✨ Multiple styles available: simply modify the `--style` parameter
Examples from DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
💡 Custom Style Example: Totoro Theme

```bash
--style "Studio Ghibli anime style with warm whimsical aesthetic. Use soft watercolor Morandi tones with light cream background, muted sage green and dusty pink accents. Totoro character can appear as a friendly guide relating to the content, with nature elements like soft clouds or leaves."
```
```bash
# Clone repository
git clone https://github.com/HKUDS/Paper2Slides.git
cd Paper2Slides

# Create and activate conda environment
conda create -n paper2slides python=3.12 -y
conda activate paper2slides

# Install dependencies
pip install -r requirements.txt
```

Note: Create a `.env` file in the `paper2slides/` directory with your API keys. Refer to `paper2slides/.env.example` for the required variables.
```bash
# Basic usage - generate slides from a paper
python -m paper2slides --input paper.pdf --output slides --length medium

# Generate poster with custom style
python -m paper2slides --input paper.pdf --output poster --style "minimalist with blue theme" --density medium

# Fast mode
python -m paper2slides --input paper.pdf --output slides --fast

# Enable parallel generation (2 workers by default)
python -m paper2slides --input paper.pdf --output slides --parallel 2

# List all processed outputs
python -m paper2slides --list
```

CLI Options:
| Option | Description | Default |
|---|---|---|
| `--input`, `-i` | Input file(s) or directory | Required |
| `--output` | Output type: `slides` or `poster` | `poster` |
| `--content` | Content type: `paper` or `general` | `paper` |
| `--style` | Style: `academic`, `doraemon`, or custom | `doraemon` |
| `--length` | Slides length: `short`, `medium`, `long` | `short` |
| `--density` | Poster density: `sparse`, `medium`, `dense` | `medium` |
| `--fast` | Fast mode: skip RAG indexing | false |
| `--parallel` | Enable parallel slide generation: `--parallel` uses 2 workers, `--parallel N` uses N workers | `1` (sequential without this option) |
| `--from-stage` | Force restart from stage: `rag`, `summary`, `plan`, `generate` | Auto-detect |
| `--debug` | Enable debug logging | false |
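
The options above map naturally onto an `argparse` parser. The following is a minimal sketch, not the project's actual `main.py`: the `build_parser` helper and the use of `nargs="?"` for `--parallel` are assumptions made for illustration, and only the flag names, choices, and defaults come from the table. The `--list` flag is left out because the table does not describe it.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the real CLI code)."""
    p = argparse.ArgumentParser(prog="paper2slides")
    p.add_argument("--input", "-i", nargs="+", required=True,
                   help="Input file(s) or directory")
    p.add_argument("--output", choices=["slides", "poster"], default="poster")
    p.add_argument("--content", choices=["paper", "general"], default="paper")
    # --style is free-form: a built-in name or a natural-language description.
    p.add_argument("--style", default="doraemon")
    p.add_argument("--length", choices=["short", "medium", "long"], default="short")
    p.add_argument("--density", choices=["sparse", "medium", "dense"], default="medium")
    p.add_argument("--fast", action="store_true")
    # Flag omitted -> 1 worker (sequential); bare --parallel -> 2; --parallel N -> N.
    p.add_argument("--parallel", nargs="?", type=int, const=2, default=1)
    p.add_argument("--from-stage", choices=["rag", "summary", "plan", "generate"],
                   default=None)
    p.add_argument("--debug", action="store_true")
    return p


args = build_parser().parse_args(["-i", "paper.pdf", "--output", "slides", "--parallel"])
print(args.output, args.parallel, args.from_stage)  # slides 2 None
```

Using `const=2` with `default=1` is one way to get the documented behavior where the bare flag and the flag with a number mean different things.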
💾 Checkpoint & Resume:
Paper2Slides intelligently saves your progress at every key stage, allowing you to:
| Scenario | Command |
|---|---|
| Resume after interruption | Just run the same command again; it auto-detects and continues |
| Change style only | Add `--from-stage plan` to skip re-parsing |
| Regenerate images | Add `--from-stage generate` to keep the same plan |
| Full restart | Add `--from-stage rag` to start from scratch |

Tip: Checkpoints are auto-saved. Just run the same command to resume. Use `--from-stage` only to force a restart from a specific stage.
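
The resume behavior described above can be pictured as "find the first stage whose checkpoint is missing, unless a stage is forced". The sketch below illustrates that idea only. The function name `first_stage_to_run` and its two directory parameters are hypothetical, and the project's own detection logic may differ. The stage names and checkpoint file names are the ones used in this README.

```python
import tempfile
from pathlib import Path
from typing import Optional

STAGES = ["rag", "summary", "plan", "generate"]

# Checkpoint written when each stage completes. The generate stage produces
# the final outputs and has no checkpoint file of its own.
CHECKPOINTS = {
    "rag": "checkpoint_rag.json",
    "summary": "checkpoint_summary.json",
    "plan": "checkpoint_plan.json",
}


def first_stage_to_run(base_dir: Path, config_dir: Path,
                       from_stage: Optional[str] = None) -> str:
    """Pick the stage to start from (hypothetical helper, for illustration)."""
    if from_stage is not None:
        if from_stage not in STAGES:
            raise ValueError(f"unknown stage: {from_stage}")
        return from_stage  # forced restart, ignore existing checkpoints
    for stage in STAGES[:-1]:
        # Assumption: the plan checkpoint lives in the per-config folder,
        # the other two in the shared base folder.
        folder = config_dir if stage == "plan" else base_dir
        if not (folder / CHECKPOINTS[stage]).exists():
            return stage
    return "generate"


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    config = base / "slides_doraemon_medium"
    config.mkdir()
    (base / "checkpoint_rag.json").write_text("{}")
    print(first_stage_to_run(base, config))         # summary
    print(first_stage_to_run(base, config, "rag"))  # rag
```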
Launch both backend and frontend services:

```bash
./scripts/start.sh
```

Or start services independently:

```bash
# Terminal 1: Start backend API
./scripts/start_backend.sh

# Terminal 2: Start frontend
./scripts/start_frontend.sh
```

Access the web interface at http://localhost:5173 (default).
Paper2Slides transforms documents through a 4-stage pipeline designed for reliability and efficiency:
| Stage | Description | Checkpoint | Output |
|---|---|---|---|
| 🔍 RAG | Parse documents and construct intelligent retrieval index using RAG | `checkpoint_rag.json` | Searchable knowledge base |
| 📊 Analysis | Extract document structure, identify key figures, tables, and content hierarchy | `checkpoint_summary.json` | Structured content map |
| 📋 Planning | Generate optimized content layout and slide/poster organization strategy | `checkpoint_plan.json` | Presentation blueprint |
| 🎨 Creation | Render final high-quality slides and poster visuals | Output directory | Polished presentation materials |

Each stage automatically saves progress checkpoints, enabling seamless resumption from any point if the process is interrupted, with no need to start over.
| Mode | Processing Pipeline | Use Cases |
|---|---|---|
| Normal | Complete RAG indexing with deep document analysis | Complex research papers, lengthy documents, multi-section content |
| Fast | Skip RAG indexing, direct LLM query | Short documents, instant previews, quick revisions |
Use --fast when:
- Document (text + figures) is short enough to fit in LLM context
- Quick preview/iteration needed
- Don't want to wait for RAG indexing
Use normal mode (default) when:
- Document is long or has many figures
- Multiple files to process together
- Need retrieval for better context selection
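
The guidance above boils down to a size check: fast mode only makes sense when the whole document fits in the model's context. The sketch below shows one rough way to make that call. Everything in it is an assumption for illustration: the `choose_mode` helper, the 100,000-token context budget, the 4-characters-per-token ratio, and the per-figure token cost are placeholder values, not numbers from Paper2Slides.

```python
def choose_mode(text_chars: int, num_figures: int, num_files: int,
                context_tokens: int = 100_000,
                chars_per_token: float = 4.0,
                tokens_per_figure: int = 1_000) -> str:
    """Return "fast" or "normal" from a rough context-size estimate.

    All default values are illustrative placeholders.
    """
    if num_files > 1:
        # Several files processed together benefit from retrieval.
        return "normal"
    estimated_tokens = text_chars / chars_per_token + num_figures * tokens_per_figure
    return "fast" if estimated_tokens <= context_tokens else "normal"


print(choose_mode(text_chars=40_000, num_figures=6, num_files=1))      # fast
print(choose_mode(text_chars=2_000_000, num_figures=40, num_files=1))  # normal
print(choose_mode(text_chars=40_000, num_figures=6, num_files=3))      # normal
```

In practice the decision is made by passing or omitting `--fast` on the command line.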
```
outputs/
├── <project_name>/
│   ├── <content_type>/                   # paper or general
│   │   ├── <mode>/                       # fast or normal
│   │   │   ├── checkpoint_rag.json       # RAG query results & parsed file paths
│   │   │   ├── checkpoint_summary.json   # Extracted content, figures, tables
│   │   │   ├── summary.md                # Human-readable summary
│   │   │   └── <config_name>/            # e.g., slides_doraemon_medium
│   │   │       ├── state.json            # Current pipeline state
│   │   │       ├── checkpoint_plan.json  # Content plan for slides/poster
│   │   │       └── <timestamp>/          # Generated outputs
│   │   │           ├── slide_01.png
│   │   │           ├── slide_02.png
│   │   │           ├── ...
│   │   │           └── slides.pdf        # Final PDF output
│   │   └── rag_output/                   # RAG index storage
│   └── ...
└── ...
```
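
Given the layout above, a script can locate the most recent run for a configuration by listing the timestamp folders. This is a small sketch under two assumptions that the README does not confirm: timestamp folder names sort chronologically as plain strings, and `my_paper` is a made-up project name. The `latest_run` helper is hypothetical, not part of Paper2Slides.

```python
from pathlib import Path
from typing import Optional


def latest_run(config_dir: Path) -> Optional[Path]:
    """Return the newest run folder under <config_name>/, or None if there is none.

    Assumption: timestamp folder names sort chronologically as strings.
    """
    if not config_dir.is_dir():
        return None
    runs = sorted(p for p in config_dir.iterdir() if p.is_dir())
    return runs[-1] if runs else None


# "my_paper" is a hypothetical project name; the other parts follow the tree above.
config_dir = Path("outputs") / "my_paper" / "paper" / "normal" / "slides_doraemon_medium"
run = latest_run(config_dir)
if run is None:
    print("no generated outputs yet")
else:
    print("latest PDF:", run / "slides.pdf")
```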
Checkpoint Files:
| File | Description | Reusable When |
|---|---|---|
| `checkpoint_rag.json` | Parsed document content | Same input files |
| `checkpoint_summary.json` | Figures, tables, structure | Same input files |
| `checkpoint_plan.json` | Content layout plan | Same style & length/density |
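
"Same input files" in the table above can be made concrete with a content fingerprint: if the fingerprint of the current inputs matches the one recorded with a checkpoint, the checkpoint is still valid. The sketch below shows that general technique with SHA-256. It is an illustration only: the `input_fingerprint` helper is hypothetical, and Paper2Slides may decide reusability in a different way.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Iterable


def input_fingerprint(paths: Iterable[Path]) -> str:
    """Hash file names and contents so the same input set gives the same digest."""
    digest = hashlib.sha256()
    # Sort by file name so the order of the arguments does not matter.
    for path in sorted((Path(p) for p in paths), key=lambda p: p.name):
        digest.update(path.name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    paper = Path(tmp) / "paper.pdf"
    notes = Path(tmp) / "appendix.md"
    paper.write_bytes(b"paper")
    notes.write_bytes(b"appendix")
    before = input_fingerprint([paper, notes])
    reordered = input_fingerprint([notes, paper])
    notes.write_bytes(b"appendix v2")
    after = input_fingerprint([paper, notes])

print(before == reordered, before == after)  # True False
```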
| Style | Description |
|---|---|
| `academic` | Clean, professional academic presentation style |
| `doraemon` | Colorful, friendly style with illustrations |
| `custom` | Any text description for LLM-generated style |
Tip: Paper2Slides uses `gemini-3-pro-image-preview` (Nano Banana Pro Preview) for image generation. Key findings:
- Mood Keywords: Words like "warm", "elegant", "vibrant" strongly influence the overall color palette
- Layout vs Style: Fine-grained layout instructions ground well; fine-grained element styling does not
- Prompt Length: Simple prompts generally outperform detailed ones
- Multi-slide Generation: Native multi-image output is story-like; for consistent slides, we use iterative single-image generation
| Module | Description |
|---|---|
| `paper2slides/core/` | Pipeline orchestration, 4-stage execution |
| `paper2slides/raganything/` | Document parsing & RAG indexing |
| `paper2slides/summary/` | Content extraction: figures, tables, paper structure |
| `paper2slides/generator/` | Content planning & image generation |
| `api/` | FastAPI backend for web interface |
| `frontend/` | React frontend (Vite + TailwindCSS) |
Full project structure:

```
Paper2Slides/
├── paper2slides/                     # Core library
│   ├── main.py                       # CLI entry point
│   ├── core/
│   │   ├── pipeline.py               # Main pipeline orchestration
│   │   ├── state.py                  # Checkpoint state management
│   │   └── stages/
│   │       ├── rag_stage.py          # Stage 1: Parse & index
│   │       ├── summary_stage.py      # Stage 2: Extract content
│   │       ├── plan_stage.py         # Stage 3: Plan layout
│   │       └── generate_stage.py     # Stage 4: Generate images
│   │
│   ├── raganything/
│   │   ├── raganything.py            # RAG processor
│   │   └── parser.py                 # Document parser
│   │
│   ├── summary/
│   │   ├── paper.py                  # Paper structure extraction
│   │   └── extractors/               # Figure/table extractors
│   │
│   ├── generator/
│   │   ├── content_planner.py        # Slide/poster planning
│   │   └── image_generator.py        # Image generation
│   │
│   ├── prompts/                      # LLM prompt templates
│   └── utils/                        # Utilities
│
├── api/server.py                     # FastAPI backend
├── frontend/src/                     # React frontend
└── scripts/                          # Shell scripts (start/stop)
```
- LightRAG: Graph-Empowered RAG
- RAG-Anything: Multi-Modal RAG
- VideoRAG: RAG with Extremely-Long Videos
🌟 Found Paper2Slides helpful? Star us on GitHub!
🚀 Turn any document into professional presentations in minutes!








