HKUDS/Paper2Slides

Paper2Slides: From Paper to Presentation in One Click

✨ Never Build Slides from Scratch Again ✨

| 📄 Universal File Support | 🎯 RAG-Powered Precision | 🎨 Custom Styling | ⚡ Lightning Speed |


🎯 What is Paper2Slides?

Paper2Slides turns your research papers, reports, and documents into professional slides and posters in minutes.

✨ Key Features

  • 📄 Universal Document Support
    Process PDF, Word, Excel, PowerPoint, Markdown, and other formats, including several files at once.

  • 🎯 Comprehensive Content Extraction
    A RAG-powered mechanism captures critical insights, figures, and data points with precision.

  • 🔗 Source-Linked Accuracy
    Maintains direct traceability between generated content and original sources, eliminating information drift.

  • 🎨 Custom Styling Freedom
    Choose from professional built-in themes or describe your vision in natural language for custom styling.

  • ⚡ Lightning-Fast Generation
    Instant preview mode enables rapid experimentation and real-time refinements.

  • 💾 Seamless Session Management
    The checkpoint system preserves all progress: pause, resume, or switch themes without losing work.

  • ✨ Professional-Grade Visuals
    Deliver polished, presentation-ready slides and posters with publication-quality design standards.

⚡ Easy as One Command

# One command to generate slides from a paper
python -m paper2slides --input paper.pdf --output slides --style doraemon --length medium --fast --parallel 2

🔥 News

  • [2025.12.09] Added parallel slide generation (--parallel) for faster processing
  • [2025.12.08] Paper2Slides is now open source!

🎨 Custom Styling Showcase

[Gallery: example outputs in the doraemon, academic, and custom styles]

✨ Multiple styles are available; simply change the --style parameter.
Examples from DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

💡 Custom Style Example: Totoro Theme
--style "Studio Ghibli anime style with warm whimsical aesthetic. Use soft watercolor Morandi tones with light cream background, muted sage green and dusty pink accents. Totoro character can appear as a friendly guide relating to the content, with nature elements like soft clouds or leaves."


πŸƒ Quick Start

1. Environment Setup

# Clone repository
git clone https://github.com/HKUDS/Paper2Slides.git
cd Paper2Slides

# Create and activate conda environment
conda create -n paper2slides python=3.12 -y
conda activate paper2slides

# Install dependencies
pip install -r requirements.txt

Note

Create a .env file in the paper2slides/ directory with your API keys. See paper2slides/.env.example for the required variables.

2. Command Line Usage

# Basic usage - generate slides from a paper
python -m paper2slides --input paper.pdf --output slides --length medium

# Generate poster with custom style
python -m paper2slides --input paper.pdf --output poster --style "minimalist with blue theme" --density medium

# Fast mode
python -m paper2slides --input paper.pdf --output slides --fast

# Enable parallel generation (2 workers by default)
python -m paper2slides --input paper.pdf --output slides --parallel 2

# List all processed outputs
python -m paper2slides --list
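
When the commands above are driven from a script, building the argument list in Python avoids shell-quoting problems with free-text styles. The following is a minimal sketch: `build_command` is a hypothetical helper, not part of Paper2Slides, and it only uses flags listed in this README.

```python
import sys
from typing import List, Optional


def build_command(
    input_path: str,
    output: str = "slides",
    style: Optional[str] = None,
    length: Optional[str] = None,
    density: Optional[str] = None,
    fast: bool = False,
    parallel: Optional[int] = None,
) -> List[str]:
    """Build an argv list for `python -m paper2slides` (hypothetical helper)."""
    if output not in ("slides", "poster"):
        raise ValueError("output must be 'slides' or 'poster'")
    cmd = [sys.executable, "-m", "paper2slides",
           "--input", input_path, "--output", output]
    if style is not None:
        cmd += ["--style", style]      # built-in name or free-text description
    if length is not None:
        cmd += ["--length", length]    # slides: short, medium, long
    if density is not None:
        cmd += ["--density", density]  # posters: sparse, medium, dense
    if fast:
        cmd.append("--fast")
    if parallel is not None:
        cmd += ["--parallel", str(parallel)]
    return cmd
```

Passing the result to `subprocess.run(cmd, check=True)` from the repository root, inside the activated environment, should behave like the equivalent shell command.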

CLI Options:

| Option | Description | Default |
| --- | --- | --- |
| --input, -i | Input file(s) or directory | Required |
| --output | Output type: slides or poster | poster |
| --content | Content type: paper or general | paper |
| --style | Style: academic, doraemon, or custom | doraemon |
| --length | Slides length: short, medium, long | short |
| --density | Poster density: sparse, medium, dense | medium |
| --fast | Fast mode: skip RAG indexing | false |
| --parallel | Enable parallel slide generation: --parallel uses 2 workers, --parallel N uses N workers | 1 (sequential without this option) |
| --from-stage | Force restart from stage: rag, summary, plan, generate | Auto-detect |
| --debug | Enable debug logging | false |
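
The `--parallel` row describes a flag with an optional value: absent means 1 worker, bare means 2, and `--parallel N` means N. As an illustration only (this is not the project's actual parser), the documented options and defaults could be expressed with `argparse` like this, where `nargs="?"` with `const` and `default` gives the three-way behavior:

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Mirror the options table above; a sketch, not the project's parser."""
    p = argparse.ArgumentParser(prog="paper2slides")
    p.add_argument("--input", "-i", nargs="+", help="input file(s) or directory")
    p.add_argument("--output", choices=["slides", "poster"], default="poster")
    p.add_argument("--content", choices=["paper", "general"], default="paper")
    p.add_argument("--style", default="doraemon",
                   help="academic, doraemon, or a free-text description")
    p.add_argument("--length", choices=["short", "medium", "long"], default="short")
    p.add_argument("--density", choices=["sparse", "medium", "dense"], default="medium")
    p.add_argument("--fast", action="store_true", help="skip RAG indexing")
    # Absent -> 1 (sequential); bare flag -> 2; "--parallel N" -> N.
    p.add_argument("--parallel", nargs="?", type=int, const=2, default=1)
    p.add_argument("--from-stage", choices=["rag", "summary", "plan", "generate"],
                   default=None, help="force restart from this stage")
    p.add_argument("--debug", action="store_true")
    return p
```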

💾 Checkpoint & Resume:

Paper2Slides saves your progress at every key stage, so you can handle the following scenarios:

| Scenario | Command |
| --- | --- |
| Resume after interruption | Just run the same command again; it auto-detects and continues |
| Change style only | Add --from-stage plan to skip re-parsing |
| Regenerate images | Add --from-stage generate to keep the same plan |
| Full restart | Add --from-stage rag to start from scratch |

Tip

Checkpoints are auto-saved. Just run the same command to resume. Use --from-stage only to force a restart from a specific stage.
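
The resume behavior can be pictured as "start at the first stage whose checkpoint is missing, unless --from-stage overrides it". The sketch below assumes that detection is based on the checkpoint files named in this README; the real implementation may instead rely on state.json, and `resolve_start_stage` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional

STAGES = ["rag", "summary", "plan", "generate"]


def resolve_start_stage(mode_dir: Path, config_dir: Path,
                        from_stage: Optional[str] = None) -> str:
    """Pick the stage to start from (illustrative logic only)."""
    if from_stage is not None:
        if from_stage not in STAGES:
            raise ValueError(f"unknown stage: {from_stage}")
        return from_stage  # explicit override always wins
    # Checkpoint locations follow the output directory layout in this README.
    checkpoints = {
        "rag": mode_dir / "checkpoint_rag.json",
        "summary": mode_dir / "checkpoint_summary.json",
        "plan": config_dir / "checkpoint_plan.json",
    }
    for stage in STAGES[:-1]:
        if not checkpoints[stage].exists():
            return stage  # first stage without a saved checkpoint
    return "generate"  # everything upstream is done; only rendering remains
```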

3. Web Interface

Launch both backend and frontend services:

./scripts/start.sh

Or start services independently:

# Terminal 1: Start backend API
./scripts/start_backend.sh

# Terminal 2: Start frontend
./scripts/start_frontend.sh

Access the web interface at http://localhost:5173 (default)


πŸ—οΈ Paper2Slides Framework

Paper2Slides transforms documents through a 4-stage pipeline designed for reliability and efficiency:

Stage Description Checkpoint Output
πŸ” RAG Parse documents and construct intelligent retrieval index using RAG checkpoint_rag.json Searchable knowledge base
πŸ“Š Analysis Extract document structure, identify key figures, tables, and content hierarchy checkpoint_summary.json Structured content map
πŸ“‹ Planning Generate optimized content layout and slide/poster organization strategy checkpoint_plan.json Presentation blueprint
🎨 Creation Render final high-quality slides and poster visuals Output directory Polished presentation materials

πŸ’Ύ Smart Recovery System

Each stage automatically saves progress checkpoints, enabling seamless resumption from any point if the process is interruptedβ€”no need to start over.

Fast Mode vs Normal Mode

| Mode | Processing Pipeline | Use Cases |
| --- | --- | --- |
| Normal | Complete RAG indexing with deep document analysis | Complex research papers, lengthy documents, multi-section content |
| Fast | Skip RAG indexing, direct LLM query | Short documents, instant previews, quick revisions |

Use --fast when:

  • The document (text + figures) is short enough to fit in the LLM context
  • You need a quick preview or fast iteration
  • You don't want to wait for RAG indexing

Use normal mode (default) when:

  • The document is long or has many figures
  • Multiple files must be processed together
  • You need retrieval for better context selection
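
The guidance above can be turned into a rough pre-flight check. This is a sketch under stated assumptions: the 4-characters-per-token ratio is a common rule of thumb rather than a property of any particular model, the context budget is a number you supply, and both function names are hypothetical.

```python
def approx_token_count(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; the ratio is an assumption, not a model fact."""
    return int(len(text) / chars_per_token)


def choose_mode(num_files: int, approx_tokens: int, context_budget_tokens: int) -> str:
    """Return "fast" or "normal" following the README's guidance."""
    if num_files > 1:
        return "normal"  # several files processed together benefit from retrieval
    if approx_tokens > context_budget_tokens:
        return "normal"  # too long to hand to the LLM directly
    return "fast"        # short single document: skip RAG indexing
```

Figures also consume context, so `approx_tokens` should include whatever allowance you consider reasonable for them.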

⚙️ Configuration

Output Directory Structure

outputs/
├── <project_name>/
│   ├── <content_type>/                   # paper or general
│   │   ├── <mode>/                       # fast or normal
│   │   │   ├── checkpoint_rag.json       # RAG query results & parsed file paths
│   │   │   ├── checkpoint_summary.json   # Extracted content, figures, tables
│   │   │   ├── summary.md                # Human-readable summary
│   │   │   └── <config_name>/            # e.g., slides_doraemon_medium
│   │   │       ├── state.json            # Current pipeline state
│   │   │       ├── checkpoint_plan.json  # Content plan for slides/poster
│   │   │       └── <timestamp>/          # Generated outputs
│   │   │           ├── slide_01.png
│   │   │           ├── slide_02.png
│   │   │           ├── ...
│   │   │           └── slides.pdf        # Final PDF output
│   │   └── rag_output/                   # RAG index storage
│   └── ...
└── ...
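
To locate results programmatically, the layout above can be mirrored with `pathlib`. This is a sketch: the `<config_name>` pattern is inferred from the single example `slides_doraemon_medium`, how custom free-text styles are named is not documented here, and both helper names are hypothetical.

```python
from pathlib import Path
from typing import List


def config_dir(root: str, project: str, content_type: str, fast: bool,
               output: str, style: str, size: str) -> Path:
    """Build outputs/<project>/<content_type>/<mode>/<config_name>."""
    mode = "fast" if fast else "normal"
    # Assumed pattern, based on the example "slides_doraemon_medium".
    config_name = f"{output}_{style}_{size}"
    return Path(root) / project / content_type / mode / config_name


def slide_images(run_dir: Path) -> List[Path]:
    """Return slide_XX.png files in a <timestamp>/ directory, in slide order."""
    return sorted(run_dir.glob("slide_*.png"))
```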

Checkpoint Files:

| File | Description | Reusable When |
| --- | --- | --- |
| checkpoint_rag.json | Parsed document content | Same input files |
| checkpoint_summary.json | Figures, tables, structure | Same input files |
| checkpoint_plan.json | Content layout plan | Same style & length/density |
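
One way to read this table is as an invalidation rule: walk the checkpoints in pipeline order and stop at the first one affected by a change. The sketch below encodes that reading. Treating a change to the inputs as also invalidating the plan is an inference from the stage order, not something the table states, and `reusable_checkpoints` is a hypothetical name.

```python
from typing import List, Set

# (checkpoint file, settings it depends on), in pipeline order.
# "size" stands for --length (slides) or --density (posters).
CHECKPOINT_DEPENDENCIES = [
    ("checkpoint_rag.json", {"inputs"}),
    ("checkpoint_summary.json", {"inputs"}),
    ("checkpoint_plan.json", {"style", "size"}),
]


def reusable_checkpoints(changed: Set[str]) -> List[str]:
    """List the checkpoints that stay valid after the given settings changed."""
    reusable = []
    for filename, depends_on in CHECKPOINT_DEPENDENCIES:
        if changed & depends_on:
            break  # this checkpoint and everything downstream is stale
        reusable.append(filename)
    return reusable
```

For example, changing only the style keeps the first two checkpoints, which matches the "Add --from-stage plan" advice for style changes.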

Style Configuration

| Style | Description |
| --- | --- |
| academic | Clean, professional academic presentation style |
| doraemon | Colorful, friendly style with illustrations |
| custom | Any text description for LLM-generated style |

Image Generation Notes

Tip

Paper2Slides uses gemini-3-pro-image-preview (Nano Banana Pro Preview) for image generation. Key findings:

  • Mood Keywords: Words like "warm", "elegant", "vibrant" strongly influence the overall color palette
  • Layout vs Style: Fine-grained layout instructions ground well; fine-grained element styling does not
  • Prompt Length: Simple prompts generally outperform detailed ones
  • Multi-slide Generation: Native multi-image output is story-like; for consistent slides, we use iterative single-image generation
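
Iterative single-image generation, combined with the worker count from --parallel, can be sketched as one model call per slide with the same style text prepended to every prompt. Everything here is illustrative: `generate_image` is a caller-supplied stand-in for whatever client talks to the image model, and the prompt format is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def generate_slides(slide_prompts: List[str], style: str,
                    generate_image: Callable[[str], bytes],
                    workers: int = 1) -> List[bytes]:
    """Render one image per slide, sharing a single style description."""
    # Repeating the same style text in every request is what keeps slides consistent.
    prompts = [f"{style}\n\n{prompt}" for prompt in slide_prompts]
    if workers <= 1:
        return [generate_image(p) for p in prompts]  # sequential
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, so slide order is preserved.
        return list(pool.map(generate_image, prompts))
```

Threads suit this workload because each call mostly waits on a network response.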

πŸ“ Code Structure

Module Description
paper2slides/core/ Pipeline orchestration, 4-stage execution
paper2slides/raganything/ Document parsing & RAG indexing
paper2slides/summary/ Content extraction: figures, tables, paper structure
paper2slides/generator/ Content planning & image generation
api/ FastAPI backend for web interface
frontend/ React frontend (Vite + TailwindCSS)
Click to expand full project structure
Paper2Slides/
β”œβ”€β”€ paper2slides/                 # Core library
β”‚   β”œβ”€β”€ main.py                   # CLI entry point
β”‚   β”œβ”€β”€ core/
β”‚   β”‚   β”œβ”€β”€ pipeline.py           # Main pipeline orchestration
β”‚   β”‚   β”œβ”€β”€ state.py              # Checkpoint state management
β”‚   β”‚   └── stages/
β”‚   β”‚       β”œβ”€β”€ rag_stage.py      # Stage 1: Parse & index
β”‚   β”‚       β”œβ”€β”€ summary_stage.py  # Stage 2: Extract content
β”‚   β”‚       β”œβ”€β”€ plan_stage.py     # Stage 3: Plan layout
β”‚   β”‚       └── generate_stage.py # Stage 4: Generate images
β”‚   β”‚
β”‚   β”œβ”€β”€ raganything/
β”‚   β”‚   β”œβ”€β”€ raganything.py        # RAG processor
β”‚   β”‚   └── parser.py             # Document parser
β”‚   β”‚
β”‚   β”œβ”€β”€ summary/
β”‚   β”‚   β”œβ”€β”€ paper.py              # Paper structure extraction
β”‚   β”‚   └── extractors/           # Figure/table extractors
β”‚   β”‚
β”‚   β”œβ”€β”€ generator/
β”‚   β”‚   β”œβ”€β”€ content_planner.py    # Slide/poster planning
β”‚   β”‚   └── image_generator.py    # Image generation
β”‚   β”‚
β”‚   β”œβ”€β”€ prompts/                  # LLM prompt templates
β”‚   └── utils/                    # Utilities
β”‚
β”œβ”€β”€ api/server.py                 # FastAPI backend
β”œβ”€β”€ frontend/src/                 # React frontend
└── scripts/                      # Shell scripts (start/stop)

πŸ™ Related Open-Sourced Projects


🌟 Found Paper2Slides helpful? Star us on GitHub!

🚀 Turn any document into professional presentations in minutes!

❤️ Thanks for visiting ✨ Paper2Slides!
