Gemma 3 PDF Summarizer

A PDF summarization tool that uses Google's Gemma 3 (via Ollama) to generate structured technical summaries of academic papers, with a focus on extracting and organizing technical details.

Features

📚 arXiv PDF download and processing
🔍 Intelligent text extraction and chunking
🤖 Parallel processing with Gemma 3 LLM
📊 Structured technical summaries
⚡ FastAPI backend with async processing
🌐 Streamlit frontend interface

System Requirements

Python 3.8+
Ollama with Gemma 3 model installed
16GB+ RAM recommended
GPU recommended for faster processing

Installation

Clone the repository:

git clone https://github.com/arjunprabhulal/gemma3_pdf_summarizer.git
cd gemma3_pdf_summarizer

Install dependencies:

pip install -r requirement.txt

Install and run Ollama with Gemma 3:

# Install Ollama from https://ollama.ai
ollama pull gemma3:27b

Usage

Start the FastAPI backend:

python main.py

Start the Streamlit frontend in a new terminal:

streamlit run frontend.py

Access the web interface at http://localhost:8501

Architecture

The system uses a parallel processing architecture to handle large PDFs efficiently:

PDF Processing: Downloads the PDF and extracts its text using PyMuPDF
Text Chunking: Splits the extracted text into chunks sized to fit Gemma 3's context window
Parallel Processing: Processes chunks concurrently, retrying chunks that fail
Summary Generation: Creates structured technical summaries focusing on:
- System Architecture
- Technical Implementation
- Infrastructure & Setup
- Performance Analysis
- Optimization Techniques
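
The chunking and parallel-processing steps above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: chunk_text, summarize_with_retry, and summarize_chunks are hypothetical names; the character-based chunk size, overlap, retry count, and concurrency limit are assumed values; and the summarize callable stands in for whatever function sends a chunk to Gemma 3 through Ollama.

```python
import asyncio
from typing import Awaitable, Callable, List

# Hypothetical type: any async function that turns one chunk into a summary.
Summarizer = Callable[[str], Awaitable[str]]


def chunk_text(text: str, max_chars: int = 4000, overlap: int = 200) -> List[str]:
    """Split text into overlapping character windows (sizes are assumptions)."""
    if max_chars <= overlap:
        raise ValueError("max_chars must be larger than overlap")
    if not text:
        return []
    step = max_chars - overlap
    # Stop early enough that the final window is not just leftover overlap.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + max_chars] for i in range(0, last_start, step)]


async def summarize_with_retry(
    chunk: str, summarize: Summarizer, retries: int = 3, base_delay: float = 0.5
) -> str:
    """Call summarize(chunk), retrying with exponential backoff on failure."""
    for attempt in range(retries):
        try:
            return await summarize(chunk)
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise ValueError("retries must be at least 1")


async def summarize_chunks(
    chunks: List[str],
    summarize: Summarizer,
    max_concurrency: int = 4,
    retries: int = 3,
    base_delay: float = 0.5,
) -> List[str]:
    """Summarize chunks concurrently; results keep the input order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(chunk: str) -> str:
        async with semaphore:  # cap the number of in-flight model calls
            return await summarize_with_retry(chunk, summarize, retries, base_delay)

    return list(await asyncio.gather(*(worker(c) for c in chunks)))
```

The semaphore keeps a local model from being flooded with requests, and asyncio.gather returns the partial summaries in chunk order so they can be merged into the final structured summary.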

API Endpoints

GET /health: Health check endpoint
POST /summarize_arxiv/: Main endpoint for PDF summarization
- Input: {"url": "https://arxiv.org/pdf/paper_id.pdf"}
- Output: Structured technical summary
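
As a rough sketch, the summarization endpoint could be called from Python using only the standard library. The base URL http://localhost:8000 is an assumption (this README does not state the backend port), build_summarize_request and summarize are hypothetical helper names, and the response is returned as raw text because the exact response schema is not documented here.

```python
import json
import urllib.request

# Assumption: the FastAPI backend listens on port 8000; adjust to match main.py.
API_BASE = "http://localhost:8000"


def build_summarize_request(pdf_url: str, api_base: str = API_BASE) -> urllib.request.Request:
    """Build the POST request for /summarize_arxiv/ with the documented JSON body."""
    payload = json.dumps({"url": pdf_url}).encode("utf-8")
    return urllib.request.Request(
        f"{api_base}/summarize_arxiv/",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(pdf_url: str, timeout: float = 600.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_summarize_request(pdf_url)
    # Long timeout (an assumed value): summarizing a full paper can take a while.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the backend running, summarize("https://arxiv.org/pdf/paper_id.pdf") would return the response body, where paper_id is a placeholder for a real arXiv identifier.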

Contributing

Feel free to open issues or submit pull requests for improvements.

License

MIT License
