Automatically extract meaningful lecture slides from YouTube videos using AI and generate a clean PDF.
Save time, skip screenshots, and get organized notes with a single click.
- 🎥 Download lectures directly from YouTube
- 🧠 Detect and extract unique slides using deep frame comparison
- 🔍 Integrated OCR for reading slide content
- 📄 Auto-generate a clean PDF of extracted slides
- 🖼️ GUI built with Tkinter
- 📊 Progress bar with real-time updates
- 🧵 Threaded download & extraction (non-blocking)
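The progress bar and non-blocking behaviour listed above could be wired up roughly as follows. This is a minimal sketch, not the code in app.py: the names `run_in_background` and `drain` are hypothetical, and it assumes the work can be split into a list of callables. The worker thread pushes percent-complete values onto a queue so that only the GUI thread ever touches widgets.

```python
import queue
import threading
from typing import Callable, List, Optional


def run_in_background(steps: List[Callable[[], None]],
                      progress_q: "queue.Queue[Optional[int]]") -> threading.Thread:
    """Run `steps` in a daemon thread, reporting percent complete on `progress_q`."""
    def worker() -> None:
        for done, step in enumerate(steps, start=1):
            step()
            progress_q.put(100 * done // len(steps))
        progress_q.put(None)  # sentinel: all steps finished

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def drain(progress_q: "queue.Queue[Optional[int]]") -> List[Optional[int]]:
    """Collect every update currently waiting, without blocking the caller."""
    updates: List[Optional[int]] = []
    while True:
        try:
            updates.append(progress_q.get_nowait())
        except queue.Empty:
            return updates
```

In a Tkinter app, a callback scheduled with `root.after(100, poll)` could call `drain`, set the progress bar to the latest value, and reschedule itself until it sees the `None` sentinel. A real implementation would also need to report exceptions raised inside a step, which this sketch does not handle.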
LectureCapture/
│
├── slide_extractor.py # Core logic for detecting unique slides
├── app.py # Tkinter GUI interface
├── requirements.txt # All dependencies
├── output/ # Folder for storing extracted slides and final PDF
└── README.md
Clone or download the repository to your local machine:
git clone https://github.com/divA2805/LectureCapture.git
cd LectureCapture
Make sure slide_extractor.py is in the same directory as app.py, or is otherwise correctly referenced.
Make sure you have Python ≥ 3.8 installed.
pip install -r requirements.txt
Also, install Tesseract OCR:
- Windows: Tesseract Installer
- Ubuntu: sudo apt install tesseract-ocr
python app.py
A GUI will open where you can:
- Paste the YouTube link
- Choose a download location
- Click "Start" to begin download, slide extraction, and PDF generation
- The video is downloaded using yt-dlp
- Frames are sampled at intervals
- Similar or duplicate frames are removed using OpenCV comparison
- OCR reads any visible slide text (optional, for enhancements)
- Final slides are compiled into a professional PDF using ReportLab
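The sampling and duplicate-removal steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the logic in slide_extractor.py: the function names, the 2-second interval, and the threshold of 10 are all hypothetical, and frames are modelled as flat sequences of grayscale pixel values so the example runs without OpenCV.

```python
from typing import Callable, List, Sequence


def sample_indices(total_frames: int, fps: float, interval_s: float) -> List[int]:
    """Return the frame indices to inspect, one every `interval_s` seconds."""
    step = max(1, round(fps * interval_s))
    return list(range(0, total_frames, step))


def mean_abs_diff(a: Sequence[int], b: Sequence[int]) -> float:
    """Mean absolute pixel difference between two equally sized grayscale frames.

    With OpenCV arrays the same quantity could be computed as
    cv2.absdiff(a, b).mean().
    """
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_unique(frames: Sequence[Sequence[int]],
                  diff: Callable[[Sequence[int], Sequence[int]], float] = mean_abs_diff,
                  threshold: float = 10.0) -> List[int]:
    """Keep a frame only if it differs from the last kept frame by more than `threshold`."""
    kept: List[int] = []
    for i, frame in enumerate(frames):
        if not kept or diff(frames[kept[-1]], frame) > threshold:
            kept.append(i)
    return kept


# Five tiny 4-pixel "frames": two near-black, two near-white, one black again.
frames = [[0] * 4, [1] * 4, [200] * 4, [201] * 4, [0] * 4]
print(sample_indices(300, 30.0, 2.0))  # [0, 60, 120, 180, 240]
print(select_unique(frames))           # [0, 2, 4]
```

Comparing each frame against the last kept frame, rather than against its immediate predecessor, keeps slow fades or animations from slipping through as a chain of small changes. The right threshold depends on video quality and would need tuning.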
✅ Clean slides PDF from a 1-hour technical lecture in under 2 minutes
📁 Slides saved in output/ with timestamped filenames
🧾 Final PDF available for download
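One way the timestamped filenames mentioned above might be produced is shown below. The naming scheme and the `slide_filename` helper are hypothetical, and the sketch assumes the timestamp is the slide's position in the video rather than the wall-clock time of extraction.

```python
def slide_filename(index: int, seconds: float, ext: str = "png") -> str:
    """Build a sortable filename from a slide number and its video position."""
    total = int(seconds)            # drop fractional seconds
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"slide_{index:03d}_{hours:02d}h{minutes:02d}m{secs:02d}s.{ext}"


print(slide_filename(7, 3725.4))  # slide_007_01h02m05s.png
```

Zero-padding the index keeps the files in slide order when the output folder is sorted by name.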
Made with ❤️