- Overview
- Features
- Requirements
- Installation
- Usage
- Keyboard Shortcuts
- Slide Types Recognized
- Adaptive Screen Resolution Support
- Configuration Options
- Troubleshooting
- Version History
- License
Enhanced Universal Slide Capture & OCR Renamer is a macOS AppleScript utility that captures presentation slides and renames them based on an analysis of their content. It is aimed at educators, trainers, and students who need to save web-based presentations (such as Google Slides or PowerPoint Online) with filenames that reflect what each slide contains.
This script uses Optical Character Recognition (OCR) to extract text from captured slides, analyzes the content to determine slide type and subject, and applies a consistent naming convention that makes organizing and finding slides easier.
- Multi-browser support: Works with Chrome, Safari, Firefox, and Microsoft Edge
- Intelligent slide classification: Automatically detects different slide types
- Adaptive geometry scaling: Adjusts to different screen resolutions
- Keyboard shortcuts: Pause, resume, or cancel capture with simple key combinations
- Resume capability: Pick up where you left off if capture is interrupted
- Progress tracking: Shows progress bar with time estimates
- Module context awareness: Tracks presentation modules for better naming
- Batch processing: Capture and process entire presentations
- Single file processing: Rename individual screenshots
- Configurable settings: Adjust timing, file paths, and OCR parameters
- Comprehensive logging: Track all operations with customizable log levels
- macOS 11 (Big Sur) or newer
- Tesseract OCR for text extraction
- ImageMagick for image processing
If you don't have Homebrew installed, install it first:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Then install the required dependencies:
brew install tesseract
brew install imagemagick
- Download the script from GitHub
- Save it to a location of your choice (e.g., ~/Applications/Scripts/)
- Make it executable:
chmod +x ~/Applications/Scripts/slide_capture.scpt
Run the script and select "Help & Documentation" from the menu. If the help screen appears, the script is installed correctly.
The script offers three main operating modes:
The first mode, "Capture Slides from Presentation", automates the capture of multiple slides from a presentation:
- Open your presentation in a supported browser (Chrome, Safari, Firefox, Edge)
- Run the script and select "Capture Slides from Presentation"
- Select the browser containing your presentation (or confirm the auto-detected one)
- Enter the number of slides to capture
- Select an output folder
- Let the script run - it will:
- Enter full-screen presentation mode
- Capture each slide
- Extract text via OCR
- Intelligently name and save each slide
- Advance to the next slide automatically
The second mode, "Rename Single Screenshot", processes and renames an existing screenshot:
- Run the script and select "Rename Single Screenshot"
- Select a PNG or JPG file
- The script will analyze the screenshot, extract the title, and rename it according to the detected slide type
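The renaming step can be sketched in Python as follows. This is an illustration of the naming convention shown in this README, not the script's actual AppleScript code; the function name `build_filename` and the rule of replacing every non-alphanumeric run with an underscore are assumptions.

```python
import re

MAX_TITLE_LENGTH = 60  # mirrors the documented maxTitleLength default


def build_filename(index, slide_type, title, max_len=MAX_TITLE_LENGTH, ext="png"):
    """Return a name such as '02_Cyber_Lab_Configuring_Firewall.png'.

    Hypothetical helper: the sanitising rule below is an assumption.
    """
    # Trim the extracted title to the configured maximum length first.
    trimmed = title.strip()[:max_len]
    # Collapse every run of non-alphanumeric characters into one underscore.
    parts = [re.sub(r"[^A-Za-z0-9]+", "_", p).strip("_") for p in (slide_type, trimmed)]
    parts = [p for p in parts if p]
    if not parts:
        # Nothing usable was extracted, so fall back to a placeholder name.
        parts = ["Untitled"]
    return f"{index:02d}_" + "_".join(parts) + f".{ext}"
```

For example, `build_filename(2, "Cyber Lab", "Configuring Firewall")` returns `02_Cyber_Lab_Configuring_Firewall.png`.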
The third mode, "Settings & Configuration", adjusts script options:
- Run the script and select "Settings & Configuration"
- Adjust parameters like:
- Log level
- Delay between slides
- Capture delay
- Maximum title length
- Save your settings or reset to defaults
During capture mode, you can use these keyboard shortcuts:
- Option+P: Pause capture
- Option+R: Resume capture
- Option+C: Cancel capture
The script can identify and specially name these slide types:
| Slide Type | Detection Method | Example Filename |
|---|---|---|
| Regular Content | Default | 01_Module_Title_Content.png |
| Cyber Lab | "Cyber Lab" in content | 02_Cyber_Lab_Configuring_Firewall.png |
| Knowledge Check | "Knowledge Check" in content | 03_Knowledge_Check_1.png |
| Knowledge Check Answer | "Knowledge Check Answer" | 04_Knowledge_Check_Answer_1.png |
| Pulse Check | "Pulse Check" in content | 05_Pulse_Check_Module.png |
| Summary | "Summary" in content | 06_Summary_Module.png |
| Break | "Break" in content | 07_Break.png |
| Real World Scenario | "Real World Scenario" in sidebar | 08_Real_World_Scenario_Module.png |
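A minimal Python sketch of how the detection rules in the table could be applied is shown below. It is not the script's own code. It assumes case-insensitive substring matching over the full OCR text and ignores where on the slide the text appears (the table notes that "Real World Scenario" is looked for in the sidebar). The names `SLIDE_TYPES` and `classify_slide` are hypothetical.

```python
# Checked in order, so the more specific "Knowledge Check Answer"
# is tested before "Knowledge Check".
SLIDE_TYPES = [
    "Knowledge Check Answer",
    "Knowledge Check",
    "Cyber Lab",
    "Pulse Check",
    "Real World Scenario",
    "Summary",
    "Break",
]


def classify_slide(ocr_text):
    """Return the first matching slide type, or 'Regular Content' by default."""
    # Normalise whitespace and case so OCR line breaks do not hide a phrase.
    text = " ".join(ocr_text.split()).lower()
    for slide_type in SLIDE_TYPES:
        if slide_type.lower() in text:
            return slide_type
    return "Regular Content"
```

The ordering matters: every "Knowledge Check Answer" slide also contains the phrase "Knowledge Check", so the longer phrase has to be tested first. Plain substring matching also means a short keyword such as "Break" will match inside longer words, which is a limitation of this sketch.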
The script automatically detects your screen resolution and scales the capture regions accordingly, so that the same parts of the slide are captured on different display configurations:
- Standard resolution displays (1920×1080)
- Retina displays
- 4K/5K displays
- Ultrawide monitors
The scaling algorithm uses a base resolution of 3840×2160 to calculate proportions for different screen sizes.
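The proportional scaling can be illustrated with a short Python sketch. Only the 3840×2160 base resolution comes from this README; the `(x, y, width, height)` region layout, the independent horizontal and vertical factors, and the name `scale_region` are assumptions.

```python
BASE_WIDTH, BASE_HEIGHT = 3840, 2160  # base resolution stated in this README


def scale_region(region, screen_width, screen_height):
    """Scale an (x, y, width, height) region defined at the base resolution.

    Hypothetical helper: horizontal and vertical factors are computed
    separately, which is an assumption about the script's behaviour.
    """
    sx = screen_width / BASE_WIDTH
    sy = screen_height / BASE_HEIGHT
    x, y, w, h = region
    return (round(x * sx), round(y * sy), round(w * sx), round(h * sy))
```

On a 1920×1080 display both factors are 0.5, so `scale_region((200, 100, 3000, 400), 1920, 1080)` returns `(100, 50, 1500, 200)`.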
The script stores configuration in a plist file at ~/Library/Preferences/com.slidecapture.config.plist.
Configurable parameters include:
| Parameter | Default | Description |
|---|---|---|
| logLevel | 1 | 0=Debug, 1=Info, 2=Warning, 3=Error |
| delayBetweenSlides | 1.5 | Seconds to wait between slide advances |
| captureDelay | 0.5 | Seconds to wait before capturing screenshot |
| maxTitleLength | 60 | Maximum length for slide titles |
| defaultNumSlides | 20 | Default number of slides when starting new capture |
| defaultOutputFolder | ~/Downloads/Slides | Default output location |
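To inspect the stored settings outside the script, something like the following Python sketch may help. It assumes the plist keys match the parameter names in the table exactly, which this README does not confirm; `load_config` is a hypothetical helper, and missing keys fall back to the documented defaults.

```python
import plistlib
from pathlib import Path

# Defaults copied from the table above.
DEFAULTS = {
    "logLevel": 1,
    "delayBetweenSlides": 1.5,
    "captureDelay": 0.5,
    "maxTitleLength": 60,
    "defaultNumSlides": 20,
    "defaultOutputFolder": "~/Downloads/Slides",
}

CONFIG_PATH = Path("~/Library/Preferences/com.slidecapture.config.plist").expanduser()


def load_config(path=CONFIG_PATH):
    """Return the defaults overlaid with any recognised keys found in the plist."""
    config = dict(DEFAULTS)
    path = Path(path)
    if path.exists():
        with path.open("rb") as fp:
            stored = plistlib.load(fp)
        # Assumption: keys in the file use the same names as the table.
        config.update({k: v for k, v in stored.items() if k in DEFAULTS})
    return config
```

Calling `load_config()` when no configuration file has been saved yet simply returns the defaults.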
Symptoms: Slides are named "Untitled" or text is incorrectly extracted.
Solutions:
- Ensure your screen brightness is adequate
- Use presentations with good contrast
- Try adjusting capture delay to ensure slides are fully loaded
Symptoms: Script fails to advance slides or captures the same slide multiple times.
Solutions:
- Increase the delay between slides
- Ensure the presentation is in full-screen mode
- Check if keyboard focus is on the browser window
Symptoms: "Required tools not found" error message.
Solutions:
- Verify Tesseract and ImageMagick are installed:
which tesseract
which convert
- Try reinstalling the dependencies:
brew reinstall tesseract imagemagick
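The same dependency check can be scripted. The Python sketch below looks for the two executables named above on the current PATH. It is an illustration rather than the check the script itself performs, and `missing_tools` is a hypothetical name.

```python
import shutil

# Executable names taken from the "which" commands above.
REQUIRED_TOOLS = ("tesseract", "convert")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Required tools not found:", ", ".join(missing))
    else:
        print("All required tools found")
```

An empty list from `missing_tools()` means both tools were located.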
Check the log file for detailed troubleshooting information:
~/Library/Logs/slide_capture.log
Increase log verbosity by setting logLevel to 0 (Debug) in Settings.
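The effect of logLevel can be pictured with a small Python sketch. It assumes the usual threshold behaviour, where a message is written only if its level is at or above the configured logLevel. The entry format and the function names are hypothetical and may not match the actual log file.

```python
# Level numbers as documented for the logLevel setting.
LEVELS = {0: "DEBUG", 1: "INFO", 2: "WARNING", 3: "ERROR"}


def should_log(message_level, log_level=1):
    """Assumed rule: write messages at or above the configured threshold."""
    return message_level >= log_level


def format_entry(message_level, message):
    """Hypothetical entry layout, used here only for illustration."""
    return f"[{LEVELS[message_level]}] {message}"
```

Under this assumption, the default logLevel of 1 hides Debug messages, and setting it to 0 lets every message through.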
- Implemented functional keyboard handlers using Carbon framework
- Fixed configuration loading/saving functionality
- Enhanced error recovery with specific strategies for different error types
- Added Settings & Configuration mode
- Added image capture verification
- Improved slide advance recovery
- Added comprehensive inline documentation
- Added multi-browser support
- Implemented resume capability
- Added progress tracking with time estimation
- Enhanced OCR processing with multiple regions
- Initial release with basic slide capture and OCR
This project is licensed under the MIT License - see the LICENSE file for details.
© 2025 Your Organization. All rights reserved.
