🖼️ Unsplash Image Scraper

Download free high-quality images from Unsplash with ease

English • Español

English

📋 Table of Contents

Features
Prerequisites
Installation
Usage
Configuration
Project Structure
Examples
Troubleshooting
Contributing
License
Acknowledgments

✨ Features

🔍 Smart Search - Search for any topic and download related images
🚀 Automated Scrolling - Automatically loads more images to meet your requirements
📦 Batch Download - Download multiple images in one go
⚙️ Configurable - Easy to customize settings and parameters
🎯 CLI Support - Both command-line and interactive modes
📝 Comprehensive Logging - Track the scraping process with detailed logs
🧹 Clean Code - Well-structured, documented, and following PEP 8 standards
🔒 Error Handling - Robust error handling for network issues and timeouts
🎨 Free License Only - Only downloads images with free licenses from Unsplash

🔧 Prerequisites

Before you begin, ensure you have the following installed:

Python 3.8 or higher
Google Chrome Browser (latest version recommended)
ChromeDriver - managed automatically by Selenium

Note: This scraper uses Selenium WebDriver, which downloads and manages a matching ChromeDriver for you.

📥 Installation

Clone the repository

git clone https://github.com/p0sadas/unsplash-image-scraper.git
cd unsplash-image-scraper

Create a virtual environment (recommended)

# Windows
python -m venv venv
venv\Scripts\activate

# Linux/Mac
python3 -m venv venv
source venv/bin/activate

Install dependencies

pip install -r requirements.txt

🚀 Usage

Interactive Mode

Simply run the main script without arguments:

python main.py

You'll be prompted to enter:

Search query (e.g., "mountains", "technology", "animals")
Number of images to download

Command-Line Mode

# Basic usage (runs in headless mode by default)
python main.py -q "cats" -n 10

# With custom output directory
python main.py -q "nature" -n 25 -o "my_images"

# Show browser window (disable headless mode)
python main.py -q "technology" -n 15 --no-headless

Available Arguments

| Argument | Short | Description | Required |
|---|---|---|---|
| `--query` | `-q` | Search query (e.g., 'cat', 'nature') | No* |
| `--num-images` | `-n` | Number of images to download | No* |
| `--output` | `-o` | Output directory (default: downloads) | No |
| `--no-headless` | - | Show browser window (headless is default) | No |
| `--help` | `-h` | Show help message | No |

*If not provided, interactive mode will be used.
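
The arguments above map naturally onto `argparse`. The following is a minimal sketch, not the project's actual `main.py`: the names `build_parser` and `needs_interactive_mode` are hypothetical, and the fallback rule simply mirrors the footnote above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="Download images from Unsplash")
    parser.add_argument("-q", "--query", help="Search query, e.g. 'cat' or 'nature'")
    parser.add_argument("-n", "--num-images", type=int,
                        help="Number of images to download")
    parser.add_argument("-o", "--output", default="downloads",
                        help="Output directory")
    # store_false: args.headless stays True unless --no-headless is passed
    parser.add_argument("--no-headless", dest="headless", action="store_false",
                        help="Show the browser window")
    return parser


def needs_interactive_mode(args: argparse.Namespace) -> bool:
    """Assumed rule: prompt the user when the query or the count is missing."""
    return args.query is None or args.num_images is None
```

Calling `build_parser().parse_args(["-q", "cats", "-n", "10"])` yields a namespace with `headless` left at `True`, which matches the headless-by-default behavior described above.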

⚙️ Configuration

You can customize the scraper behavior by modifying src/config.py:

# Timeouts
WEBDRIVER_TIMEOUT = 20  # seconds
SCROLL_PAUSE_TIME = 0.3  # seconds between scrolls

# Output
DOWNLOAD_DIR = BASE_DIR / "downloads"
IMAGE_FORMAT = "jpg"

# Logging
LOG_LEVEL = "INFO"  # DEBUG, INFO, WARNING, ERROR
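
To illustrate how `SCROLL_PAUSE_TIME` might be used, here is a minimal sketch of a scroll-until-enough loop. It is an assumption about the approach, not the code in `src/unsplash_scraper.py`: `scroll_until`, `count_images`, and `max_scrolls` are hypothetical names. The only Selenium call involved is `driver.execute_script`, so any object offering that method will work.

```python
import time
from typing import Callable

SCROLL_PAUSE_TIME = 0.3  # seconds between scrolls, as in src/config.py


def scroll_until(driver, count_images: Callable[[], int], target: int,
                 pause: float = SCROLL_PAUSE_TIME, max_scrolls: int = 50) -> int:
    """Scroll to the page bottom until `target` images are loaded.

    Stops early when a scroll adds no new images or `max_scrolls` is reached.
    Returns the number of images counted at the end.
    """
    found = count_images()
    for _ in range(max_scrolls):
        if found >= target:
            break
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazy-loaded images time to appear
        new_found = count_images()
        if new_found == found:
            break  # nothing new loaded; assume the results are exhausted
        found = new_found
    return found
```

In real use, `count_images` would be a callable that counts matching image elements, for example via `driver.find_elements`. A larger pause is gentler on slow connections but makes large downloads take longer.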

📁 Project Structure

unsplash-image-scraper/
├── src/
│   ├── __init__.py           # Package initialization
│   ├── config.py             # Configuration settings
│   └── unsplash_scraper.py   # Main scraper class
├── downloads/                # Downloaded images (created automatically)
├── main.py                   # Entry point script
├── requirements.txt          # Python dependencies
├── .gitignore                # Git ignore rules
├── LICENSE                   # MIT License
└── README.md                 # This file

💡 Examples

Example 1: Download Cat Images

python main.py -q "cats" -n 20

Output:

🔍 Searching for 'cats'...
📊 Target: 20 images
📁 Output: C:\path\to\downloads

✅ Found 20 images
📥 Downloading images...

✨ Successfully downloaded 20 images!
📂 Images saved to: C:\path\to\downloads

Example 2: Using as a Python Module

from src.unsplash_scraper import UnsplashScraper
from pathlib import Path

# Create scraper instance
with UnsplashScraper(headless=True) as scraper:
    # Scrape image URLs
    urls = scraper.scrape_images("mountains", num_images=10)

    # Download images
    output = Path("my_mountains")
    scraper.download_images(urls, output_dir=output)

print(f"Downloaded {len(urls)} images!")
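
The `with` statement above suggests that `UnsplashScraper` follows the context manager protocol, so the browser is closed even if scraping raises an error. Here is a minimal sketch of that pattern using a hypothetical stand-in class (`DemoScraper` is not part of the project):

```python
class DemoScraper:
    """Hypothetical stand-in showing the context manager pattern."""

    def __init__(self, headless: bool = True) -> None:
        self.headless = headless
        self.is_open = False

    def __enter__(self) -> "DemoScraper":
        # A real scraper would start the WebDriver here.
        self.is_open = True
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # A real scraper would call driver.quit() here.
        self.is_open = False
        return False  # do not swallow exceptions


def run_demo() -> bool:
    """Return True if the resource was released despite an error."""
    scraper = DemoScraper()
    try:
        with scraper:
            raise RuntimeError("simulated scraping failure")
    except RuntimeError:
        pass
    return not scraper.is_open
```

Because `__exit__` runs on both normal and exceptional exit, cleanup does not depend on the caller remembering to close the browser.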

Example 3: Run with Browser Visible

# Show the browser window (useful for debugging)
python main.py -q "abstract art" -n 30 --no-headless

🔍 Troubleshooting

Issue: "ChromeDriver not found"

Solution: Selenium 4.6 and later manage ChromeDriver automatically through Selenium Manager. Make sure you have a recent version installed:

pip install --upgrade selenium

Issue: "TimeoutException"

Solution: This usually means the page took too long to load. Try:

Increasing WEBDRIVER_TIMEOUT in src/config.py
Checking your internet connection
Ensuring Unsplash is accessible in your region
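
When calling the scraper from your own code, one way to apply the first suggestion is to retry with a progressively longer timeout. This is a sketch under assumptions: `retry_with_longer_timeout` is a hypothetical helper, and the built-in `TimeoutError` stands in for Selenium's `TimeoutException`, which you would pass via `retry_on` in real code.

```python
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_longer_timeout(
    action: Callable[[float], T],
    timeout: float = 20.0,
    factor: float = 2.0,
    attempts: int = 3,
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
) -> T:
    """Call `action(timeout)`; on a timeout, retry with a longer timeout."""
    for attempt in range(attempts):
        try:
            return action(timeout)
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            timeout *= factor
    raise ValueError("attempts must be at least 1")
```

Here `action` is any callable that accepts a timeout in seconds, such as a function that loads the search page and waits for results.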

Issue: "No images found"

Solution:

Try a different search query
Ensure you're searching for topics that exist on Unsplash
Check if Unsplash has changed their page structure (XPath selectors may need updating)

Issue: "Download fails for some images"

Solution: This is normal - some images may be temporarily unavailable. The scraper will log errors and continue with other images.
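
The log-and-continue behavior described here can be sketched as follows. This is an illustration, not the project's `download_images`: the function name `download_all`, the `fetch` parameter, and the file naming scheme are assumptions. By default it uses `urllib.request.urlopen` from the standard library.

```python
import logging
from pathlib import Path
from typing import Callable, Iterable, List
from urllib.request import urlopen

logger = logging.getLogger(__name__)


def _fetch(url: str) -> bytes:
    """Default fetcher using only the standard library."""
    with urlopen(url, timeout=20) as response:
        return response.read()


def download_all(urls: Iterable[str], output_dir: Path,
                 fetch: Callable[[str], bytes] = _fetch) -> List[Path]:
    """Save each URL to `output_dir`, skipping any that fail."""
    output_dir.mkdir(parents=True, exist_ok=True)
    saved: List[Path] = []
    for index, url in enumerate(urls, start=1):
        try:
            data = fetch(url)
        except Exception as exc:  # network errors, HTTP errors, timeouts
            logger.error("Skipping %s: %s", url, exc)
            continue
        path = output_dir / f"image_{index:03d}.jpg"
        path.write_bytes(data)
        saved.append(path)
    return saved
```

Returning the list of saved paths makes it easy to report how many downloads actually succeeded, as opposed to how many URLs were found.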

🤝 Contributing

Contributions are welcome! Here's how you can help:

Fork the repository
Create a feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

Please ensure your code follows PEP 8 style guidelines and includes appropriate documentation.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

⚠️ Disclaimer

This tool is for educational purposes only. Please respect Unsplash's Terms of Service and API Guidelines. Always give credit to photographers when using their images.

🙏 Acknowledgments

Unsplash for providing free high-quality images
Selenium for web automation capabilities
The open-source community for inspiration and support

Made with ❤️ by Angel Posadas

⭐ Star this repo if you found it helpful!
