📚 A Python tool to scrape book titles and prices by category from Books to Scrape. Save results in TXT or JSON format.

AdemCE-eng/Bookscraper

BookScraper 📚

BookScraper is a Python command-line tool that scrapes book titles and prices from Books to Scrape. It allows you to fetch books from a specific category and save the results in a well-formatted text file or a structured JSON file.


🔧 Features

  • Scrapes book titles and prices by category
  • Uses requests, BeautifulSoup, re, and tabulate for web scraping and formatting
  • Saves results in either .txt (pretty table) or .json format
  • User-friendly and interactive command-line interface
  • Supports UTF-8 encoding for file outputs

📦 Requirements

Make sure you have Python 3.6+ installed. Then, install the required libraries:

pip install requests beautifulsoup4 tabulate

🚀 Usage

Run the script in your terminal:

python book_scraper.py

You will be prompted to:

  1. Enter the book category you want to scrape (e.g., travel, humor, or novels).
  2. Choose whether to save the results as a .txt file.
  3. Choose whether to save the results as a .json file.
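
The prompt sequence above could be sketched roughly as follows. This is a hypothetical illustration, not the code in book_scraper.py: the function names `ask_yes_no` and `collect_choices`, the prompt wording, and the accepted answers (y/yes) are all assumptions.

```python
def ask_yes_no(prompt, input_fn=input):
    """Return True if the user answers y/yes (case-insensitive), else False."""
    answer = input_fn(prompt).strip().lower()
    return answer in ("y", "yes")


def collect_choices(input_fn=input):
    """Ask the three questions in order and return the answers as a dict."""
    # Hypothetical prompts; the real script's wording may differ.
    category = input_fn("Category to scrape: ").strip().lower()
    save_txt = ask_yes_no("Save results as .txt? (y/n): ", input_fn)
    save_json = ask_yes_no("Save results as .json? (y/n): ", input_fn)
    return {"category": category, "save_txt": save_txt, "save_json": save_json}
```

Passing `input_fn` as a parameter is only a convenience for trying the flow without typing at a terminal; calling `collect_choices()` with no arguments reads from the keyboard.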

🧠 Notes

  • If the category name has more than one word, join the words with a hyphen (-). Examples:

    • ✅ science-fiction
    • ✅ historical-fiction
  • The category name must exist on Books to Scrape.

  • Important: You must set your own User-Agent inside the script before running it.
    To do this, replace the value of the user-agent key in the headers dictionary with your browser's User-Agent string.

Example:

headers = {
    'user-agent': 'your actual User-Agent here'
}

Tip: To find your User-Agent, you can visit https://www.whatismybrowser.com/ and copy the string.

  • The script will not work without a valid User-Agent.
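
As a small aid for the hyphen rule in the notes above, here is a minimal sketch that turns a free-form category name into the hyphenated form. The helper `to_category_slug` is hypothetical and is not part of book_scraper.py; it only normalizes the text and does not check that the category exists on Books to Scrape.

```python
import re


def to_category_slug(name):
    """Lowercase a category name and join its words with single hyphens."""
    # Split on any run of whitespace, underscores, or hyphens.
    words = re.split(r"[\s_-]+", name.strip().lower())
    return "-".join(word for word in words if word)
```

For example, `to_category_slug("Science Fiction")` gives `science-fiction`, which is the form the prompt expects.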

πŸ“ Output Examples

📄 Text file (TXT):

Type: travel
Date: 27/04/2025
================================================

╒════════════════════════════════╤════════════╕
│ Title                          │ Price      │
╞════════════════════════════════╪════════════╡
│ It's Only the Himalayas        │ £45.17     │
│ Full Moon over Noah's Ark      │ £49.43     │
│ See America: A Celebration...  │ £48.87     │
╘════════════════════════════════╧════════════╛

πŸ—‚οΈ JSON file:

[
  { "Title": "It's Only the Himalayas", "Price": "£45.17" },
  { "Title": "Full Moon over Noah's Ark", "Price": "£49.43" },
  { "Title": "See America: A Celebration...", "Price": "£48.87" }
]
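
A file in this shape can be written with the standard json module. The sketch below is one possible way to do it, assuming the results are held as a list of dictionaries with "Title" and "Price" keys; the function name `save_books_json` is hypothetical and the script's own saving code may differ.

```python
import json


def save_books_json(books, path):
    """Write a list of {"Title": ..., "Price": ...} dicts to a UTF-8 JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        # ensure_ascii=False keeps "£" as a literal character instead of "\u00a3".
        json.dump(books, handle, ensure_ascii=False, indent=2)
```

Opening the file with `encoding="utf-8"` matters here because the prices contain the pound sign, which is outside ASCII.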

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.
