A personal tool using Python's Scrapy framework to scrape Best Buy's product pages for RTX 3080 TIs and notify if available/not sold out.

gamemann/BestBuy-Parser

BestBuy Parser

Description

My first project using Python's Scrapy framework.

I use this project personally, for myself and a couple of friends. It scrapes a product listing page on Best Buy that lists RTX 3080 TIs and scans each product entry. If the c-button-disable class doesn't exist within an entry (indicating the product is available and not sold out), it emails the list of users configured in the settings.py file. Each product ID is tracked in SQLite so that users aren't emailed more than once about the same product.
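The two checks described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: the function names `is_available` and `mark_if_new`, the `notified` table, and the idea of passing the button's `class` attribute as a string are all assumptions made for the example.

```python
import sqlite3

# Class name taken from the description above; treated here as the sold-out marker.
SOLD_OUT_CLASS = "c-button-disable"


def is_available(css_classes):
    """Return True when the sold-out marker is absent from a class attribute string."""
    return SOLD_OUT_CLASS not in css_classes.split()


def mark_if_new(conn, product_id):
    """Record a product ID and return True only the first time it is seen.

    The table name and schema are hypothetical.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS notified (id TEXT PRIMARY KEY)")
    cur = conn.execute(
        "INSERT OR IGNORE INTO notified (id) VALUES (?)", (product_id,)
    )
    conn.commit()
    # rowcount is 1 for a fresh insert and 0 when the ID already existed.
    return cur.rowcount == 1


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    if is_available("c-button c-button-primary") and mark_if_new(conn, "6462956"):
        print("would send a notification")
```

Using a primary key with `INSERT OR IGNORE` keeps the "have we emailed about this already?" check and the bookkeeping in a single statement.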

Requirements

The Scrapy framework is required and may be installed with the following command.

python3 -m pip install scrapy

Settings

Settings are configured in the src/bestbuy_parser/bestbuy_parser/settings.py file. The following are defaults.

# General Scrapy settings.
BOT_NAME = 'bestbuy_parser'

SPIDER_MODULES = ['bestbuy_parser.spiders']
NEWSPIDER_MODULE = 'bestbuy_parser.spiders'

TELNETCONSOLE_ENABLED = False
LOG_LEVEL = 'ERROR'

# The User Agent used to crawl.
USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'

# Obey robots.txt rules
ROBOTSTXT_OBEY = True

# Best Buy Parser-specific settings.

# The email's subject to send.
MAIL_SUBJECT = "RTX 3080 TI In Stock On Best Buy!"

# Where the email is coming from.
MAIN_FROM = "test@domain.com"

# The email body.
MAIL_BODY = '<html><body><ul><li><a href="https://www.bestbuy.com{link}">{name}</a></li><li>{price}</li></ul></body></html>'

# Recipients to send to.
MAIL_TO = [
    'test@domain2.com'
]

# Users are not notified about a product labeled as available if its price exceeds this value.
MAX_PRICE = 1500.00

# How often to scan in seconds.
SCAN_TIME = 5.0

# Whether to print to `stdout` when a product isn't sold out/valid and a user is being emailed.
PRINT_WHEN_NOT_SOLDOUT = True
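To illustrate how the mail settings above might fit together, here is a small sketch that fills in the `{link}`, `{name}`, and `{price}` placeholders of `MAIL_BODY` and applies the `MAX_PRICE` cutoff. The helper names `parse_price` and `build_message`, the assumed price format (`$1,199.99`), and the use of `email.mime` are assumptions for this example; the project itself may build and send mail differently.

```python
from email.mime.text import MIMEText

# Values copied from the default settings shown above.
MAIL_SUBJECT = "RTX 3080 TI In Stock On Best Buy!"
MAIL_BODY = (
    '<html><body><ul><li><a href="https://www.bestbuy.com{link}">{name}</a></li>'
    "<li>{price}</li></ul></body></html>"
)
MAX_PRICE = 1500.00


def parse_price(text):
    """Convert a price string such as '$1,199.99' to a float (assumed format)."""
    return float(text.replace("$", "").replace(",", "").strip())


def build_message(name, link, price_text, sender, recipients):
    """Return an HTML email for the product, or None when it exceeds MAX_PRICE."""
    if parse_price(price_text) > MAX_PRICE:
        return None
    body = MAIL_BODY.format(link=link, name=name, price=price_text)
    msg = MIMEText(body, "html")
    msg["Subject"] = MAIL_SUBJECT
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    return msg


if __name__ == "__main__":
    message = build_message(
        "Example RTX 3080 TI", "/site/example", "$1,199.99",
        "test@domain.com", ["test@domain2.com"],
    )
    print(message.as_string() if message else "over MAX_PRICE, skipped")
```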

Running The Program

Change the working directory to src/bestbuy_parser/bestbuy_parser with cd, then run the following.

python3 parse.py

This will run the program until a keyboard interrupt.
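Conceptually, "run until a keyboard interrupt" with a `SCAN_TIME` delay resembles the loop below. This is a simplified sketch only: the real parse.py drives Scrapy and may schedule scans quite differently, and `run_forever`, `scan_once`, and `max_scans` are hypothetical names introduced for illustration.

```python
import time


def run_forever(scan_once, interval=5.0, max_scans=None):
    """Call scan_once repeatedly, sleeping `interval` seconds between scans.

    Stops on Ctrl+C (KeyboardInterrupt) or after max_scans scans when given.
    Returns the number of completed scans.
    """
    scans = 0
    try:
        while max_scans is None or scans < max_scans:
            scan_once()
            scans += 1
            time.sleep(interval)
    except KeyboardInterrupt:
        # Treat Ctrl+C as a normal shutdown request.
        pass
    return scans
```

Calling `run_forever(scan, interval=5.0)` with some `scan` function would mirror the default `SCAN_TIME`; the `max_scans` argument exists only so the loop can be exercised in a bounded way.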

Systemd Service

A systemd service is included in the systemd/ directory. It assumes you cloned the repository into /usr/src; if you cloned it elsewhere, you will need to edit the systemd file accordingly.

You may install the systemd service by running the following command as root (or with sudo).

sudo make install

Credits
