Usage
bloshi uses two components:
- scraper: component for parsing data
- bloshi: component for storing, viewing and working with the parsed data
scraper uses so-called spiders to parse data; bloshi stores the spider definitions and the data the spiders parse.
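To make the split concrete, here is a minimal sketch of how the scraper component could hand parsed items to bloshi for storage, assuming a Scrapy item pipeline backed by bloshi's Django models; the pipeline class, the `articles.models.Article` model and its fields are hypothetical names, not the actual bloshi code.

```python
# Hypothetical glue between the two components: a Scrapy item pipeline that
# writes parsed items into bloshi's (Django) database. The pipeline class,
# the app/model names and the fields are illustrative, not the actual bloshi code.
import os

import django

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'bloshi.settings')  # assumed settings path
django.setup()

from articles.models import Article  # assumed model


class BloshiStoragePipeline:
    def process_item(self, item, spider):
        # mirrors the -a save=1 spider argument shown further down this page
        if getattr(spider, 'save', '0') in ('1', True):
            Article.objects.update_or_create(
                shop_code=item.get('shop'),
                code=item.get('code'),
                defaults={'name': item.get('name'), 'price': item.get('price')},
            )
        return item
```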
Add shop data (a sketch of the corresponding records follows this list):
- Add availabilities
- Add categories
- Create a shop
- Add shop availabilities: connect them to availabilities; add the keyword that identifies each availability in the parsed data
- Add shop categories: connect them to categories; add the URL from which data is parsed
- Add spider: connect it to the shop; add XPaths for the next page, the item selector, and the fields
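The same steps can also be read as plain database records. The sketch below shows them from the Django shell (`python manage.py shell`), assuming hypothetical model and field names such as `Shop`, `Category`, `ShopCategory` and `ShopAvailability`; in practice the records are created through the admin interface.

```python
# Hypothetical sketch of the "add shop data" steps; all model and field names
# are assumptions, not the actual bloshi schema.
from shops.models import (Availability, Category, Shop,
                          ShopAvailability, ShopCategory)

in_stock = Availability.objects.create(name='In stock')
pc = Category.objects.create(code='PC', name='Computers')

shop = Shop.objects.create(code='AHL', name='Example shop')

# keyword that identifies this availability in the parsed data
ShopAvailability.objects.create(shop=shop, availability=in_stock,
                                keyword='available')

# URL from which the category data is parsed
ShopCategory.objects.create(shop=shop, category=pc,
                            url='https://example.com/computers/')

# the spider itself (XPaths, fields) is configured in the next step
```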
Add spider data (a field-level sketch follows this list):
- Connect the spider to a shop
- Add an initial request URL: for example, one that switches the listing to list or grid view
- Add XPaths for the next page and the item selector
- Add field data: input and output processors, type, XPath
- Check "parse detailed info" if needed; when enabled, the spider follows each item URL and parses fields with the 'detail' type; note that this slows down parsing
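For orientation, the sketch below shows what this per-field configuration amounts to once a spider runs, using plain Scrapy item loaders; all XPaths, field names and processors are made up for illustration, since bloshi reads the real values from its database.

```python
# Hypothetical sketch of the per-field spider configuration at parse time;
# every name, XPath and processor below is illustrative only.
import scrapy
from itemloaders.processors import MapCompose, TakeFirst
from scrapy.loader import ItemLoader


def clean_price(value):
    # example input processor: strip currency symbols and whitespace
    return value.replace('€', '').strip()


class ProductLoader(ItemLoader):
    default_item_class = dict
    name_out = TakeFirst()
    price_in = MapCompose(clean_price)   # input processor
    price_out = TakeFirst()              # output processor


class ExampleSpider(scrapy.Spider):
    name = 'example'
    start_urls = ['https://example.com/computers/']  # shop category URL

    def parse(self, response):
        # item selector XPath
        for product in response.xpath('//div[@class="product"]'):
            loader = ProductLoader(selector=product)
            loader.add_xpath('name', './/h2/a/text()')                 # field XPath
            loader.add_xpath('price', './/span[@class="price"]/text()')
            data = loader.load_item()

            # "parse detailed info": follow the item URL to fill 'detail' fields
            url = product.xpath('.//h2/a/@href').get()
            yield response.follow(url, self.parse_detail, cb_kwargs={'data': data})

        # next page XPath
        next_page = response.xpath('//a[@rel="next"]/@href').get()
        if next_page:
            yield response.follow(next_page, self.parse)

    def parse_detail(self, response, data):
        data['description'] = response.xpath('//div[@id="description"]/text()').get()
        yield data
```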
Run the scraper:
workon bloshiproject
cd scraper
Parse & save data for shop with code VMN
scrapy crawl spider -a save=1 -a shop=VMN
Parse & save data for shop with code AHL and category with code PC
scrapy crawl spider -a category=PC -a save=1 -a shop=AHL
Parse data for shop with code AHL (for testing only, parsing without saving)
scrapy crawl spider -a category=PC -a save=0 -a save_temp=0 -a shop=AHL
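The `-a` flags are standard Scrapy spider arguments: each `key=value` pair is passed to the spider's `__init__` as a string keyword argument. The snippet below is a guess at how such arguments could be read; the exact attribute names bloshi uses are not shown on this page.

```python
# Scrapy passes each -a key=value pair to __init__ as a string keyword argument.
# The handling below is an assumption about bloshi's behaviour, not its actual code.
import scrapy


class Spider(scrapy.Spider):
    name = 'spider'  # matches "scrapy crawl spider"

    def __init__(self, shop=None, category=None, save='0', save_temp='1',
                 *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.shop_code = shop              # e.g. 'AHL'
        self.category_code = category      # e.g. 'PC'; None means all categories
        self.save = save == '1'            # persist parsed items
        self.save_temp = save_temp == '1'  # keep temporary data while testing
```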
Run the bloshi web interface:
workon bloshiproject
cd bloshi
python manage.py runserver 0.0.0.0:8888
View the admin interface at: http://localhost:8888/admin/