web-scrape-indexer

Scrapes websites with Scrapy and makes the scraped content searchable in Elasticsearch.

How to run?

Run Indexer

cd indexer
./gradlew build
./gradlew bootRun

Run Crawler

In a separate terminal, while the indexer is still running:

cd scrapper
poetry shell
cd simple-scrapper
scrapy crawl news-spider
deactivate

Validate

Open http://localhost:8080/swagger-ui/index.html

Try out the endpoints listed there.
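If you would rather check from a script than from a browser, the following is a minimal sketch that only tests whether the Swagger UI page responds. It assumes the indexer is listening on the URL given above; the function name `indexer_is_up` and the timeout value are illustrative and not part of this repository.

```python
import urllib.error
import urllib.request

# URL taken from the Validate step above.
SWAGGER_URL = "http://localhost:8080/swagger-ui/index.html"


def indexer_is_up(url: str = SWAGGER_URL, timeout: float = 3.0) -> bool:
    """Return True if `url` answers with HTTP 200, False otherwise.

    Hypothetical helper for illustration; it does not exercise any
    of the indexer's actual endpoints.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("indexer reachable" if indexer_is_up() else "indexer not reachable")
```

A `False` result most likely means the `./gradlew bootRun` step has not finished starting, or that the application is bound to a different port.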
