A Wikipedia scraper, written in Python using Scrapy, that inserts Wikipedia page data into a MongoDB database.

hmoorerg/WikipediaScraper

WikipediaScraper

WikipediaScraper is a Python project that extracts and processes data from Wikipedia. It uses the Scrapy framework to crawl Wikipedia pages and stores the scraped page data in a MongoDB database.

Features

  • Web Scraping with Scrapy: Utilizes Scrapy to navigate and extract information from Wikipedia pages.
  • Docker Support: Includes a Dockerfile and docker-compose.yml for containerized deployment.
  • Automated Execution: Comes with a run.sh script to streamline the scraping process.
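
As an illustration of how scraped pages might reach MongoDB, here is a minimal sketch of a Scrapy item pipeline. It is not taken from this repository: the class name MongoPagePipeline, the default connection URI, the database name "wikipedia", the collection name "pages", and the "url" field used as the upsert key are all assumptions made for the example.

```python
class MongoPagePipeline:
    """Hypothetical Scrapy item pipeline that upserts each page into MongoDB."""

    def __init__(self, mongo_uri="mongodb://localhost:27017",
                 db_name="wikipedia", collection=None):
        # URI and database name are assumed defaults, not project settings.
        self.mongo_uri = mongo_uri
        self.db_name = db_name
        self.client = None
        # A collection can be injected, e.g. a stub when no database is running.
        self.collection = collection

    def open_spider(self, spider):
        # Scrapy calls this once when the spider starts.
        if self.collection is None:
            import pymongo  # imported here so the class loads without pymongo
            self.client = pymongo.MongoClient(self.mongo_uri)
            self.collection = self.client[self.db_name]["pages"]

    def close_spider(self, spider):
        # Scrapy calls this once when the spider finishes.
        if self.client is not None:
            self.client.close()

    def process_item(self, item, spider):
        # Upsert keyed on the page URL (an assumed item field), so that
        # re-scraping a page updates its document instead of duplicating it.
        doc = dict(item)
        self.collection.update_one(
            {"url": doc["url"]}, {"$set": doc}, upsert=True
        )
        return item
```

In a Scrapy project, a pipeline like this would be enabled through the ITEM_PIPELINES setting. Keying the upsert on the URL is one possible design choice; a page ID or title would work equally well if the scraped items carry one.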

Contributors 4