WikipediaScraper

WikipediaScraper is a Python project that extracts and processes data from Wikipedia. It is built on the Scrapy framework, which handles requesting pages and parsing their content.

Features

Web Scraping with Scrapy: Utilizes Scrapy to navigate and extract information from Wikipedia pages.
Docker Support: Includes a Dockerfile and docker-compose.yml for containerized deployment.
Automated Execution: Comes with a run.sh script to streamline the scraping process.

Folders and files

wikipedia_scraper
.dockerignore
.gitignore
Dockerfile
README.md
docker-compose.yml
requirements.txt
run.sh
scrapy.cfg
seed_urls.txt
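
The format of seed_urls.txt is not documented here. As a rough sketch, assuming it holds one URL per line, a helper like the following could load the seeds before handing them to a spider. The function name `load_seed_urls`, the `#` comment convention, and the Wikipedia-only filter are hypothetical and not taken from the project.

```python
from pathlib import Path
from urllib.parse import urlparse


def load_seed_urls(path="seed_urls.txt"):
    """Return the Wikipedia URLs listed in `path`, one per line (assumed format)."""
    urls = []
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines and '#' comments (a convention assumed for this sketch).
        if not line or line.startswith("#"):
            continue
        host = urlparse(line).hostname or ""
        # Keep only wikipedia.org and its language subdomains.
        if host == "wikipedia.org" or host.endswith(".wikipedia.org"):
            urls.append(line)
    return urls
```

The returned list could, for example, be assigned to a spider's `start_urls`.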

hmoorerg/WikipediaScraper
