Concurrent Web Scraping with Python and Selenium

Want to learn how to build this project?

Check out the blog post.

Want to use this project?

Fork/Clone
Create and activate a virtual environment
Install the requirements

Run the scrapers:

# sync
(env)$ python script.py headless

# parallel with multiprocessing
(env)$ python script_parallel_1.py headless

# parallel with concurrent.futures
(env)$ python script_parallel_2.py headless

# concurrent with concurrent.futures (should be the fastest!)
(env)$ python script_concurrent.py headless

# parallel with concurrent.futures and concurrent with asyncio
(env)$ python script_asyncio.py headless
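
The difference between the sync, concurrent, and asyncio variants above can be sketched without a browser. This is a minimal illustration, not the code in the scripts themselves: `scrape_page` is a hypothetical stand-in that sleeps instead of driving Selenium, and the function names, worker count, and the use of a thread pool under asyncio are all assumptions made for the example.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def scrape_page(page_number: int) -> str:
    """Hypothetical stand-in for one page scrape (an I/O-bound wait)."""
    time.sleep(0.05)  # simulates waiting on the browser and network
    return f"page-{page_number}"


def run_sync(pages):
    """Scrape one page after another."""
    return [scrape_page(p) for p in pages]


def run_threaded(pages, max_workers=4):
    """Overlap the waits with a thread pool; map() preserves input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(scrape_page, pages))


async def run_asyncio(pages, max_workers=4):
    """Drive the same blocking function from an event loop via an executor."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        tasks = [loop.run_in_executor(executor, scrape_page, p) for p in pages]
        # gather() returns results in the order the tasks were passed in
        return await asyncio.gather(*tasks)


if __name__ == "__main__":
    pages = list(range(1, 9))
    for label, runner in (("sync", run_sync), ("threads", run_threaded)):
        start = time.perf_counter()
        runner(pages)
        print(f"{label}: {time.perf_counter() - start:.2f}s")
    start = time.perf_counter()
    asyncio.run(run_asyncio(pages))
    print(f"asyncio: {time.perf_counter() - start:.2f}s")
```

Because each simulated scrape spends its time waiting rather than computing, the threaded and asyncio runs should finish in roughly the time of the slowest batch rather than the sum of all pages. Actual timings with a real browser will depend on the machine and the target site.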

Run the tests:

(env)$ python -m pytest test/test_scraper.py
(env)$ python -m pytest test/test_scraper_mock.py
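
For readers wondering what a mocked scraper test might look like, here is a rough sketch. It does not reproduce the contents of `test/test_scraper_mock.py`; `get_page_title` is a hypothetical helper invented for the example, and the URL and title are placeholder values. The only Selenium behaviour assumed is that a WebDriver exposes `get(url)` and a `title` attribute.

```python
from unittest.mock import MagicMock


def get_page_title(browser, url: str) -> str:
    """Hypothetical scraper helper: load a URL and return the page title."""
    browser.get(url)
    return browser.title


def test_get_page_title_with_mock_browser():
    # A MagicMock stands in for a Selenium WebDriver, so no browser is launched.
    browser = MagicMock()
    browser.title = "Example Domain"

    title = get_page_title(browser, "https://example.com")

    assert title == "Example Domain"
    # Verify the helper navigated to the requested URL exactly once.
    browser.get.assert_called_once_with("https://example.com")
```

Saved in a file whose name starts with `test_`, this would be collected by `python -m pytest` like the tests above, and it runs without network access or a browser driver.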

Repository files (9 commits):

scrapers
test
.gitignore
README.md
requirements.txt
script.py
script_asyncio.py
script_concurrent.py
script_parallel_1.py
script_parallel_2.py
