Check out the blog post.
- Fork/Clone
- Create and activate a virtual environment
- Install the requirements
- Run the scrapers:
    # sync
    (env)$ python script.py headless

    # parallel with multiprocessing
    (env)$ python script_parallel_1.py headless

    # parallel with concurrent.futures
    (env)$ python script_parallel_2.py headless

    # concurrent with concurrent.futures (should be the fastest!)
    (env)$ python script_concurrent.py headless

    # parallel with concurrent.futures and concurrent with asyncio
    (env)$ python script_asyncio.py headless
- Run the tests:
    (env)$ python -m pytest test/test_scraper.py
    (env)$ python -m pytest test/test_scraper_mock.py
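
For a rough sense of why the concurrent variant of the scrapers is expected to be faster than the sync one, here is a minimal sketch. It is not the code from the scripts in this repo: `fake_scrape`, `run_sync`, and `run_concurrent` are hypothetical names, and a short `time.sleep` stands in for the browser and network wait so the example runs without Selenium or a network connection.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_scrape(page: int) -> str:
    """Hypothetical stand-in for scraping one page (assumption, not repo code)."""
    time.sleep(0.05)  # simulate the I/O wait of a page load
    return f"page-{page}"


def run_sync(pages):
    """Scrape pages one after another; the waits add up."""
    return [fake_scrape(page) for page in pages]


def run_concurrent(pages, max_workers=5):
    """Scrape pages with a thread pool; the waits overlap across threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map returns results in the same order as the inputs
        return list(executor.map(fake_scrape, pages))


if __name__ == "__main__":
    pages = range(1, 11)
    for runner in (run_sync, run_concurrent):
        start = time.perf_counter()
        results = runner(pages)
        elapsed = time.perf_counter() - start
        print(f"{runner.__name__}: {len(results)} pages in {elapsed:.2f}s")
```

Because the simulated work is mostly waiting, the thread pool version should finish in a fraction of the sync time. Actual timings for the real scripts depend on the machine, the browser driver, and the target site.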