Basic WhatsApp scraper for TMD Hackathon 2019, built with Python and Selenium.
Scrapes URLs shared in group chats on WhatsApp Web.
Requires logging in by scanning a QR code with the WhatsApp mobile app.
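As a rough illustration of the URL-scraping step, the sketch below pulls links out of message text that has already been read from the page. It is not the code in scraper.py: the names URL_PATTERN and extract_urls are hypothetical, and the regular expression is a deliberately simple assumption that will miss or mangle some unusual links (for example, URLs that legitimately end in a closing parenthesis).

```python
import re

# Rough pattern for http(s) links: runs until whitespace or a quote/angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(message_text):
    """Return the URLs found in one chat message, in order, without duplicates."""
    seen = set()
    urls = []
    for match in URL_PATTERN.findall(message_text):
        # Drop punctuation that usually belongs to the sentence, not the link.
        url = match.rstrip(".,;:!?)")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# Made-up messages, used only to show the call.
messages = [
    "Have you seen this? https://example.com/story.",
    "Same link again: https://example.com/story and http://example.org/a?b=1",
]
all_urls = [url for text in messages for url in extract_urls(text)]
```

Only the link text is kept here; sender names and phone numbers are never touched, which is in keeping with the first ethical guideline.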
Run with: python scraper.py
If you are not on a Mac, download the appropriate chromedriver for your platform and update the "start driver" function to point to it: https://sites.google.com/a/chromium.org/chromedriver/home
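One possible way to adapt the "start driver" function for other platforms is sketched below. Everything in it is an assumption rather than the project's actual code: the drivers/ directory layout, the DRIVER_PATHS table, and the function names are hypothetical, and the Selenium call uses the Selenium 4 Service style, which may differ from the version this project was written against.

```python
import platform

# Hypothetical layout: one chromedriver binary per OS under ./drivers/.
DRIVER_PATHS = {
    "Darwin": "drivers/chromedriver_mac",
    "Linux": "drivers/chromedriver_linux",
    "Windows": "drivers/chromedriver_win.exe",
}


def chromedriver_path(system=None):
    """Return the chromedriver path for an OS name (default: the current OS)."""
    system = system or platform.system()
    try:
        return DRIVER_PATHS[system]
    except KeyError:
        raise RuntimeError(f"No chromedriver configured for {system!r}")


def start_driver():
    """Start Chrome through Selenium using the OS-specific chromedriver."""
    # Imported here so chromedriver_path() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    service = Service(executable_path=chromedriver_path())
    return webdriver.Chrome(service=service)
```

Keeping the path lookup separate from the browser launch means the lookup can be checked on any machine, even one without Chrome or Selenium.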
Preliminary Ethical Guidelines:
1 - Do not collect or store people's names, phone numbers, email addresses or other personally identifying information.
2 - Hash images and only save them if they meet certain criteria (e.g. shared over 1000 times, or containing text).
3 - Use public WhatsApp groups only.
4 - Do not quote individuals in stories without their permission unless doing so is of clear public interest (e.g. Trump participating in a discussion).
5 - Do not use deception. If anyone asks you directly who you are, say that you are a journalist and explain your project.
6 - Do not use the tool in any way that would violate WhatsApp's Terms of Service (TOS). For example, this tool must not be used for commercial purposes. Check the TOS regularly for changes.
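Guideline 2 could be approached along the lines of the sketch below, which hashes each image and only flags it for saving once it has been seen often enough. This is an illustration under stated assumptions, not part of the tool: the names record_image, should_save and share_counts are hypothetical, the threshold is the example figure from the guideline, and SHA-256 only matches byte-identical files, so resized or recompressed copies would count as different images.

```python
import hashlib
from collections import Counter

SHARE_THRESHOLD = 1000  # example figure taken from guideline 2

# Maps image digest -> number of times that exact file has been seen.
share_counts = Counter()


def record_image(image_bytes):
    """Hash an image, count the sighting, and return its SHA-256 hex digest."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    share_counts[digest] += 1
    return digest


def should_save(digest, threshold=SHARE_THRESHOLD):
    """Return True once an image has been seen more than `threshold` times."""
    return share_counts[digest] > threshold
```

Until should_save returns True, only the digest and its count need to be kept, so images that never meet the criteria are never written to disk.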