
A Python/Selenium scraper that extracts URLs shared in group chats via WhatsApp Web.


richardqhill/whatsapp-scraper


Basic WhatsApp Scraper for TMD Hackathon 2019, built with Python and Selenium

Scrapes URLs shared in group chats on WhatsApp Web.
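As a rough illustration of the URL-extraction step, here is a minimal sketch that pulls links out of a message's text once that text has been read from the page. The names URL_PATTERN and extract_urls are hypothetical and are not taken from scraper.py; the regular expression is a simple assumption that a URL starts with http:// or https:// and runs to the next whitespace.

```python
import re

# Hypothetical pattern (an assumption, not from scraper.py):
# an http(s) URL running up to the next whitespace character.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_urls(message_text):
    """Return the URLs found in one chat message, in order of appearance."""
    urls = []
    for match in URL_PATTERN.findall(message_text):
        # Trailing punctuation usually belongs to the sentence, not the link.
        urls.append(match.rstrip(".,;:!?)\"'"))
    return urls


if __name__ == "__main__":
    print(extract_urls("Read this: https://example.com/story, then reply."))
```

Only the link is returned, so sender names and phone numbers never need to be passed along or stored.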

Requires logging in by scanning the WhatsApp Web QR code with the WhatsApp mobile app.

Run with: python scraper.py

If you are not using a Mac, you will need to download the appropriate ChromeDriver for your platform and update the "start driver" function to point to it. https://sites.google.com/a/chromium.org/chromedriver/home
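One way to make the driver setup portable is to choose the ChromeDriver binary from the operating system name. The sketch below is an assumption about how that could look, not the repository's actual code: the drivers/ folder, the file names in DRIVER_FILES, and the functions chromedriver_path and start_driver are all hypothetical, and the Selenium calls use the Selenium 4 Service API, which may differ from the version this project was written against.

```python
import platform
from pathlib import Path

# Hypothetical layout (an assumption): one ChromeDriver binary per OS
# stored under ./drivers/. Adjust the names to match your downloads.
DRIVER_FILES = {
    "Darwin": "chromedriver_mac",
    "Linux": "chromedriver_linux",
    "Windows": "chromedriver_win.exe",
}


def chromedriver_path(system=None, driver_dir="drivers"):
    """Return the ChromeDriver path for the given (or current) OS."""
    system = system or platform.system()
    try:
        return Path(driver_dir) / DRIVER_FILES[system]
    except KeyError:
        raise RuntimeError(f"No chromedriver configured for {system!r}") from None


def start_driver():
    """Open WhatsApp Web in Chrome using the OS-specific driver."""
    # Imported here so the path helper works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    service = Service(executable_path=str(chromedriver_path()))
    driver = webdriver.Chrome(service=service)
    driver.get("https://web.whatsapp.com")
    return driver
```

The ChromeDriver version has to match the installed Chrome version, so the binaries in the hypothetical drivers/ folder need updating whenever Chrome does.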

Preliminary Ethical Guidelines:

1 - Do not collect or store people's names, phone numbers, email addresses or other personally identifying information.

2 - Hash images and only save them if they meet certain criteria (e.g. shared over 1000 times, or containing text).

3 - Use public WhatsApp groups only.

4 - Do not quote individuals in stories without their permission unless it is of clear public interest (e.g. Trump participating in a discussion).

5 - Do not use deception. If anyone asks you directly who you are, say that you are a journalist and explain your project.

6 - Do not use the tool in any way that would violate WhatsApp's Terms of Service. Example: this tool cannot be used for commercial purposes. Regularly check for changes to the TOS.
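Guideline 2 could be approached along the following lines. This is a minimal sketch under stated assumptions, not part of the tool: the ImageTally class and SHARE_THRESHOLD constant are hypothetical names, the threshold of 1000 is the example figure from the guideline, and SHA-256 is a stand-in choice that only matches byte-identical files (a recompressed or resized copy would hash differently). The "contains text" criterion is left out because it would need an OCR step.

```python
import hashlib
from collections import Counter

# Example figure taken from guideline 2; tune it for your project.
SHARE_THRESHOLD = 1000


class ImageTally:
    """Count image sightings by hash without keeping the images themselves."""

    def __init__(self, threshold=SHARE_THRESHOLD):
        self.threshold = threshold
        self.counts = Counter()

    def record(self, image_bytes):
        """Register one sighting and return (digest, should_save).

        should_save becomes True once the image has been seen more
        than `threshold` times.
        """
        digest = hashlib.sha256(image_bytes).hexdigest()
        self.counts[digest] += 1
        return digest, self.counts[digest] > self.threshold


if __name__ == "__main__":
    tally = ImageTally(threshold=2)
    for _ in range(3):
        digest, should_save = tally.record(b"example image bytes")
    print(digest[:12], should_save)
```

Until an image crosses the threshold, only its digest and a count are held in memory, which keeps rarely shared and potentially private images out of storage.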
