Heuristic-based boilerplate removal tool
Updated Feb 25, 2025 - Python
Undetected web-scraping & seamless HTML parsing in Python!
procyclingstats scraper
Parses CAP (Common Alerting Protocol) XML alerts and HTML, inserts new alerts into a database, and sends notifications through OneSignal (possible Android and iOS push notifications), Twitter, Facebook, and MailChimp (e-mail). Part of an open-source early-warning solution for natural disasters.
BeautifulSoup4 packaged into a command line tool
This Python script scrapes the internal links on a webpage. It prompts for a URL, sends a GET request to retrieve the HTML, and uses BeautifulSoup to parse and filter the links. It then prompts the user for an output mode (terminal or file) and either prints the links or writes them to a file. It installs the required modules (requests and beautifulsoup4) if they are not found.
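The link-filtering step described above can be sketched roughly as follows. This is not the repository's code: it swaps BeautifulSoup for the standard-library `html.parser` so it runs without installing anything, skips the GET request by working on an in-memory HTML string, and treats a link as "internal" when it resolves to the same host as the base URL (an assumption). The names `LinkCollector`, `internal_links`, and `SAMPLE_HTML` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(html, base_url):
    """Return sorted absolute URLs found in `html` that share base_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(base_url).netloc
    links = set()
    for href in collector.hrefs:
        # Resolve relative hrefs against the page URL before comparing hosts.
        absolute = urljoin(base_url, href)
        if urlparse(absolute).netloc == host:
            links.add(absolute)
    return sorted(links)


# Hypothetical page content standing in for a fetched response body.
SAMPLE_HTML = """
<a href="/about">About</a>
<a href="https://example.com/contact">Contact</a>
<a href="https://other.example.org/">Elsewhere</a>
<a href="docs/index.html">Docs</a>
"""

if __name__ == "__main__":
    for link in internal_links(SAMPLE_HTML, "https://example.com/"):
        print(link)
```

Comparing hosts after resolving each href keeps relative links and drops external ones in a single check. A real script would likely also need to decide how to handle subdomains and fragment-only links.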
This script analyzes the number of Telegram messages over time.
django-janitor allows you to use bleach to clean HTML stored in a Model's field.
Web spider that scans UR available rooms and outputs them as CSV
Get insights into your Facebook Messenger activity with Splunk
The first public repository that provides a free BUBT website scraping API script on GitHub.
CLI tool for sitemap generation
This Python script scrapes Salatomatic for US masjid data, including names, locations, and phone numbers. It uses requests, BeautifulSoup, and csv modules for web scraping and CSV handling.
A simple HTML form password bruteforcing tool written in Python.
Simple example of a web scraper in Python. It asks the user on the console for the name of a band or artist, then uses Selenium WebDriver and BeautifulSoup to print information about that artist's discography.
Examples of how to process HTML files in Python
⚡ Multi-threaded login brute-forcer with visual flair, hotkey control, token handling, and educational focus. Built for testing 2-step login flows (username → password). 🧠
A powerful desktop application to download, archive, and manage web pages locally with full resource support, built with Python and PyQt6.
Script for extracting data from the site "dop.edu.ru"