Clean, filter and sample URLs to optimize data collection – Python & command-line – Deduplication, spam, content and language filters
Updated Dec 19, 2025 - Python
A minimal yet powerful crawler for extracting all the internal, external, and fuzz-able links from a website.
A simple Git URL parser.
A type to represent, query, and manipulate a Uniform Resource Identifier.
Web scraping | Website cloner | Path Traversal Scanner
A website URL scraper built using Python.
Extract information from URLs inside shell scripts
Check whether the URLs contained in a Markdown file are down.
High-concurrency URL web page content fetching service.
Simple URL builder
WebBriefs is a webpage summarizer API that extracts and condenses content into concise, readable Markdown. Useful for quickly getting the gist of any website.
Crawl websites and extract meaningful information from HTML and site content
A command-line URL parser, written in Python
Collection of helper functions designed to facilitate efficient web scraping in Python
A CLI tool for URL deduplication, filtering, and flexible extraction of parameters, domains, subdomains, and paths.
Find URLs in a text string
Bot to generate useful links to increase the ranking of products sold on Amazon
A Python tool to efficiently process, modify, and deduplicate URL lists. Ideal for security professionals, analysts, and developers, with both CLI and GUI support.
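
Several of the tools listed above describe URL parsing and deduplication. As a rough illustration of the idea, and not the implementation of any listed project, the following minimal sketch uses only Python's standard `urllib.parse` module. The function names `normalize_url` and `deduplicate` are hypothetical, and the normalization rules (lowercase scheme and host, drop default ports and fragments, sort query parameters) are assumptions chosen for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Default ports that can be dropped without changing the target (assumption
# for this sketch: only http and https are handled).
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Return a canonical form of `url` for comparison purposes.

    Simplifications in this sketch: user info is dropped and IPv6 hosts
    are not specially handled.
    """
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()

    # Keep the port only when it differs from the scheme's default.
    port = parts.port
    if port is not None and DEFAULT_PORTS.get(scheme) != port:
        host = f"{host}:{port}"

    # Treat an empty path as "/" so both spellings compare equal.
    path = parts.path or "/"

    # Sort query parameters so that ordering does not affect equality.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))

    # The fragment is discarded: it is not sent to the server.
    return urlunsplit((scheme, host, path, query, ""))


def deduplicate(urls):
    """Keep the first occurrence of each URL, comparing normalized forms."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique
```

For example, `deduplicate(["https://example.com", "https://EXAMPLE.com/"])` keeps only the first entry, because both normalize to the same string. Real tools typically apply further filters (tracking parameters, language, spam heuristics) that this sketch leaves out.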