Normalize a URL
-
Updated
Mar 10, 2024 - JavaScript
Normalize a URL
Extract and decompose URLs (including emails, which are conceptually a part of URLs) with robust patterns.
Remove clutter from URLs and return a canonicalized version
🔗🧹 Normalize URLs to a standardized form. HTTPS by default, flexible configuration, custom protocols, domain extraction, humazing URL, and punycode support. Both CJS & ESM modules available.
Natural Language Precessing related notebooks (Machine Learning)
A simple plugin to rewrite multiple domains or subdomains to a single domain in the site's HTML output.
URL normalizer to canonicalize (standardize) the text representation of a URL to determine if differently-formatted URLs are identical
Get a stable, canonical version of any URL, with DNS and HTTPS checks, redirects, tracker stripping, and canonical link extraction!
http url normalization for web crawlers
Allows you to remove ad/tracking query params from a given URL in Scala
A robust, modular web crawler built in Python for extracting and saving content from websites. This crawler is specifically designed to extract text content from both HTML and PDF files, saving them in a structured format with metadata.
Add a description, image, and links to the url-normalization topic page so that developers can more easily learn about it.
To associate your repository with the url-normalization topic, visit your repo's landing page and select "manage topics."