A robust, multi-layered security suite for sanitizing and hardening URLs within user-generated content, markdown, and modern web application pipelines.
This monorepo provides the core harden-urls utility and a set of integration packages for popular ecosystems, ensuring your application is protected against malicious links, tracking parameters, and protocol evasion techniques.
Basic URL sanitizers often fail against modern threats because they rely on simple prefix checks. This suite offers a multi-layered defense that includes:
- Strict Protocol Control: Allows only safe schemes (e.g., https:, mailto:).
- Tracking Cleanup: Strips common query parameters (utm_, fbclid, gclid).
- Obfuscation Defense: Applies Unicode normalization (NFKC) to defeat homoglyph and control-character attacks.
- Pattern-Based Filtering: Allows explicit domain whitelisting or blocklisting.
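
To make the layers concrete, here is a minimal sketch of how such checks could be composed using only the standard URL API. It is not the harden-urls implementation: `sketchSanitize`, `ALLOWED_PROTOCOLS`, and `TRACKING_PREFIXES` are hypothetical names chosen for illustration, and pattern-based domain filtering is left out for brevity.

```typescript
// Illustrative sketch only: NOT the harden-urls implementation.
// All names below are hypothetical.
const ALLOWED_PROTOCOLS = new Set(["https:", "mailto:"]);
const TRACKING_PREFIXES = ["utm_", "fbclid", "gclid"];

function sketchSanitize(input: string): string | null {
  // Layer 1: NFKC-normalize, then drop ASCII control characters that could
  // be used to disguise a scheme (e.g. "java\tscript:").
  const normalized = input
    .normalize("NFKC")
    .replace(/[\u0000-\u001f\u007f]/g, "")
    .trim();

  // Layer 2: parse strictly; anything the URL parser rejects is dropped.
  let url: URL;
  try {
    url = new URL(normalized);
  } catch {
    return null;
  }

  // Layer 3: explicit protocol allowlist.
  if (!ALLOWED_PROTOCOLS.has(url.protocol)) return null;

  // Layer 4: remove tracking parameters by name prefix.
  for (const key of Array.from(url.searchParams.keys())) {
    if (TRACKING_PREFIXES.some((prefix) => key.startsWith(prefix))) {
      url.searchParams.delete(key);
    }
  }
  return url.toString();
}
```

Under these assumptions, `sketchSanitize("https://example.com/path?utm_source=email&id=7")` returns `"https://example.com/path?id=7"`, while `sketchSanitize("javascript:alert('xss')")` returns `null`.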
All packages live under the libs/ directory. You can use them independently or together for end-to-end security.
| Package | Description | Ecosystem | Status |
|---|---|---|---|
| `harden-urls` | The core utility for deep cleaning and sanitizing individual URL strings. Dependency-free and highly performant. | Core Utility | Available |
| `rehype-harden-urls` | A rehype plugin to enforce policies on `<a>` and `<img>` URLs within HTML Abstract Syntax Trees (ASTs). | Unified / Rehype | Available |
| `harden-react-markdown-urls` | A Higher-Order Component (HOC) for react-markdown that transparently integrates rehype-harden-urls. | React / Markdown | Available |
| `remark-harden-urls` | A remark plugin for hardening URLs directly within the Markdown AST before conversion to HTML. | Unified / Remark | Coming Soon |

Install the core library and any integration package you need:

```bash
# Core utility and a popular integration
pnpm add harden-urls rehype-harden-urls
```

or

```bash
npm install harden-urls rehype-harden-urls
```

Configure once, use everywhere. This is the foundation for the entire suite.

```ts
import { createUrlSanitizer, toRegexps } from "harden-urls";

const trustedDomains = toRegexps(["*.mycorp.com", "partner-api.io"]);

const sanitizer = createUrlSanitizer({
  allowedProtocols: ["https:", "mailto:"],
  allowedPatterns: trustedDomains,
  stripParams: ["utm_", "fbclid"],
});

sanitizer("https://sub.mycorp.com/path?utm_source=email");
// → "https://sub.mycorp.com/path" (cleaned tracking param)

sanitizer("javascript:alert('xss')");
// → null (blocked by protocol whitelist)
```

Use in your Node.js or build-time pipelines (e.g., Gatsby, Next.js).

```ts
import { rehypeHardenUrls } from "rehype-harden-urls";
import { presets } from "rehype-harden-urls/utils";

// `processor` is assumed to be your existing unified() pipeline.
// Use the 'balanced' preset for links and 'strict' for images
processor.use(rehypeHardenUrls, {
  link: presets.balanced,
  image: presets.strict,
});
```

Key features: The plugin automatically adds rel="noopener noreferrer" to external links. It can be configured either to remove unsafe elements entirely (prune: true) or to replace them with a safer placeholder (prune: false).

A drop-in Higher-Order Component for the popular react-markdown library.

```tsx
import ReactMarkdown from "react-markdown";
import { hardenReactMarkdown } from "harden-react-markdown-urls";
import { presets } from "rehype-harden-urls/utils";

// Wrap ReactMarkdown with the desired default policy
const HardenedMarkdown = hardenReactMarkdown(ReactMarkdown, presets.balanced);

function MyComponent({ markdownText }) {
  return (
    <HardenedMarkdown
      // Instance-level override for maximum control
      hardenedOptions={{
        link: { allowedProtocols: new Set(["https:"]) },
        onUnsafeUrl: url => console.warn("Blocked:", url),
      }}>
      {markdownText}
    </HardenedMarkdown>
  );
}
```

Crucial: These packages primarily focus on sanitizing the URL value (href or src). They do not replace a general HTML sanitizer.
If you process untrusted content or allow embedded HTML (e.g., using rehype-raw in your markdown pipeline), you MUST pair this with an HTML structure sanitizer.
The recommended secure chain is:
1. rehype-raw (if allowing raw HTML)
2. rehype-harden-urls (Deep URL Content Cleaning)
3. rehype-sanitize (Structural Guardrail)

Your strongest defense starts with a minimal, explicit list of allowedProtocols, but security risks persist even within whitelisted protocols. A few brief examples:

| Protocol | Basic Risk (Protocol Evasion) | Advanced Risk (Phishing/Malicious Data) | harden-urls Mitigation |
|---|---|---|---|
| `javascript:` | XSS (Cross-Site Scripting) | Malicious code execution. | Blocks entirely by default. |
| `mailto:` | Spambot links, mail client exploits. | Targeted Phishing: Malicious headers (BCC, Subject, Body) can be injected via query parameters to trick users into sending unwanted/damaging emails. | Strips malicious query parameters (subject, body, etc.) and non-mail protocols via Query Param Cleaning. |
| `data:` | XSS, large payload DDoS. | Can carry executable scripts or large, resource-consuming payloads disguised as images. | Requires explicit whitelisting and should be tightly restricted to specific media types (e.g., data:image/png). |

Action: Only allow what you absolutely need (e.g., https: and mailto:). If you allow mailto:, rely on this library's stripParams feature to neutralize potential phishing payloads embedded in the query string.
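
As a rough illustration of what restricting an allowed protocol can mean in practice, the sketch below shows two standalone checks: one that drops the query string from a mailto: URL, and one that accepts data: URLs only for a few image types. Both helpers (`stripMailtoParams`, `isAllowedDataUrl`) are hypothetical and written for this example; they are not the library's API, and the library's own stripParams behavior may differ.

```typescript
// Illustrative sketch only. Both helpers are hypothetical and are not part of harden-urls.

// Drop everything after the first "?" in a mailto: URL, which is where header
// fields such as subject, body, cc and bcc are carried.
function stripMailtoParams(input: string): string | null {
  if (!input.toLowerCase().startsWith("mailto:")) return null;
  const queryStart = input.indexOf("?");
  return queryStart === -1 ? input : input.slice(0, queryStart);
}

// Accept data: URLs only when they declare one of a few base64-encoded image types.
// Note: no /g flag, so repeated test() calls carry no state.
const SAFE_DATA_URL = /^data:image\/(png|jpeg|gif);base64,[a-z0-9+/]+=*$/i;

function isAllowedDataUrl(input: string): boolean {
  return SAFE_DATA_URL.test(input);
}
```

For example, `stripMailtoParams("mailto:team@example.com?bcc=attacker@example.net&subject=Invoice")` returns `"mailto:team@example.com"`, and `isAllowedDataUrl("data:text/html,<script>alert(1)</script>")` returns `false`.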
Gotcha: When providing custom RegExp objects for patterns, do not use the global (/g) flag. The test() method with /g maintains state, which can cause security checks to be incorrectly skipped. Use harden-urls/toRegexps to safely convert patterns.
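
The pitfall is standard JavaScript behavior and can be reproduced without this library. The snippet below is a small demonstration; the pattern and URL are made up for the example.

```typescript
// A RegExp with the /g flag remembers where the last match ended (lastIndex),
// so repeated test() calls on the same string can alternate between true and false.
const sampleUrl = "https://example.com/page";

const statefulPattern = /^https:\/\/example\.com\//g;
const first = statefulPattern.test(sampleUrl);  // true; lastIndex is now 20
const second = statefulPattern.test(sampleUrl); // false; the search resumed at index 20

// Without /g, test() always starts from index 0.
const statelessPattern = /^https:\/\/example\.com\//;
const third = statelessPattern.test(sampleUrl);  // true
const fourth = statelessPattern.test(sampleUrl); // true

console.log(first, second, third, fourth); // true false true true
```

In a sanitizer, that second `false` would mean a trusted URL is rejected on every other call, or, with a blocklist pattern, that an unsafe URL slips through.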
We welcome contributions of all kinds—from reporting bugs and suggesting new features to submitting code. Your feedback helps make web content safer for everyone!
- Fork this repository.
- Open an Issue to discuss the feature or fix.
- Submit a Pull Request against the `main` branch.
💖 Adopt and Support: If this suite helps secure your application, please give us a star on GitHub!
This project is licensed under the MIT License.
MIT © Mayank Chaudhari
Inspired by the Unified, Rehype, and Vercel Labs communities.
with 💖 by Mayank Kumar Chaudhari
