Repair and secure untrusted HTML

rust-ammonia/ammonia

HTML Sanitization

Available on crates.io. Requires rustc 1.30.0.

Ammonia is a whitelist-based HTML sanitization library. It is designed to prevent cross-site scripting, layout breaking, and clickjacking caused by untrusted user-provided HTML being mixed into a larger web page.

Ammonia uses html5ever to parse and serialize document fragments the same way browsers do, so it is extremely resilient to syntactic obfuscation.

Ammonia parses its input exactly according to the HTML5 specification; it will not linkify bare URLs, insert line or paragraph breaks, or convert (C) into ©. If you want that, run a markup processor such as pulldown-cmark before running the sanitizer.

Installation

To use ammonia, add it to your project's Cargo.toml file:

[dependencies]
ammonia = "2"

Changes

Please see the CHANGELOG for a release history.

Example

This example uses pulldown-cmark together with Ammonia to render Markdown for a friendly, user-facing comment site.

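extern crate ammonia;
extern crate pulldown_cmark;

use ammonia::clean;
// Older pulldown-cmark releases export OPTION_ENABLE_TABLES at the crate root;
// newer ones spell it Options::ENABLE_TABLES.
use pulldown_cmark::{html::push_html, Parser, OPTION_ENABLE_TABLES};

fn main() {
    let text = "[a link](http://www.notriddle.com/)";
    // Render the Markdown to HTML that is not yet safe to embed.
    let md_parse = Parser::new_ext(text, OPTION_ENABLE_TABLES);
    let mut unsafe_html = String::new();
    push_html(&mut unsafe_html, md_parse);
    // Sanitize the rendered HTML before mixing it into a page.
    let safe_html = clean(&*unsafe_html);
    // The link survives; the output also keeps pulldown-cmark's <p> wrapper.
    assert!(safe_html.contains("<a href=\"http://www.notriddle.com/\""));
}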

Performance

Ammonia builds a DOM, traverses it (replacing unwanted nodes along the way), and serializes it again. It could be faster for what it does, and if you don't want to allow any HTML at all, it is possible to be even faster than that.
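The build, traverse, serialize pipeline described above can be sketched with a toy tree. This is only an illustration under stated assumptions: the `Node` type, the `sanitize` and `serialize` functions, and the policy of replacing a disallowed element with its children are all hypothetical and are not Ammonia's API or implementation, which works on an html5ever tree.

```rust
use std::collections::HashSet;

/// Toy DOM node (hypothetical); Ammonia itself does not use this type.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Text(String),
    Element { tag: String, children: Vec<Node> },
}

/// Walk the tree, keeping allowed elements and replacing any other
/// element with its recursively sanitized children.
fn sanitize(nodes: Vec<Node>, allowed: &HashSet<&str>) -> Vec<Node> {
    let mut out = Vec::new();
    for node in nodes {
        match node {
            Node::Text(t) => out.push(Node::Text(t)),
            Node::Element { tag, children } => {
                let kept = sanitize(children, allowed);
                if allowed.contains(tag.as_str()) {
                    out.push(Node::Element { tag, children: kept });
                } else {
                    // Assumed policy: drop the tag, keep what was inside it.
                    out.extend(kept);
                }
            }
        }
    }
    out
}

/// Serialize the tree back to a string, escaping text content.
fn serialize(nodes: &[Node]) -> String {
    let mut s = String::new();
    for node in nodes {
        match node {
            Node::Text(t) => {
                for c in t.chars() {
                    match c {
                        '&' => s.push_str("&amp;"),
                        '<' => s.push_str("&lt;"),
                        '>' => s.push_str("&gt;"),
                        _ => s.push(c),
                    }
                }
            }
            Node::Element { tag, children } => {
                s.push('<');
                s.push_str(tag);
                s.push('>');
                s.push_str(&serialize(children));
                s.push_str("</");
                s.push_str(tag);
                s.push('>');
            }
        }
    }
    s
}

fn main() {
    let allowed: HashSet<&str> = ["b", "p"].iter().cloned().collect();
    let tree = vec![Node::Element {
        tag: "p".to_string(),
        children: vec![
            Node::Element {
                tag: "blink".to_string(),
                children: vec![Node::Text("hi".to_string())],
            },
            Node::Element {
                tag: "b".to_string(),
                children: vec![Node::Text("a < b".to_string())],
            },
        ],
    }];
    let cleaned = sanitize(tree, &allowed);
    println!("{}", serialize(&cleaned)); // prints <p>hi<b>a &lt; b</b></p>
}
```

The sketch keeps parsing out of the picture on purpose: the whitelist decision happens on a tree rather than on raw text, which is why syntactic tricks in the input do not affect which elements are kept.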

However, sanitizing an HTML string using bleach-2.0.0 with html5lib-0.999999999 takes about fifteen times longer than using Ammonia 1.0; in the sample run below the ratio is roughly 17x.

$ cd benchmarks
$ cargo run --release
    Running `target/release/ammonia_bench`
87539 nanoseconds to clean up the intro to the Ammonia docs.
$ python bleach_bench.py
(1498800.015449524, 'nanoseconds to clean up the intro to the Ammonia docs.')
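
A measurement like the one above can be approximated with `std::time::Instant`. The following is a minimal sketch, not the code in the benchmarks directory: `mean_nanos` is a hypothetical helper, and `escape_html` is a hypothetical stand-in workload for the "allow no HTML at all" case mentioned earlier. Neither is part of Ammonia.

```rust
use std::time::Instant;

/// Hypothetical stand-in workload: escape every HTML-significant
/// character so that no markup survives. Not part of Ammonia.
fn escape_html(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#39;"),
            _ => out.push(c),
        }
    }
    out
}

/// Run `f` `iterations` times and return the mean wall-clock time
/// per call in nanoseconds.
fn mean_nanos<F: FnMut()>(iterations: u32, mut f: F) -> f64 {
    let start = Instant::now();
    for _ in 0..iterations {
        f();
    }
    let elapsed = start.elapsed();
    let total = elapsed.as_secs() as f64 * 1e9 + f64::from(elapsed.subsec_nanos());
    total / f64::from(iterations)
}

fn main() {
    let input = "<b onclick=\"x()\">Tom & Jerry</b>";
    // Accumulate the output lengths so the work has an observable result.
    let mut sink = 0usize;
    let nanos = mean_nanos(10_000, || {
        sink += escape_html(input).len();
    });
    println!("{} nanoseconds to escape the sample input.", nanos);
    assert!(sink > 0);
}
```

Averaging over many iterations in a release build smooths out timer noise; a single timed call would mostly measure startup effects.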

License

Licensed under either of these: the Apache License, Version 2.0 (LICENSE-APACHE) or the MIT license (LICENSE-MIT).

Thanks

Thanks to the other sanitizer libraries, particularly Bleach for Python and sanitize-html for Node, which we blatantly copied most of our API from.

Thanks to ChALkeR, whose Improper Markup Sanitization document helped us find high-level semantic holes in Ammonia, and to ssokolow, whose review and experience were also very helpful.

And finally, thanks to the contributors.