NPM Module which strips out all JavaScript code from some HTML text, modified for Windows use

burtonsys/strip-js

 
 

strip-js

[DEPRECATED] This repository has been deprecated. I don't have enough time to work on it.

An npm module that strips all JavaScript code from HTML text.

This module performs the following tasks:

  • Sanitizes HTML
  • Removes script tags
  • Removes attributes such as "onclick", "onerror", etc. which contain JavaScript code
  • Removes "href" attributes which contain JavaScript code
  • Removes "action" attributes from form tags
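
To make the "href" rule above concrete, here is a minimal sketch of how a script-bearing URL might be detected. It is not the module's actual implementation; the function name isJavaScriptUrl is hypothetical, and the check is deliberately conservative (it may flag a few harmless values).

```javascript
// Hypothetical helper, NOT part of strip-js: reports whether an attribute
// value looks like a "javascript:" URL.
function isJavaScriptUrl(value) {
  // Drop whitespace and control characters before comparing, because
  // browsers tolerate leading whitespace and embedded tabs or newlines in
  // the scheme. Removing all of them is an intentional over-approximation.
  const normalized = String(value)
    .replace(/[\u0000-\u0020]/g, '')
    .toLowerCase();
  return normalized.startsWith('javascript:');
}

console.log(isJavaScriptUrl('javascript:stealSession(document.cookie)'));
console.log(isJavaScriptUrl('http://www.google.com'));
```

A check of this kind is only one layer; the module also parses and re-serializes the HTML, which a string test alone does not replace.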

An example use case of this module is to sanitize HTML emails before displaying them in a browser to prevent cross-site scripting attacks.

Installation

npm install strip-js

This module can also be used from the command line. Install it globally using the following command (on Windows, omit sudo and run the command from an elevated prompt if needed):

sudo npm install -g strip-js

Usage

The following input HTML ...

<html>
   <body>
      <script src="foo.js"></script>
      <img src="image.gif" onerror="stealSession(document.cookie)" foo="bar">
      <a href="javascript:stealSession(document.cookie)" target="_blank">Dangerous Link</a>
      <a href="http://www.google.com" target="_blank">Safe Link</a>
      <form action="steal_cookies.php" foo="bar"></form>
      <p>
         This is some text in a p tag, but the p tag is not closed!
   </body>
</html>

... is converted to the following:

<html>
   <body>
      <img src="image.gif" foo="bar">
      <a target="_blank">Dangerous Link</a>
      <a href="http://www.google.com" target="_blank">Safe Link</a>
      <form foo="bar"></form>
      <p>
         This is some text in a p tag, but the p tag is not closed!
      </p>
   </body>
</html>

To use the module from Node.js:

var stripJs = require('strip-js');
var fs = require('fs');
// Read the untrusted HTML as a UTF-8 string
var html = fs.readFileSync('./webpage.html', 'utf8');
// stripJs returns the sanitized HTML as a plain string
var safeHtml = stripJs(html);

To preserve doctypes, pass the preserveDoctypes option: var safeHtml = stripJs(html, { preserveDoctypes: true }); The option defaults to false.

For command-line usage, install the module globally. The strip-js command reads the input HTML from stdin and writes the result to stdout.

strip-js < input.html

Warnings

Some old browsers have XSS vulnerabilities in CSS, as described in the Browser Security Handbook:

The risk of JavaScript execution. As a little-known feature, some CSS implementations permit JavaScript code to be embedded in stylesheets. There are at least three ways to achieve this goal: by using the expression(...) directive, which gives the ability to evaluate arbitrary JavaScript statements and use their value as a CSS parameter; by using the url('javascript:...') directive on properties that support it; or by invoking browser-specific features such as the -moz-binding mechanism of Firefox.

This module does not remove JavaScript from CSS, so it is recommended that your web app require one of the following browsers:

  • Edge
  • IE11
  • FF3
  • Safari
  • Chrome
  • Android

All these browsers are safe in that they don't allow JavaScript execution in CSS. Please feel free to add more browsers to this list after testing them, and send a pull request.
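
If older browsers cannot be ruled out, one option is to scan the CSS yourself for the three constructs named in the quoted passage. The sketch below is an assumption-laden illustration, not part of strip-js: the names CSS_SCRIPT_PATTERNS and findCssScriptRisks are hypothetical, and the regular expressions are simple heuristics that obfuscated CSS (escapes, comments) could evade.

```javascript
// Hypothetical helper, NOT part of strip-js: flags CSS text that uses one of
// the script-capable constructs listed in the Browser Security Handbook.
const CSS_SCRIPT_PATTERNS = [
  { name: 'expression', pattern: /expression\s*\(/i },
  { name: 'javascript-url', pattern: /url\s*\(\s*['"]?\s*javascript:/i },
  { name: 'moz-binding', pattern: /-moz-binding\s*:/i },
];

// Returns the names of every pattern found in the given CSS string.
function findCssScriptRisks(css) {
  return CSS_SCRIPT_PATTERNS
    .filter(({ pattern }) => pattern.test(String(css)))
    .map(({ name }) => name);
}

console.log(findCssScriptRisks('width: expression(alert(1));'));
console.log(findCssScriptRisks('color: red;'));
```

An empty result means only that none of these heuristics matched, so treat it as a coarse filter and not as proof that a stylesheet is safe.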
