Unidecode for NodeJS


Unidecode is a JavaScript port of the Perl module Text::Unidecode. It takes UTF-8 data and tries to represent it in US-ASCII characters (i.e., the universally displayable characters between 0x00 and 0x7F). The representation is almost always an attempt at transliteration, that is, conveying in Roman letters the pronunciation expressed by the text in some other writing system.

See Text::Unidecode for the original README file, including methodology and limitations.

Note that all the files named 'x??.js' in data are derived directly from the equivalent Perl files, and both sets of files are distributed under the Perl license, not the BSD license.
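As a rough illustration of how table-driven transliteration can work, here is a minimal sketch. It assumes each data file covers one 256-character block keyed by the high byte of the code point, which the 'x??.js' naming suggests but this README does not confirm. The `pages` table and the `transliterate` function are hypothetical stand-ins, not the package's actual code or data.

```javascript
// Hypothetical miniature lookup table: one entry per high byte ("page")
// of a code point. The real package ships much larger tables in data/.
const pages = {
  0x00: { 0xe0: 'a', 0xe7: 'c', 0xe9: 'e' }, // à, ç, é
};

// Replace every non-ASCII character with its table entry, or with
// `fallback` when no entry exists (assumed to default to '').
function transliterate(text, fallback = '') {
  return text.replace(/[^\x00-\x7f]/g, (ch) => {
    const code = ch.charCodeAt(0);
    const page = pages[code >> 8];            // high byte selects the page
    const mapped = page && page[code & 0xff]; // low byte selects the entry
    return mapped === undefined ? fallback : mapped;
  });
}

console.log(transliterate('aéà)àçé'));
console.log(transliterate('ab\uFFFFc', 'X'));
```

ASCII characters never reach the callback, so they pass through unchanged; only characters outside 0x00 to 0x7F are looked up.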

Installation

$ npm install unidecode

Usage

$ node
> var unidecode = require('unidecode');
> unidecode("aéà)àçé");
'aea)ace'
> unidecode("に間違いがないか、再度確認してください。再読み込みしてください。");
'niJian Wei iganaika, Zai Du Que Ren sitekudasai. Zai Du miIp misitekudasai. '
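One common use for ASCII output is building URL-friendly slugs. The sketch below is an assumption about how you might combine the two, not part of this package: `slugify` is a hypothetical helper, and it takes the transliteration function as a parameter so that it does not depend on how the package is loaded.

```javascript
// Hypothetical helper: turn arbitrary text into a lowercase ASCII slug.
// `transliterate` is any function mapping a string to ASCII,
// for example the function exported by this package.
function slugify(text, transliterate) {
  return transliterate(text)
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse runs of other characters into one hyphen
    .replace(/^-+|-+$/g, '');    // trim leading and trailing hyphens
}
```

With the package installed, `slugify("aéà)àçé", require('unidecode'))` should give 'aea-ace', given the 'aea)ace' output shown above.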

Advanced Usage

Custom Substitution Values

Characters that cannot be transliterated are replaced with an empty string by default. You can override this behavior by passing a custom substitution value as the second argument to unidecode:

$ node
> var unidecode = require('unidecode');
> unidecode("ab\uFFFFc", "X");
'abXc'
> unidecode("ab\uFFFFc");
'abc'

Donate

I maintain this project in my free time. If it helped you, please support my work via PayPal or Bitcoin. Thanks a lot!

Pull requests are welcome!