Analyze text: count characters, words, sentences, paragraphs, and reading time.
# npm
npm install text-analyzer
# pnpm
pnpm add text-analyzer
# bun
bun add text-analyzer

import {
countCharacters,
countLines,
countParagraphs,
countSentences,
countSequenceOccurrences,
countWords,
getAverageWordLength,
getReadingTime,
getWordFrequency,
} from "text-analyzer"

Count the number of characters in a text.
options.unit: "grapheme" (default) counts user-perceived characters (e.g. the emoji "👨‍👩‍👧" counts as 1). "code-unit" counts UTF-16 code units, matching String.prototype.length.
options.locale: BCP 47 locale tag passed to Intl.Segmenter. Only used when unit is "grapheme".
options.normalize: when true, normalize the text to NFC before counting. Defaults to false.
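To make the difference between the two units concrete, here is a minimal sketch built on the standard Intl.Segmenter and String.prototype.normalize APIs. sketchCountCharacters is a hypothetical helper written for illustration only; it is not the library's implementation and merely assumes the option names described above.

```javascript
// Hypothetical helper (not part of text-analyzer): counts characters in
// either user-perceived graphemes or UTF-16 code units.
function sketchCountCharacters(text, options = {}) {
  const { unit = "grapheme", locale, normalize = false } = options;
  const input = normalize ? text.normalize("NFC") : text;

  if (unit === "code-unit") {
    return input.length; // String.prototype.length counts UTF-16 code units
  }

  // Each segment produced at "grapheme" granularity is one user-perceived character.
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  return [...segmenter.segment(input)].length;
}

// man + ZWJ + woman + ZWJ + girl: one visible emoji, eight code units
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}";
console.log(sketchCountCharacters(family)); // 1
console.log(sketchCountCharacters(family, { unit: "code-unit" })); // 8

// "e" followed by a combining acute accent: NFC composes it into one code unit
const decomposed = "e\u0301";
console.log(sketchCountCharacters(decomposed, { unit: "code-unit" })); // 2
console.log(sketchCountCharacters(decomposed, { unit: "code-unit", normalize: true })); // 1
```

Normalization mostly matters for the "code-unit" unit, since a decomposed letter and its combining mark already form a single grapheme.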
countCharacters("text") // 4
countCharacters("👨‍👩‍👧") // 1
countCharacters("👨‍👩‍👧", { unit: "code-unit" }) // 8

Count the number of words in a text. Words are separated by any whitespace.
Note: punctuation stays attached, so "hello, world" counts as 2 words
("hello," and "world"). For a linguistic word count, use
getWordFrequency.
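The whitespace rule described in the note can be sketched in a few lines. sketchCountWords is a hypothetical stand-in, not the library's code, and returning 0 for empty or whitespace-only input is an assumption the text above does not confirm.

```javascript
// Hypothetical helper (not part of text-analyzer): whitespace-delimited word count.
function sketchCountWords(text) {
  const trimmed = text.trim();
  // Assumption: empty or whitespace-only input has no words.
  if (trimmed === "") return 0;
  // Any run of whitespace (spaces, tabs, line breaks) separates two words.
  return trimmed.split(/\s+/u).length;
}

console.log(sketchCountWords("hello, world")); // 2 ("hello," keeps its comma)
console.log(sketchCountWords(" one\ttwo\r\nthree ")); // 3
```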
countWords("one two three") // 3
countWords(" one\ttwo\r\nthree ") // 3

Count the number of lines in a text. Handles \n, \r\n, and \r. A
trailing line terminator does not add an extra empty line.
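One way to get this behaviour is to split on all three terminators and drop the empty tail that a trailing terminator leaves behind. The helper below is a hypothetical sketch rather than the library's implementation; treating the empty string as 0 lines is an assumption.

```javascript
// Hypothetical helper (not part of text-analyzer): counts lines across \n, \r\n, and \r.
function sketchCountLines(text) {
  // Assumption: the empty string contains no lines.
  if (text === "") return 0;
  // \r\n is listed first so a Windows line ending is consumed as one terminator.
  const lines = text.split(/\r\n|\r|\n/);
  // A trailing terminator yields a final empty entry; it is not a line of its own.
  if (lines[lines.length - 1] === "") lines.pop();
  return lines.length;
}

console.log(sketchCountLines("one\ntwo\nthree")); // 3
console.log(sketchCountLines("one\n")); // 1
console.log(sketchCountLines("a\r\nb\rc")); // 3
```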
countLines("one\ntwo\nthree") // 3
countLines("one\n") // 1

Count the number of sentences using Intl.Segmenter, so decimals and
abbreviations don't accidentally split a sentence.
options.locale: BCP 47 locale tag passed to Intl.Segmenter.
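For readers unfamiliar with Intl.Segmenter, this is roughly what sentence segmentation looks like when done by hand. sketchCountSentences is a hypothetical helper; skipping whitespace-only segments is an assumption, and the library may treat edge cases differently.

```javascript
// Hypothetical helper (not part of text-analyzer): counts sentence segments.
function sketchCountSentences(text, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity: "sentence" });
  let count = 0;
  for (const { segment } of segmenter.segment(text)) {
    // Assumption: a segment made only of whitespace is not a sentence.
    if (segment.trim() !== "") count += 1;
  }
  return count;
}

console.log(sketchCountSentences("Hello. World!", "en")); // 2
// The period inside "3.14" is followed by a digit, so it does not end a sentence.
console.log(sketchCountSentences("The value is 3.14. Done.", "en")); // 2
```

A naive split on "." would report three sentences for the second input, which is the failure mode the segmenter avoids.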
countSentences("Hello. World!") // 2
countSentences("The value is 3.14. Done.") // 2

Count the number of paragraphs. Paragraphs are separated by one or more blank lines.
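The blank-line rule can be sketched as follows. sketchCountParagraphs is hypothetical and not taken from the library; counting a line that holds only spaces or tabs as blank is an assumption.

```javascript
// Hypothetical helper (not part of text-analyzer): counts blank-line-separated paragraphs.
function sketchCountParagraphs(text) {
  // Normalize \r\n and \r to \n so a single separator pattern is enough.
  const normalized = text.replace(/\r\n?/g, "\n");
  return normalized
    // A line break, optional whitespace (including further breaks), then a line break.
    .split(/\n\s*\n/)
    // Leading or trailing blank lines produce empty chunks; ignore them.
    .filter((chunk) => chunk.trim() !== "").length;
}

console.log(sketchCountParagraphs("one\n\ntwo\n\n\nthree")); // 3
console.log(sketchCountParagraphs("one\ntwo")); // 1 (a single line break is not a blank line)
```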
countParagraphs("one\n\ntwo\n\n\nthree") // 3

Count the number of times a sequence occurs in a text.
options.caseSensitive: defaults to true.
options.overlapping: when true, overlapping matches are counted (e.g. "aa" matches 3 times in "aaaa"). Defaults to false.
options.locale: BCP 47 locale tag used for case folding (only relevant when caseSensitive is false).
options.normalize: when true, normalize both text and sequence to NFC before searching. Defaults to false.
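The overlapping option comes down to how far the search advances after each match. The sketch below shows that idea with indexOf; sketchCountOccurrences is a hypothetical helper, locale-aware lower-casing stands in for case folding, and returning 0 for an empty sequence is an assumption.

```javascript
// Hypothetical helper (not part of text-analyzer): counts occurrences of a sequence.
function sketchCountOccurrences(text, sequence, options = {}) {
  const { caseSensitive = true, overlapping = false, locale, normalize = false } = options;
  let haystack = normalize ? text.normalize("NFC") : text;
  let needle = normalize ? sequence.normalize("NFC") : sequence;

  if (!caseSensitive) {
    // Approximation: lower-casing both sides instead of true Unicode case folding.
    haystack = haystack.toLocaleLowerCase(locale);
    needle = needle.toLocaleLowerCase(locale);
  }
  // Assumption: an empty sequence never matches.
  if (needle === "") return 0;

  // Overlapping search resumes one position later; otherwise it skips the whole match.
  const step = overlapping ? 1 : needle.length;
  let count = 0;
  let index = haystack.indexOf(needle);
  while (index !== -1) {
    count += 1;
    index = haystack.indexOf(needle, index + step);
  }
  return count;
}

console.log(sketchCountOccurrences("aaaa", "aa")); // 2 (matches at 0 and 2)
console.log(sketchCountOccurrences("aaaa", "aa", { overlapping: true })); // 3 (matches at 0, 1, 2)
```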
countSequenceOccurrences("dolor Dolor dolor", "dolor") // 2
countSequenceOccurrences("dolor Dolor dolor", "dolor", { caseSensitive: false }) // 3
countSequenceOccurrences("aaaa", "aa") // 2
countSequenceOccurrences("aaaa", "aa", { overlapping: true }) // 3

Count how many times each word occurs in a text. Words are detected with
Intl.Segmenter, so punctuation is excluded and contractions are kept as one
word. Returns a Map<string, number> sorted by count in descending order.
options.caseSensitive: defaults to true. Pass false for typical natural-language frequency analysis where "The" and "the" should be treated as the same word.
options.locale: BCP 47 locale tag passed to Intl.Segmenter and used for case folding.
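A frequency map of this kind can be sketched with Intl.Segmenter at "word" granularity, keeping only segments flagged isWordLike. sketchWordFrequency is a hypothetical helper, not the library's code; lower-casing approximates case folding, and ordering ties by first appearance is an assumption.

```javascript
// Hypothetical helper (not part of text-analyzer): word counts sorted by frequency.
function sketchWordFrequency(text, options = {}) {
  const { caseSensitive = true, locale } = options;
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  const counts = new Map();

  for (const { segment, isWordLike } of segmenter.segment(text)) {
    // Spaces and punctuation are segments too, but they are not word-like.
    if (!isWordLike) continue;
    const word = caseSensitive ? segment : segment.toLocaleLowerCase(locale);
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }

  // Array.prototype.sort is stable, so equal counts keep first-seen order.
  return new Map([...counts].sort((a, b) => b[1] - a[1]));
}

const frequency = sketchWordFrequency("The cat sat on the mat.", { caseSensitive: false, locale: "en" });
console.log([...frequency.keys()].join(",")); // the,cat,sat,on,mat
console.log(frequency.get("the")); // 2
```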
getWordFrequency("The cat sat on the mat.")
// Map { "The" => 1, "cat" => 1, "sat" => 1, "on" => 1, "the" => 1, "mat" => 1 }
getWordFrequency("The cat sat on the mat.", { caseSensitive: false })
// Map { "the" => 2, "cat" => 1, "sat" => 1, "on" => 1, "mat" => 1 }

Compute the average length of words in a text. Returns 0 when the text
contains no words. Word splitting is whitespace-based, matching countWords.
options.unit: passed to countCharacters ("grapheme" by default).
options.locale: passed to countCharacters.
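Since splitting is whitespace-based, attached punctuation presumably counts toward a word's length. The sketch below illustrates that reading with grapheme counting; sketchAverageWordLength is a hypothetical helper and the punctuation behaviour is an inference from the description, not something confirmed for the library.

```javascript
// Hypothetical helper (not part of text-analyzer): mean word length in graphemes.
function sketchAverageWordLength(text, locale) {
  const words = text.trim().split(/\s+/u).filter((word) => word !== "");
  if (words.length === 0) return 0; // no words: avoid dividing by zero

  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  const totalLength = words.reduce(
    (sum, word) => sum + [...segmenter.segment(word)].length,
    0,
  );
  return totalLength / words.length;
}

console.log(sketchAverageWordLength("aa bbb cccc")); // 3 ((2 + 3 + 4) / 3)
console.log(sketchAverageWordLength("hi, yo")); // 2.5 ("hi," counts as 3)
```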
getAverageWordLength("aa bbb cccc") // 3

Estimate the reading time for a text.
options.wordsPerMinute: reading speed. Must be greater than 0. Defaults to 200.
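The arithmetic behind the estimate is simple: minutes = words / wordsPerMinute, and milliseconds = minutes × 60,000. The sketch below reproduces it with a whitespace-based word count. sketchReadingTime is a hypothetical helper; throwing a RangeError for a non-positive speed is an assumption, since the text only says the value must be greater than 0.

```javascript
// Hypothetical helper (not part of text-analyzer): estimates reading time.
function sketchReadingTime(text, options = {}) {
  const { wordsPerMinute = 200 } = options;
  // Assumption: invalid speeds are rejected with a RangeError.
  if (!(wordsPerMinute > 0)) {
    throw new RangeError("wordsPerMinute must be greater than 0");
  }

  const trimmed = text.trim();
  const words = trimmed === "" ? 0 : trimmed.split(/\s+/u).length;

  return {
    words,
    minutes: words / wordsPerMinute,
    // Multiply before dividing to keep whole-number results exact.
    milliseconds: (words * 60000) / wordsPerMinute,
  };
}

console.log(sketchReadingTime("one two three"));
// { words: 3, minutes: 0.015, milliseconds: 900 }
console.log(sketchReadingTime("one two three", { wordsPerMinute: 100 }));
// { words: 3, minutes: 0.03, milliseconds: 1800 }
```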
getReadingTime("one two three")
// { words: 3, minutes: 0.015, milliseconds: 900 }
getReadingTime("one two three", { wordsPerMinute: 100 })
// { words: 3, minutes: 0.03, milliseconds: 1800 }

License: MIT