Sparklybadge024/Recognizing-text

Unicode Script Recognition

This project provides a JavaScript dataset and helper code for recognizing and categorizing text by Unicode script.
It maps characters to their writing systems (e.g., Latin, Arabic, Cyrillic, Devanagari) based on the Unicode 10.0 database and related sources.

📖 Features

  • Detects which script a given character belongs to.
  • Includes metadata for each script:
    • Name (e.g., Arabic, Latin, Cyrillic)
    • Unicode ranges
    • Writing direction (ltr, rtl, ttb)
    • Approximate year of origin
    • Whether the script is still in use
    • Reference link (Wikipedia)
  • Supports more than 140 world writing systems.
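
To make the metadata list above concrete, here is a rough sketch of what a single entry might look like. The `name`, `ranges`, `direction`, and `year` fields appear in the usage example in this README; the `living` and `link` field names are assumptions, and the single range shown simply covers the whole Bengali Unicode block (U+0980 to U+09FF) rather than the exact ranges stored in the dataset.

```javascript
// Hypothetical shape of one SCRIPTS entry.
// `living` and `link` are assumed field names, not confirmed by the dataset.
const bengali = {
  name: "Bengali",
  // Half-open [from, to) code point ranges; here the whole Bengali block.
  ranges: [[0x0980, 0x0a00]],
  direction: "ltr", // one of "ltr", "rtl", "ttb"
  year: 1050,       // approximate year of origin
  living: true,     // whether the script is still in use
  link: "https://en.wikipedia.org/wiki/Bengali_alphabet",
};

// True when the code point falls inside any of the entry's ranges.
function inScript(script, code) {
  return script.ranges.some(([from, to]) => code >= from && code < to);
}

console.log(inScript(bengali, "শ".codePointAt(0))); // → true
console.log(inScript(bengali, "A".codePointAt(0))); // → false
```

Storing ranges as half-open pairs means adjacent ranges can share a boundary value without overlapping.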

🚀 Usage

Example

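// Example: find which script a character belongs to.
// Assumes the SCRIPTS array from this project is already loaded.
// `code` is a Unicode code point; each range is treated as half-open: [from, to).
function characterScript(code) {
  for (const script of SCRIPTS) {
    if (script.ranges.some(([from, to]) => code >= from && code < to)) {
      return script;
    }
  }
  return null; // no script in the dataset covers this code point
}

// codePointAt also works for characters outside the Basic Multilingual Plane
console.log(characterScript("শ".codePointAt(0)));
// → { name: "Bengali", direction: "ltr", year: 1050, ... }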

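Detect script direction

const sample = "مرحبا"; // Arabic text
const script = characterScript(sample.codePointAt(0));
// characterScript returns null when nothing matches, so guard before reading direction
console.log(`Direction: ${script ? script.direction : "unknown"}`);
// → Direction: rtl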
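
The examples above look only at the first character of a string. For mixed text, one possible approach is to count characters per script and take the direction of the most frequent one. The sketch below is an illustration, not part of this project: `countScripts` and `dominantDirection` are hypothetical helper names, and the two-entry `SCRIPTS` array is a stand-in for the real dataset.

```javascript
// Stand-in data for illustration only; the real SCRIPTS dataset has many more entries.
const SCRIPTS = [
  { name: "Latin", ranges: [[65, 91], [97, 123]], direction: "ltr" },
  { name: "Arabic", ranges: [[0x0600, 0x0700]], direction: "rtl" },
];

function characterScript(code) {
  for (const script of SCRIPTS) {
    if (script.ranges.some(([from, to]) => code >= from && code < to)) {
      return script;
    }
  }
  return null;
}

// Hypothetical helper: count characters per script name.
// Characters that match no script (spaces, punctuation) are skipped.
function countScripts(text) {
  const counts = new Map();
  for (const char of text) { // for...of iterates by code point, not UTF-16 unit
    const script = characterScript(char.codePointAt(0));
    if (script) counts.set(script.name, (counts.get(script.name) || 0) + 1);
  }
  return counts;
}

// Hypothetical helper: direction of the most frequent script, or null if none matched.
function dominantDirection(text) {
  let best = null;
  let bestCount = 0;
  for (const [name, count] of countScripts(text)) {
    if (count > bestCount) {
      best = name;
      bestCount = count;
    }
  }
  const script = SCRIPTS.find((s) => s.name === best);
  return script ? script.direction : null;
}

console.log(dominantDirection("Hi مرحبا")); // → rtl
```

Iterating with `for...of` rather than by index keeps characters encoded as surrogate pairs intact, which matters for scripts outside the Basic Multilingual Plane.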


Applications

  • Language/text recognition
  • Syntax highlighting
  • Internationalization (i18n)
  • Font rendering
  • Digital humanities research
