easyocr.js

A complete native JavaScript port of EasyOCR for Node.js and the Browser.

Unlike wrapper libraries that spawn Python processes, easyocr.js implements the entire OCR pipeline natively in TypeScript — from image preprocessing to CRAFT text detection, CRNN recognition, and CTC decoding. It runs entirely in JavaScript environments using ONNX Runtime, with zero Python dependencies.

✨ Why easyocr.js?

                 | Other JS EasyOCR Libraries              | easyocr.js
Architecture     | Python wrapper (spawns child processes) | Native JavaScript implementation
Python Required  | Yes (Python 3.6+, pip, venv)            | No
Browser Support  | Node.js only                            | Node.js + Browser
Performance      | Process spawn overhead                  | Direct ONNX inference
Deployment       | Complex (Python + Node)                 | Simple npm install

🚀 Key Strengths

  • 🥇 First True JS Port: The first complete native JavaScript/TypeScript implementation of EasyOCR — not a wrapper, but a full port of the detection and recognition pipeline.
  • 📊 Numerical Parity: Achieves equivalent accuracy to the Python reference implementation through comprehensive validation and tracing.
  • 🌐 Universal Runtime: Works seamlessly in both Node.js and the Browser using runtime-agnostic core logic with swappable backends.
  • ⚡ Hardware Acceleration: Leverages ONNX Runtime for CPU/GPU acceleration without platform-specific bindings.
  • 🔧 TypeScript First: Full type safety, modular architecture, and modern JavaScript patterns throughout.
  • 🎯 Model Compatible: Uses the same CRAFT detector and recognition models as the original Python EasyOCR (80+ languages supported).
  • 📦 Zero Python Dependencies: Deploy anywhere JavaScript runs — no Python installation, virtual environments, or pip packages required.

Installation

Understanding the Package Structure

  • @qduc/easyocr-core: Shared types, pipeline logic, and image processing (required for all runtimes)
  • @qduc/easyocr-node: Node.js runtime implementations using sharp for images and onnxruntime-node for inference
  • @qduc/easyocr-web: Browser runtime implementations using Canvas APIs and onnxruntime-web

When to use each:

  • Use @qduc/easyocr-node for server-side OCR, CLI tools, or desktop applications running on Node.js
  • Use @qduc/easyocr-web for in-browser OCR without server dependencies

Node.js

npm install @qduc/easyocr-node @qduc/easyocr-core
# or
yarn add @qduc/easyocr-node @qduc/easyocr-core
# or
bun add @qduc/easyocr-node @qduc/easyocr-core

Requirements:

  • Node.js 16+ (18+ recommended)
  • sharp will be installed as a dependency (may require system libs on Linux)

Browser

npm install @qduc/easyocr-web @qduc/easyocr-core

Note: For browser usage, you need to host or point to the onnxruntime-web WASM files. See Browser Integration for details.

Quick Start (Node.js)

Basic Usage (Simplified)

import { createOCR } from '@qduc/easyocr-node';

async function run() {
  const ocr = await createOCR({
    modelDir: './models',
    lang: 'en', // or langs: ['en', 'ch_sim']
  });

  const results = await ocr.read('path/to/image.png');
  for (const item of results) {
    console.log(`Text: ${item.text}`);
    console.log(`Confidence: ${(item.confidence * 100).toFixed(1)}%`);
  }
}

run().catch(console.error);

Note: modelDir must contain onnx/ models and the matching .charset.txt files (see Getting the Models).

Advanced Usage (Manual Setup)

import {
  loadImage,
  loadDetectorModel,
  loadRecognizerModel,
  recognize,
  loadCharset
} from '@qduc/easyocr-node';

async function run() {
  // 1. Load your image (PNG, JPG, etc.)
  const image = await loadImage('path/to/image.png');

  // 2. Load the detector model
  // On first run, this auto-downloads from GitHub Releases (~200MB)
  // Subsequent runs use the cached copy from models/onnx/
  const detector = await loadDetectorModel('models/onnx/craft_mlt_25k.onnx');

  // 3. Load the recognizer model and charset
  // For English text:
  const charset = await loadCharset('models/english_g2.charset.txt');
  const recognizer = await loadRecognizerModel('models/onnx/english_g2.onnx', {
    charset,
    // textInputName is optional; most current recognizer ONNX exports only have a single image input.
  });

  // 4. Run OCR
  const results = await recognize({
    image,
    detector,
    recognizer,
  });

  // 5. Process results
  for (const item of results) {
    console.log(`Text: ${item.text}`);
    console.log(`Confidence: ${(item.confidence * 100).toFixed(1)}%`);
    // item.box is a 4-point polygon: [[x1,y1], [x2,y2], [x3,y3], [x4,y4]]
  }
}

run().catch(console.error);

Using a Different Language

With createOCR, just pass the language(s) and the correct model is selected automatically:

const ocr = await createOCR({ modelDir: './models', langs: ['en', 'ch_sim'] });

To recognize text in a different language manually, load the corresponding model and charset:

// For Chinese (Simplified)
const zhCharset = await loadCharset('models/zh_sim_g2.charset.txt');
const zhRecognizer = await loadRecognizerModel('models/onnx/zh_sim_g2.onnx', {
  charset: zhCharset,
  // textInputName: '...', // optional (only if the ONNX model exposes a second "text"/token input)
});

// For Japanese
const jaCharset = await loadCharset('models/japanese_g2.charset.txt');
const jaRecognizer = await loadRecognizerModel('models/onnx/japanese_g2.onnx', {
  charset: jaCharset,
  // textInputName: '...', // optional
});

// See Supported Models section for all available languages

Configuration Options

The recognize function accepts an optional options object:

const results = await recognize({
  image,
  detector,
  recognizer,
  options: {
    langList: ['en', 'ch_sim'], // Filter characters by language
    allowlist: '0123456789',    // Only recognize these characters
    blocklist: 'XYZ',           // Exclude these characters
    paragraph: true,            // Combine results into paragraphs
    canvasSize: 2560,          // Max canvas dimension (default: 2560)
  },
});

Processing Multiple Images

async function processMultiple(imagePaths: string[]) {
  // Load models once
  const detector = await loadDetectorModel('models/onnx/craft_mlt_25k.onnx');
  const charset = await loadCharset('models/english_g2.charset.txt');
  const recognizer = await loadRecognizerModel('models/onnx/english_g2.onnx', {
    charset,
    // textInputName: '...', // optional
  });

  // Process images sequentially (for parallelism, use Promise.all with caution on memory)
  for (const path of imagePaths) {
    const image = await loadImage(path);
    const results = await recognize({ image, detector, recognizer });
    console.log(`${path}: ${results.map(r => r.text).join(' ')}`);
  }
}

Error Handling

import type { OcrResult } from '@qduc/easyocr-core';

try {
  const image = await loadImage('image.png');
  const results = await recognize({ image, detector, recognizer });

  // Filter high-confidence results
  const confident: OcrResult[] = results.filter(r => r.confidence > 0.5);
  console.log(`Found ${confident.length} confident detections`);
} catch (error) {
  if (error instanceof Error) {
    if (error.message.includes('ENOENT')) {
      console.error('Image file not found');
    } else if (error.message.includes('model')) {
      console.error('Failed to load model - check models/onnx/ directory');
    } else {
      console.error('OCR error:', error.message);
    }
  }
}

Understanding the Output Format

Each OcrResult contains:

  • text: Recognized text string
  • confidence: Confidence score (0.0 to 1.0)
  • box: 4-point polygon coordinates as [[x1,y1], [x2,y2], [x3,y3], [x4,y4]] in pixel coordinates relative to the original image
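
A minimal consumption sketch (helper names are illustrative; types match the exports listed under TypeScript Types below):

import type { OcrResult } from '@qduc/easyocr-core';

// Collapse the 4-point polygon into an axis-aligned { x, y, width, height } rectangle,
// e.g. for drawing overlays or cropping.
function toAxisAlignedRect(box: OcrResult['box']) {
  const xs = box.map(([x]) => x);
  const ys = box.map(([, y]) => y);
  const x = Math.min(...xs);
  const y = Math.min(...ys);
  return { x, y, width: Math.max(...xs) - x, height: Math.max(...ys) - y };
}

function summarize(results: OcrResult[]) {
  for (const r of results) {
    const rect = toAxisAlignedRect(r.box);
    console.log(`${r.text} (${(r.confidence * 100).toFixed(1)}%)`, rect);
  }
}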

Browser Integration

Setup (React/Vue/Svelte example)

When using @qduc/easyocr-web, you need to handle WASM paths and runtime differences.

import * as ort from 'onnxruntime-web';
import { loadImage, recognize } from '@qduc/easyocr-web';
import type { RasterImage, OcrResult } from '@qduc/easyocr-core';

// Configure WASM path (choose one method):
// Method 1: Using CDN
ort.env.wasm.wasmPaths = 'https://cdn.jsdelivr.net/npm/onnxruntime-web@latest/dist/';

// Method 2: Host locally or use your bundler's output
// ort.env.wasm.wasmPaths = '/path/to/your/wasm/';

// Usage is similar to Node, but loadImage works with different input types:
// - File objects from <input type="file">
// - Blob objects
// - HTMLImageElement
// - Canvas/OffscreenCanvas

// `detector` and `recognizer` are assumed to be loaded already (see Getting the Models below).
async function recognizeFromFileInput(fileInput: File): Promise<OcrResult[]> {
  const image = await loadImage(fileInput);
  return recognize({ image, detector, recognizer });
}

Note: Model loading in browser requires either:

  1. Models hosted on a CORS-enabled server
  2. Bundled models using your build tool
  3. Pre-loaded ArrayBuffers passed to model loaders
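
A minimal sketch of options 1 and 3 combined, assuming you host the files yourself at a CORS-enabled base URL (MODEL_BASE is a placeholder); the bundled fetchModel helper described below adds Git LFS pointer detection on top of this:

import { loadDetectorModel, loadRecognizerModel, loadCharset } from '@qduc/easyocr-web';

const MODEL_BASE = 'https://your-host.example.com/models'; // placeholder: wherever you host the files

// Fetch the ONNX files as ArrayBuffers and hand them to the model loaders.
const detectorBytes = await (await fetch(`${MODEL_BASE}/onnx/craft_mlt_25k.onnx`)).arrayBuffer();
const detector = await loadDetectorModel(detectorBytes);

const charset = await loadCharset(`${MODEL_BASE}/english_g2.charset.txt`);
const recognizerBytes = await (await fetch(`${MODEL_BASE}/onnx/english_g2.onnx`)).arrayBuffer();
const recognizer = await loadRecognizerModel(recognizerBytes, { charset });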

Getting the Models

This package requires ONNX models and character sets to operate.

Model Manifest (Machine-Readable)

This repo ships a machine-readable model catalog at models/manifest.json (also published as @qduc/easyocr-core/models/manifest.json).

The manifest lists per-model metadata (modelName, languages, charsetFile, onnxFile, textInputName, sha256, size) and is versioned alongside @qduc/easyocr-core via packageVersion, so integrators can rely on stable model availability per package version.

In Node.js (ESM), you can import it directly (Node 20+):

import manifest from '@qduc/easyocr-core/models/manifest.json' assert { type: 'json' };

console.log(manifest.packageVersion);
console.log(manifest.models.map((m) => m.modelName));

If your environment/bundler doesn’t support JSON import assertions, you can also load it as a file (Node) or fetch it (Web) from wherever you host it.
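
For example, a Node fallback might look like this (a sketch; it assumes the manifest is resolvable as the package subpath used in the import above):

import { readFile } from 'node:fs/promises';
import { createRequire } from 'node:module';

// Resolve the manifest inside the installed @qduc/easyocr-core package and parse it manually.
const require = createRequire(import.meta.url);
const manifestPath = require.resolve('@qduc/easyocr-core/models/manifest.json');
const manifest = JSON.parse(await readFile(manifestPath, 'utf8'));

console.log(manifest.packageVersion);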

Programmatic Language + Model Selection (Core)

To avoid hardcoding language lists or re-implementing guessModel logic in every app:

import { getSupportedLanguages, resolveModelForLanguage } from '@qduc/easyocr-core';

// 1) List languages for UI dropdowns
const supported = getSupportedLanguages();

// 2) Resolve the right model + charset for a language code (aliases supported)
const resolved = resolveModelForLanguage('zh-cn');
// => { model: 'zh_sim_g2', charset: 'zh_sim_g2.charset.txt' }

Where Models Go

Models should be placed in the models/onnx/ directory (relative to your project root or the working directory when running the code):

project/
├── models/
│   ├── onnx/
│   │   ├── craft_mlt_25k.onnx          (detector, ~200MB)
│   │   ├── english_g2.onnx             (recognizer)
│   │   ├── zh_sim_g2.onnx              (recognizer)
│   │   └── ...
│   ├── english_g2.charset.txt
│   ├── zh_sim_g2.charset.txt
│   └── ...
└── src/
    └── your-app.ts

How to Get Models

1. Automatic Download (Node.js only)

@qduc/easyocr-node will automatically download missing models from GitHub Releases on first use:

// This will download models/onnx/craft_mlt_25k.onnx if not found locally
const detector = await loadDetectorModel('models/onnx/craft_mlt_25k.onnx');

2. Manual Download from Releases

Download .onnx and .charset.txt files from the GitHub Releases page and place them in the models/ directory.

3. Manual Export from PyTorch

If you have the original PyTorch weights from EasyOCR:

# 1. Set up Python environment
uv venv
source .venv/bin/activate

# 2. Install dependencies
uv pip install -r python_reference/requirements.txt

# 3. Export and validate models
python models/export_onnx.py --detector --recognizer --validate

See models/README.md for more export options.

Supported Models

The table below is a human-readable summary. For a canonical list (including hashes + sizes), use the manifest: models/manifest.json.

Language             | Recognition Model | Charset File             | Notes
English              | english_g2.onnx   | english_g2.charset.txt   | Default model
Latin                | latin_g2.onnx     | latin_g2.charset.txt     | Covers European languages
Chinese (Simplified) | zh_sim_g2.onnx    | zh_sim_g2.charset.txt    | Mainland China
Japanese             | japanese_g2.onnx  | japanese_g2.charset.txt  | Hiragana, Katakana, Kanji
Korean               | korean_g2.onnx    | korean_g2.charset.txt    | Hangul
Cyrillic             | cyrillic_g2.onnx  | cyrillic_g2.charset.txt  | Russian, Ukrainian, etc.
Telugu               | telugu_g2.onnx    | telugu_g2.charset.txt    | South Indian language
Kannada              | kannada_g2.onnx   | kannada_g2.charset.txt   | South Indian language

Detector: all languages share the same craft_mlt_25k.onnx model (multilingual text detection).

Web: Canonical Model Base URL + LFS Pointer Safety

In the browser you must host models on a CORS-enabled origin. This project exports a default, CORS-safe base URL and a fetch helper that detects Git LFS pointer files:

import {
  fetchModel,
  getDefaultModelBaseUrl,
  loadDetectorModel,
  loadRecognizerModel,
  loadCharset,
  resolveModelForLanguage,
} from '@qduc/easyocr-web';

const baseUrl = getDefaultModelBaseUrl();
const { model, charset, textInputName } = resolveModelForLanguage('ja');

// Use fetchModel() so you get a clear error if a URL returns a Git LFS pointer.
const detectorBytes = await fetchModel(`${baseUrl}/onnx/craft_mlt_25k.onnx`);
const recognizerBytes = await fetchModel(`${baseUrl}/onnx/${model}.onnx`);

const detector = await loadDetectorModel(detectorBytes);
const charsetText = await loadCharset(`${baseUrl}/${charset}`);
const recognizer = await loadRecognizerModel(recognizerBytes, { charset: charsetText, textInputName });

Avoid raw.githubusercontent.com for .onnx files tracked with Git LFS; it can return a tiny pointer file rather than the binary.

Repository Structure

  • packages/core: Runtime-agnostic types, pipeline logic, image processing, and post-processing. Use this for type definitions.

  • packages/node: Node.js implementations using sharp for image loading and onnxruntime-node for inference.

  • packages/web: Browser implementations using Canvas APIs and onnxruntime-web.

  • examples: Sample code for Node.js and browser usage.

  • models: Model assets and export scripts (see models/README.md).

  • python_reference: Original EasyOCR implementation and validation tools.

TypeScript Types

If you need to work with types (e.g., for custom implementations):

import type {
  RasterImage,     // Image data with width, height, channels
  OcrResult,       // Detection result with text, confidence, box
  DetectorModel,   // Loaded detector model
  RecognizerModel, // Loaded recognizer model
  OcrOptions,      // Recognition options
  Box,             // 4-point polygon coordinates
  Point,           // [x, y] coordinate
} from '@qduc/easyocr-core';

See packages/core/src/types.ts for full type definitions.

Development

This is a monorepo using Bun workspaces.

Getting Started

# Install all dependencies
bun install

# Build all packages (TypeScript → dist/)
bun run build

# Run tests
bun run test

Updating the Model Manifest

If you add/remove models in models/, regenerate the manifest:

node scripts/generate_models_manifest.mjs

The release process also regenerates this manifest so model availability stays tied to the published @qduc/easyocr-core version.

Working on a Single Package

# Build only @qduc/easyocr-node
bun run -F @qduc/easyocr-node build

# Test only @qduc/easyocr-core
bun run -F @qduc/easyocr-core test

Running Examples

# Node.js example (requires built packages)
node examples/node-ocr.mjs <image-path>

# Or with TypeScript directly
bun examples/node-ocr.ts <image-path>

Debugging

View debug traces:

Both the Python reference and the JS implementation can emit detailed traces for comparison:

# Generate JS trace
EASYOCR_DEBUG=1 node examples/node-ocr.mjs image.png > debug_output/js/trace.json

# Generate Python trace
python python_reference/trace_easyocr.py image.png > debug_output/py/trace.json

# Compare traces
python python_reference/validation/diff_traces.py debug_output/py/trace.json debug_output/js/trace.json

See python_reference/validation/README.md for detailed validation instructions.

Common Issues:

Problem             | Solution
Models not found    | Make sure the models/onnx/ directory exists and run from the repo root
ONNX Runtime errors | Ensure a supported Node/browser version; check onnxruntime-node compatibility
Image loading fails | Verify the image format (PNG, JPG, WebP) and path; try absolute paths
Low accuracy        | Check that the correct language model and charset are used; try disabling the langList filter

FAQ

Q: Why do I need both @qduc/easyocr-core and @qduc/easyocr-node (or @qduc/easyocr-web)?

A: @qduc/easyocr-core contains the shared types and pipeline logic (runtime-agnostic). The node/web packages provide runtime-specific implementations for loading images and running inference.

Q: Can I use this in production?

A: Yes, but be aware of performance considerations. OCR is computationally intensive; on CPU, expect 1-5s per image depending on size. GPU acceleration via ONNX Runtime can improve performance significantly.

Q: Is this a Python wrapper?

A: No. Unlike wrapper libraries that spawn Python processes (requiring Python installation), easyocr.js is a complete native port. Every stage — image preprocessing, CRAFT text detection, box merging, perspective warping, CRNN recognition, and CTC decoding — is implemented in TypeScript and runs via ONNX Runtime.

Q: How accurate is this compared to the Python version?

A: The JS port achieves numerical parity with Python EasyOCR. We maintain a comprehensive validation harness that compares intermediate pipeline outputs (heatmaps, boxes, logits) against the Python reference to ensure identical behavior. See python_reference/validation/README.md for methodology.

Q: Can I use custom models?

A: Currently, only CRAFT (detector) and the provided g2 (recognizer) models are supported. Custom models require code changes.

Q: Why is the first run slow?

A: On the first run, models are downloaded from GitHub Releases (100-300MB total) and optionally quantized. Subsequent runs use cached models.

Q: How do I handle multiple languages in one image?

A: Use the langList option to specify multiple language codes. The recognizer will attempt to recognize characters from all specified languages. Note: You still need to load a single recognizer model; the language filter only affects which characters are accepted.
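
For example (a sketch reusing the manual setup from Quick Start; image, detector, and recognizer are assumed to be loaded as shown there):

const results = await recognize({
  image,
  detector,
  recognizer,
  options: {
    // Accept characters from both languages during decoding;
    // the loaded recognizer model itself does not change.
    langList: ['en', 'ch_sim'],
  },
});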

Q: Can I run OCR in a Worker thread (Node.js or Browser)?

A: Yes, but ONNX Runtime sessions may not be sharable across threads. Test thoroughly and consider creating separate model instances per worker.
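
A minimal Node.js sketch (file names are illustrative; each worker creates its own OCR instance instead of sharing ONNX sessions across threads):

// main.mjs — spawn a worker per job
import { Worker } from 'node:worker_threads';

function ocrInWorker(imagePath) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(new URL('./ocr-worker.mjs', import.meta.url), {
      workerData: { imagePath },
    });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

const results = await ocrInWorker('path/to/image.png');

// ocr-worker.mjs — owns its own models and sessions
import { parentPort, workerData } from 'node:worker_threads';
import { createOCR } from '@qduc/easyocr-node';

const ocr = await createOCR({ modelDir: './models', lang: 'en' });
const items = await ocr.read(workerData.imagePath);
parentPort?.postMessage(items.map((r) => ({ text: r.text, confidence: r.confidence })));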

Q: What are the system requirements?

A:

  • Node.js: 16+ (18+ recommended)
  • Browser: Modern browsers with Canvas and WebAssembly support (Chrome 74+, Firefox 79+, Safari 14+)
  • Memory: 500MB+ recommended (more for large images or parallel processing)

License

Apache-2.0 (matches the original EasyOCR license).