A high-performance, multi-engine OCR (Optical Character Recognition) system that runs entirely in your browser. No server uploads, no privacy concerns—just fast, secure text extraction.
- 🔒 Privacy First: All processing happens locally on your device. Images never leave your browser.
- ⚡ Multi-Engine Strategy: Choose the best engine for your needs:
- Tesseract.js: The industry standard for general-purpose OCR.
- Transformers.js (TrOCR): State-of-the-art AI accuracy using Transformer models.
- eSearch-OCR (PaddleOCR): High-speed, high-accuracy engine optimized for Chinese/English mixed text.
- EasyOCR.js: EasyOCR models running locally with ONNX Runtime.
- 🌍 Local Translation: Translate extracted text instantly using Bergamot (the engine behind Firefox Translations), keeping everything 100% private and on-device.
- 🖼️ Write-Back Quality Presets: Choose `fast`, `balanced`, or `high-quality` translated-image rendering modes to trade speed against visual fidelity.
- 🔋 Performance Optimized: Uses WebAssembly (WASM), Web Workers, and WebGPU acceleration for near-native speeds.
- 📦 Intelligent Caching: Heavy model files are cached in IndexedDB for instant subsequent loads.
- 🎨 Glassmorphism UI: A modern, clean interface with drag-and-drop, URL, and paste support.
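The engine table below can guide an automatic default. As a minimal sketch (the real selection logic lives in `src/engines/`; `recommendEngine`, `Capabilities`, and `JobHints` are illustrative names, not the project's API), picking an engine might look like:

```typescript
// Illustrative sketch only — not the project's actual selection code.
type EngineId = "tesseract" | "trocr" | "esearch" | "easyocr";

interface Capabilities {
  webgpu: boolean;            // navigator.gpu is available
  sharedArrayBuffer: boolean; // cross-origin-isolated context
}

interface JobHints {
  language: string;       // e.g. "en", "zh-CN"
  preferAccuracy: boolean; // accuracy over speed
}

// Pick a reasonable default engine based on the comparison table.
function recommendEngine(caps: Capabilities, job: JobHints): EngineId {
  if (job.language.startsWith("zh")) return "esearch"; // optimized for Chinese/English
  if (job.preferAccuracy && caps.webgpu) return "trocr"; // Transformer models benefit from WebGPU
  return "tesseract"; // general-purpose fallback, 100+ languages
}
```

Users can always override the default from the UI; the helper only encodes the rules of thumb from the table.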
| Engine | Best For | Tech Stack | Model Size |
|---|---|---|---|
| Tesseract.js | General use, 100+ languages | WASM | ~4.3 MB (eng/fast) |
| Transformers.js | Highest accuracy, modern AI | WebGPU / ONNX | ~40-150 MB |
| eSearch-OCR | Chinese/English, complex layouts | ONNX Runtime | ~7-10 MB |
| EasyOCR.js | Multilingual OCR with EasyOCR | ONNX Runtime | ~110 MB |
- Modern Browser:
  - Basic Support (WASM/Workers): Chrome 92+, Firefox 79+, Safari 15.2+ (required for `SharedArrayBuffer`)
  - WebGPU Acceleration: Chrome 113+, Firefox 121+, Safari 17+
- Node.js: v18 or higher recommended
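These capabilities can be probed at runtime before choosing an engine. A hypothetical detection sketch (the project keeps its own version in `src/utils/`; here the globals are passed in as a parameter so the logic is easy to test outside a browser):

```typescript
// Illustrative feature detection — assumed names, not the project's API.
interface BrowserGlobals {
  SharedArrayBuffer?: unknown;
  navigator?: { gpu?: unknown };
}

interface FeatureReport {
  workersWithSAB: boolean; // multi-threaded WASM needs SharedArrayBuffer
  webgpu: boolean;         // GPU-accelerated ONNX inference
}

function detectFeatures(g: BrowserGlobals): FeatureReport {
  return {
    workersWithSAB: typeof g.SharedArrayBuffer !== "undefined",
    webgpu: typeof g.navigator?.gpu !== "undefined",
  };
}

// In the browser: detectFeatures(globalThis as BrowserGlobals)
```

Note that `SharedArrayBuffer` is only exposed in cross-origin-isolated contexts, so the dev server must send the appropriate COOP/COEP headers.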
1. Clone the repository:
   ```bash
   git clone https://github.com/your-repo/multi-engine-browser-ocr.git
   cd multi-engine-browser-ocr
   ```
2. Install dependencies:
   ```bash
   npm install
   ```
3. Start the development server:
   ```bash
   npm run dev
   ```
Most engines download their models automatically from CDNs (Hugging Face or Tesseract CDN) on their first run and cache them locally. Translation models are also downloaded on demand.
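The download-once flow can be sketched as follows. The real implementation caches in IndexedDB (see `src/utils/`); this sketch substitutes an injectable async key-value store so the cache-then-fetch logic stands on its own (`ModelStore` and `loadModel` are assumed names):

```typescript
// Simplified sketch of the "download once, then serve from cache" flow.
interface ModelStore {
  get(key: string): Promise<ArrayBuffer | undefined>;
  put(key: string, data: ArrayBuffer): Promise<void>;
}

async function loadModel(
  url: string,
  store: ModelStore,
  fetchBytes: (u: string) => Promise<ArrayBuffer>,
): Promise<ArrayBuffer> {
  const cached = await store.get(url);
  if (cached) return cached;        // instant subsequent loads
  const fresh = await fetchBytes(url); // first run: download from the CDN
  await store.put(url, fresh);
  return fresh;
}
```

Keying the cache by URL means bumping a model's URL (or a version suffix) naturally invalidates stale entries.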
By default, eSearch-OCR fetches models from Hugging Face. If you need to use it offline or host models yourself:
- Download models from eSearch-OCR releases.
- Place `det.onnx`, `rec.onnx`, and `ppocr_keys_v1.txt` into `public/models/esearch/`.
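Switching between the Hugging Face CDN and self-hosted files then reduces to changing a base URL. A hypothetical helper (file names come from the list above; `esearchModelUrls` is an illustrative name, not the project's API):

```typescript
// Hypothetical helper for resolving eSearch-OCR model locations.
const ESEARCH_FILES = ["det.onnx", "rec.onnx", "ppocr_keys_v1.txt"] as const;

function esearchModelUrls(base: string): string[] {
  const root = base.endsWith("/") ? base : base + "/";
  return ESEARCH_FILES.map((f) => root + f);
}

// Self-hosted (served by Vite from public/): esearchModelUrls("/models/esearch")
// CDN-hosted: pass the Hugging Face base URL instead.
```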
- `src/engines/`: Implementation of different OCR strategies.
- `src/translation/`: Bergamot-based translation implementation.
- `src/utils/`: Image processing, feature detection, translation utilities, and model caching.
- `src/types/`: Shared TypeScript interfaces.
- `tests/`: Comprehensive test suite using Vitest.
- `docs/`: Technical specifications and decision logs.
- `npm run dev`: Start the Vite development server.
- `npm run build`: Build for production.
- `npm test`: Run all tests once.
- `npm run test:watch`: Run tests in watch mode.
- `npm run lint`: Check for code style issues.
- `npm run format`: Automatically fix formatting.
This application is designed with security as a core principle:
- No Data Collection: Your images are processed entirely in the local browser context. No data is sent to external servers or APIs.
- Offline Capability: Once the models are cached, the engine can function without an active internet connection.
- Open Source: The entire pipeline is transparent and verifiable.
Contributions are welcome! Please see CONTRIBUTING.md, or simply open a Pull Request.
- Technical Specification: Deep dive into architecture and design.
- Decision Log: Rationale behind technical choices.