🚀 Professional Browser OCR System

A high-performance, multi-engine OCR (Optical Character Recognition) system that runs entirely in your browser. Images are never uploaded to a server, so text extraction stays fast and private.

✨ Key Features

🔒 Privacy First: All processing happens locally on your device. Images never leave your browser.
⚡ Multi-Engine Strategy: Choose the best engine for your needs:
- Tesseract.js: The industry standard for general-purpose OCR.
- Transformers.js (TrOCR): State-of-the-art AI accuracy using Transformer models.
- eSearch-OCR (PaddleOCR): High-speed, high-accuracy engine optimized for Chinese/English mixed text.
- EasyOCR.js: EasyOCR models running locally with ONNX Runtime.
🌍 Local Translation: Translate extracted text instantly using Bergamot (the engine behind Firefox Translations), keeping everything 100% private and on-device.
🖼️ Write-Back Quality Presets: Choose fast, balanced, or high-quality translated image rendering modes based on speed vs visual fidelity.
🔋 Performance Optimized: Uses WebAssembly (WASM), Web Workers, and WebGPU acceleration for near-native speeds.
📦 Intelligent Caching: Heavy model files are cached in IndexedDB for instant subsequent loads.
🎨 Glassmorphism UI: A modern, clean interface with drag-and-drop, URL, and paste support.

🛠️ OCR Engines Comparison

| Engine | Best For | Tech Stack | Model Size |
| --- | --- | --- | --- |
| Tesseract.js | General use, 100+ languages | WASM | ~4.3 MB (eng/fast) |
| Transformers.js | Highest accuracy, modern AI | WebGPU / ONNX | ~40-150 MB |
| eSearch-OCR | Chinese/English, complex layouts | ONNX Runtime | ~7-10 MB |
| EasyOCR.js | Multilingual OCR with EasyOCR | ONNX Runtime | ~110 MB |
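
As a rough illustration of how the table might guide a choice, the sketch below picks an engine from a stated goal and a download budget. The `pickEngine` helper, the `Goal` labels, and the fallback rule are hypothetical and not part of the project; only the size figures come from the table (upper end of each range).

```typescript
// Hypothetical selection helper; the size values are copied from the comparison table.
type Goal = "general" | "accuracy" | "cjk" | "multilingual";
type EngineId = "tesseract" | "transformers" | "esearch" | "easyocr";

interface EngineInfo {
  id: EngineId;
  goal: Goal;
  maxDownloadMb: number; // upper end of the size range quoted in the table
}

const ENGINES: readonly EngineInfo[] = [
  { id: "tesseract", goal: "general", maxDownloadMb: 4.3 },
  { id: "transformers", goal: "accuracy", maxDownloadMb: 150 },
  { id: "esearch", goal: "cjk", maxDownloadMb: 10 },
  { id: "easyocr", goal: "multilingual", maxDownloadMb: 110 },
];

// Returns the engine listed for the goal if it fits the budget; otherwise the
// smallest engine that fits, or undefined when nothing fits.
function pickEngine(goal: Goal, budgetMb: number): EngineId | undefined {
  const preferred = ENGINES.find((e) => e.goal === goal);
  if (preferred !== undefined && preferred.maxDownloadMb <= budgetMb) {
    return preferred.id;
  }
  const affordable = ENGINES.filter((e) => e.maxDownloadMb <= budgetMb);
  affordable.sort((a, b) => a.maxDownloadMb - b.maxDownloadMb);
  return affordable[0]?.id;
}
```

With these numbers, asking for `"accuracy"` under a 20 MB budget falls back to `"tesseract"`, because the Transformers.js models can reach roughly 150 MB.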

🚀 Getting Started

Prerequisites

Modern Browser:
- Basic Support (WASM/Workers): Chrome 92+, Firefox 79+, Safari 15.2+. These are the minimum versions that expose SharedArrayBuffer, which also requires the page to be cross-origin isolated.
- WebGPU Acceleration: Chrome 113+ (desktop), Firefox 141+ (Windows), Safari 26+
Node.js: v18 or higher recommended
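
The browser requirements above can be checked at runtime before an engine is loaded. The following is a minimal sketch under stated assumptions: `probeCapabilities`, `chooseTier`, and the tier names are hypothetical, and the fallback order (WebGPU, then threaded WASM, then plain WASM) is an assumption rather than the project's documented behavior. The probe takes the global object as a parameter so it can also be exercised outside a browser.

```typescript
interface Capabilities {
  wasm: boolean;
  workers: boolean;
  sharedMemory: boolean; // SharedArrayBuffer present AND page is cross-origin isolated
  webGpu: boolean;       // navigator.gpu is exposed (an adapter may still be unavailable)
}

type ExecutionTier = "webgpu" | "wasm-threads" | "wasm" | "unsupported";

// Inspects a global-like object (pass globalThis in a browser).
function probeCapabilities(g: Record<string, unknown>): Capabilities {
  const nav = g["navigator"];
  return {
    wasm: typeof g["WebAssembly"] === "object" && g["WebAssembly"] !== null,
    workers: typeof g["Worker"] === "function",
    sharedMemory:
      typeof g["SharedArrayBuffer"] === "function" &&
      g["crossOriginIsolated"] === true,
    webGpu: typeof nav === "object" && nav !== null && "gpu" in nav,
  };
}

// Assumed preference order: GPU first, then multi-threaded WASM, then single-threaded WASM.
function chooseTier(caps: Capabilities): ExecutionTier {
  if (!caps.wasm) return "unsupported";
  if (caps.webGpu) return "webgpu";
  if (caps.workers && caps.sharedMemory) return "wasm-threads";
  return "wasm";
}
```

In a page this could be called as `chooseTier(probeCapabilities(globalThis as unknown as Record<string, unknown>))`. A present `navigator.gpu` does not guarantee a usable GPU, so a real check would also await `navigator.gpu.requestAdapter()` and treat a `null` result as unavailable.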

Installation

Clone the repository:
```
git clone https://github.com/your-repo/multi-engine-browser-ocr.git
cd multi-engine-browser-ocr
```

Install dependencies:
```
npm install
```
Start development server:
```
npm run dev
```

🌍 Model Loading

Most engines download their models automatically from CDNs (Hugging Face or Tesseract CDN) on their first run and cache them locally. Translation models are also downloaded on demand.
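
The download-once, cache-afterwards behavior described here follows a cache-first pattern, sketched below. All names (`ModelStore`, `MemoryStore`, `loadModel`, `Downloader`) are hypothetical. The project caches in IndexedDB; this sketch uses an in-memory store behind the same interface so the logic can run anywhere, and an IndexedDB-backed store would implement the same two methods.

```typescript
// Minimal storage contract; an IndexedDB-backed store would implement the same methods.
interface ModelStore {
  get(key: string): Promise<Uint8Array | undefined>;
  put(key: string, bytes: Uint8Array): Promise<void>;
}

// In-memory stand-in used for illustration only.
class MemoryStore implements ModelStore {
  private readonly entries = new Map<string, Uint8Array>();
  async get(key: string): Promise<Uint8Array | undefined> {
    return this.entries.get(key);
  }
  async put(key: string, bytes: Uint8Array): Promise<void> {
    this.entries.set(key, bytes);
  }
}

// Injected so the network layer (for example fetch) can be swapped or mocked.
type Downloader = (url: string) => Promise<Uint8Array>;

interface LoadedModel {
  bytes: Uint8Array;
  fromCache: boolean;
}

// Cache-first load: return the stored bytes if present, otherwise download and store them.
async function loadModel(
  url: string,
  store: ModelStore,
  download: Downloader,
): Promise<LoadedModel> {
  const cached = await store.get(url);
  if (cached !== undefined) {
    return { bytes: cached, fromCache: true };
  }
  const bytes = await download(url);
  await store.put(url, bytes);
  return { bytes, fromCache: false };
}
```

Keying the cache by URL is the simplest choice; a versioned key would be needed if model files can change under the same URL.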

eSearch-OCR Manual Setup (Optional for Offline)

By default, eSearch-OCR fetches models from Hugging Face. If you need to use it offline or host models yourself:

Download models from eSearch-OCR releases.
Place det.onnx, rec.onnx, and ppocr_keys_v1.txt into public/models/esearch/.
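
Files under `public/` are served by Vite from the site root, so the three files above would be reachable under `/models/esearch/`. The sketch below builds the URLs for either a local or a remote base; `esearchModelUrls` and `LOCAL_ESEARCH_BASE` are hypothetical names, and how the project actually switches between sources is not specified here.

```typescript
interface EsearchModelUrls {
  det: string;  // text detection model
  rec: string;  // text recognition model
  dict: string; // character dictionary
}

// Assumption: Vite serves public/models/esearch/ at this path.
const LOCAL_ESEARCH_BASE = "/models/esearch/";

// Joins a base URL with the three file names listed in the setup steps.
function esearchModelUrls(base: string): EsearchModelUrls {
  const root = base.endsWith("/") ? base : base + "/";
  return {
    det: root + "det.onnx",
    rec: root + "rec.onnx",
    dict: root + "ppocr_keys_v1.txt",
  };
}
```

Calling `esearchModelUrls(LOCAL_ESEARCH_BASE)` yields paths for the self-hosted files; passing a remote base instead would produce the corresponding hosted URLs.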

📂 Project Structure

src/engines/: Implementation of different OCR strategies.
src/translation/: Bergamot-based translation implementation.
src/utils/: Image processing, feature detection, translation utilities, and model caching.
src/types/: Shared TypeScript interfaces.
tests/: Comprehensive test suite using Vitest.
docs/: Technical specifications and decision logs.
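
To illustrate how the engines in `src/engines/` could share one contract from `src/types/`, here is a minimal strategy-pattern sketch. The `OcrEngine` interface, `EngineRegistry`, and `StubEngine` are hypothetical and are not taken from the repository; the real interfaces may differ.

```typescript
interface OcrResult {
  text: string;
  confidence: number; // assumed range: 0 to 1
}

// Common contract each engine would implement (hypothetical).
interface OcrEngine {
  readonly id: string;
  load(): Promise<void>;
  recognize(image: Uint8Array): Promise<OcrResult>;
}

// Keeps engines addressable by id so the UI can switch between them.
class EngineRegistry {
  private readonly engines = new Map<string, OcrEngine>();

  register(engine: OcrEngine): void {
    if (this.engines.has(engine.id)) {
      throw new Error(`Engine already registered: ${engine.id}`);
    }
    this.engines.set(engine.id, engine);
  }

  get(id: string): OcrEngine | undefined {
    return this.engines.get(id);
  }

  ids(): string[] {
    return Array.from(this.engines.keys());
  }
}

// Placeholder engine that performs no recognition; it only reports the input size.
class StubEngine implements OcrEngine {
  readonly id = "stub";
  async load(): Promise<void> {}
  async recognize(image: Uint8Array): Promise<OcrResult> {
    return { text: `(${image.length} bytes received)`, confidence: 0 };
  }
}
```

A caller would register each engine once, look it up by id, then await `load()` followed by `recognize()`.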

🧪 Development Commands

npm run dev: Start Vite development server.
npm run build: Build for production.
npm test: Run all tests once.
npm run test:watch: Run tests in watch mode.
npm run lint: Check for code style issues.
npm run format: Automatically fix formatting.

🛡️ Privacy & Security

This application is designed with security as a core principle:

No Data Collection: Your images are processed entirely in the local browser context. Image data and extracted text are not sent to external servers or APIs; the only network requests are the model downloads described under Model Loading.
Offline Capability: Once the models are cached, the engines can run without an internet connection.
Open Source: The entire pipeline is transparent and verifiable.

🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md (if available) or simply open a Pull Request.

📖 Documentation

Technical Specification: Deep dive into architecture and design.
Decision Log: Rationale behind technical choices.
