vivasa/book-ocr

OCR App (Backend + React UI)

This repo is now a small monorepo:

  • backend/: Flask + Tesseract OCR API (Cloud Run-friendly)
  • frontend/: original React (Vite) UI (single image → /extract)
  • frontend-v2/: “Book OCR (no login)” workflow UI (PDF/images → per-page OCR → proofread → export)

For production deployment, the recommended approach is decoupled:

  • frontend-v2/ on Firebase Hosting
  • backend/ on Cloud Run

See:

Local Development

1) Backend (API)

cd backend
export PORT=8080
export DISABLE_QUOTA=1
./venv/bin/python app.py

2) Frontend (React dev server)

cd frontend
npm install
npm run dev

Open http://localhost:5173/.

frontend/vite.config.js proxies /extract to http://localhost:8080 during development.

Tests (Backend)

cd backend
./venv/bin/python -m pytest -q

Deployment

Recommended runtime env vars:

  • DAILY_LIMIT: daily request limit
  • OCR_LANGUAGE: default OCR language
  • FIRESTORE_DATABASE: Firestore DB name
  • DISABLE_QUOTA: set to 0 in production
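As a rough illustration of how a service might read these variables at startup, here is a minimal sketch. The function name `load_settings`, the returned keys, and every default value are assumptions made for this example; they are not taken from backend/app.py, which may parse them differently.

```python
import os


def load_settings(env=None):
    """Read the runtime variables listed above into a plain dict.

    Hypothetical helper: the defaults below are placeholders chosen for
    this sketch, not values confirmed by the backend.
    """
    env = os.environ if env is None else env
    return {
        # Daily request limit (assumed default of 100).
        "daily_limit": int(env.get("DAILY_LIMIT", "100")),
        # Default OCR language (assumed default of "eng").
        "ocr_language": env.get("OCR_LANGUAGE", "eng"),
        # Firestore DB name; None when the variable is unset.
        "firestore_database": env.get("FIRESTORE_DATABASE"),
        # Quota checks are skipped only when DISABLE_QUOTA is exactly "1".
        "quota_disabled": env.get("DISABLE_QUOTA", "0") == "1",
    }


if __name__ == "__main__":
    print(load_settings())
```

Passing a dict instead of relying on `os.environ` keeps the helper easy to exercise in tests, for example `load_settings({"DISABLE_QUOTA": "1"})` for a local-development configuration.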

API

POST /extract

  • Upload: multipart/form-data request containing the image file
  • Optional: lang query param (tel, kan, hin, eng)

Response:

{ "status": "success", "text": "..." }
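A minimal client sketch for the endpoint above, using only the Python standard library. It assumes the upload form field is named `image` and that the backend is reachable at a base URL such as http://localhost:8080. The helper names `build_extract_request` and `extract_text` are hypothetical and not part of this repo.

```python
import json
import os
import urllib.parse
import urllib.request
import uuid

# Language codes listed in this README.
SUPPORTED_LANGS = ("tel", "kan", "hin", "eng")


def build_extract_request(base_url, image_bytes, filename, lang=None,
                          field_name="image"):
    """Build (but do not send) a POST /extract request.

    Assumption: the multipart field is called "image"; pass a different
    field_name if the backend expects another name.
    """
    if lang is not None and lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported lang: {lang!r}")

    url = base_url.rstrip("/") + "/extract"
    if lang is not None:
        url += "?" + urllib.parse.urlencode({"lang": lang})

    # Hand-rolled multipart/form-data body with a single file part.
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")

    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(
        url, data=head + image_bytes + tail, headers=headers, method="POST"
    )


def extract_text(base_url, image_path, lang=None):
    """Send an image to /extract and return the recognized text."""
    with open(image_path, "rb") as f:
        req = build_extract_request(
            base_url, f.read(), os.path.basename(image_path), lang
        )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError(f"OCR failed: {payload}")
    return payload["text"]
```

With the backend running locally, a call might look like `extract_text("http://localhost:8080", "page.png", lang="tel")`.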

About

Helping digitize India's rich literary heritage, one page at a time. An open workflow for crowd-sourced OCR of Telugu, Kannada, Hindi, and English books — import, extract, proofread, and export with ease.
