Tesseract_OCR Web Application

A Flask-based web application that extracts text from uploaded images using Optical Character Recognition (OCR), supporting both Hindi and English languages.
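As a rough illustration of the bilingual OCR step, the sketch below calls the Tesseract command-line tool with both language packs enabled. It is not taken from app.py: the helper names `build_ocr_command` and `extract_text` are hypothetical, and it assumes the `tesseract` binary is on your PATH with the `hin` and `eng` data installed. Tesseract accepts several languages joined with `+` in its `-l` option.

```python
import subprocess


def build_ocr_command(image_path, languages=("hin", "eng")):
    """Build a Tesseract CLI call that prints recognized text to stdout."""
    # "-l hin+eng" asks Tesseract to load both language models.
    return ["tesseract", str(image_path), "stdout", "-l", "+".join(languages)]


def extract_text(image_path, languages=("hin", "eng")):
    """Run Tesseract on one image and return the recognized text."""
    result = subprocess.run(
        build_ocr_command(image_path, languages),
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=True,  # raise CalledProcessError if Tesseract reports a failure
    )
    return result.stdout.strip()
```

With pytesseract, the same request is usually written as `pytesseract.image_to_string(image, lang="hin+eng")`.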

Features

Bilingual OCR: Extracts text from images containing both Hindi and English text
Image Preprocessing: Applies denoising and thresholding to improve OCR accuracy
Web Interface: User-friendly web interface for uploading images and viewing results
Secure File Handling: Uses secure filename handling to prevent directory traversal attacks
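To make the preprocessing feature more concrete, here is a minimal standard-library sketch of denoising followed by thresholding. The README does not say which algorithms app.py uses, so the 3x3 median filter and Otsu threshold below are assumptions chosen for illustration, and the function names are hypothetical. A real application would more likely call an image library (OpenCV, for example, offers `cv2.medianBlur` and `cv2.threshold` with `cv2.THRESH_OTSU`). The image here is a plain 2D list of grayscale values from 0 to 255.

```python
from statistics import median


def median_denoise(img):
    """Apply a 3x3 median filter; border pixels use the neighbors that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(int(median(window)))
        out.append(row)
    return out


def otsu_threshold(img):
    """Return the threshold t that maximizes between-class variance."""
    hist = [0] * 256
    for row in img:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels with value <= t
    sum_bg = 0  # sum of those pixel values
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is already in the background class
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        variance = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if variance > best_var:
            best_var, best_t = variance, t
    return best_t


def binarize(img, t):
    """Map pixels above t to white (255) and the rest to black (0)."""
    return [[255 if value > t else 0 for value in row] for row in img]
```

The median filter removes isolated speckles before the threshold is computed, so stray noise pixels are less likely to survive binarization and confuse the OCR engine.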

Prerequisites

Before running this application, ensure you have the following installed:

Python 3.7+
Tesseract OCR engine
Hindi language data for Tesseract

Installing Tesseract OCR

Windows:

Download the Tesseract installer from the UB-Mannheim/tesseract project on GitHub
Run the installer
Add Tesseract to your system PATH
Download Hindi language data (hin.traineddata) and place it in the Tesseract tessdata directory

macOS:

brew install tesseract
brew install tesseract-lang

Linux (Ubuntu/Debian):

sudo apt install tesseract-ocr
sudo apt install tesseract-ocr-hin
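
After installing on any platform, you may want to confirm that Tesseract can see both language packs. The sketch below is one way to do that from Python by parsing the output of `tesseract --list-langs`. The helper names are hypothetical, and it assumes `tesseract` is on your PATH (otherwise `subprocess.run` raises `FileNotFoundError`).

```python
import subprocess


def parse_languages(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines()]
    # Skip blank lines and the "List of available languages ..." header.
    return [line for line in lines if line and not line.lower().startswith("list of")]


def missing_languages(required=("eng", "hin")):
    """Return the required language codes that Tesseract does not report."""
    result = subprocess.run(
        ["tesseract", "--list-langs"], capture_output=True, text=True
    )
    # Some Tesseract versions print the list to stderr, so read both streams.
    installed = parse_languages(result.stdout + "\n" + result.stderr)
    return [code for code in required if code not in installed]
```

An empty list from `missing_languages()` means both `eng` and `hin` are available. If `hin` is reported missing, revisit the Hindi language data step for your platform.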

Installation

Clone this repository:

git clone <your-repo-url>
cd <repository-directory>

Create a virtual environment and activate it:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install required dependencies:

pip install -r requirements.txt

Update the Tesseract path in the code if necessary:

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'  # Update this path
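
If you would rather not hard-code the path, one option is to look for the binary at startup. This is only a sketch: `find_tesseract` is a hypothetical helper, and apart from the Windows path shown above, the candidate locations are assumptions about common install directories that you should adjust for your system.

```python
import os
import shutil

# Fallback locations to try when Tesseract is not on the PATH.
# Only the Windows path comes from this README; the rest are guesses.
DEFAULT_CANDIDATES = (
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    "/usr/bin/tesseract",
    "/usr/local/bin/tesseract",
    "/opt/homebrew/bin/tesseract",
)


def find_tesseract(candidates=DEFAULT_CANDIDATES, executable="tesseract"):
    """Return a path to the Tesseract binary, or None if none is found."""
    # Prefer whatever the PATH resolves to.
    on_path = shutil.which(executable)
    if on_path:
        return on_path
    # Otherwise fall back to the first candidate that exists on disk.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

The result could then be assigned with `pytesseract.pytesseract.tesseract_cmd = find_tesseract() or "tesseract"`, which leaves pytesseract's default lookup in place when nothing is found.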

Usage

Start the Flask application:

python app.py

Open your web browser and navigate to http://localhost:5000

Upload an image containing text (Hindi, English, or both)

View the extracted text on the results page
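
The upload step is where the secure file handling mentioned under Features matters. Flask applications commonly sanitize names with Werkzeug's `secure_filename`. Whether app.py does exactly that is an assumption, so the sketch below shows the underlying idea using only the standard library. `safe_upload_path` is a hypothetical helper, not part of this project.

```python
import os


def safe_upload_path(upload_dir, filename):
    """Return a path inside upload_dir for filename, or raise ValueError."""
    # Normalize Windows separators, then keep only the final component,
    # so "../../etc/passwd" is reduced to "passwd".
    name = os.path.basename(filename.replace("\\", "/"))
    if name in ("", ".", ".."):
        raise ValueError("invalid filename: %r" % filename)
    base = os.path.realpath(upload_dir)
    target = os.path.realpath(os.path.join(base, name))
    # Final check: the resolved target must still live under upload_dir
    # (this also catches symlinks that point outside the directory).
    if os.path.commonpath([base, target]) != base:
        raise ValueError("path escapes upload directory: %r" % filename)
    return target
```

For example, `safe_upload_path("static", "../../etc/passwd")` resolves to a file named `passwd` inside `static/` instead of a location outside the upload directory.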

Project Structure

├── app.py                 # Main Flask application
├── templates/
│   ├── index.html        # Home page with upload form
│   └── result.html       # Results display page
├── static/               # Directory for uploaded images
├── requirements.txt      # Python dependencies
└── README.md            # This file
