cvframeiq/Tesseract_OCR

Tesseract_OCR Web Application

A Flask-based web application that extracts text from uploaded images using Optical Character Recognition (OCR), supporting both Hindi and English languages.

Features

  • Bilingual OCR: Extracts text from images containing Hindi, English, or both
  • Image Preprocessing: Applies denoising and thresholding to improve OCR accuracy
  • Web Interface: User-friendly web interface for uploading images and viewing results
  • Secure File Handling: Uses secure filename handling to prevent directory traversal attacks
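The preprocessing and bilingual OCR features could look roughly like the sketch below. It assumes the `opencv-python` and `pytesseract` packages are installed. The helper names, the denoising strength of 10, and the choice of Otsu thresholding are illustrative assumptions, not necessarily what `app.py` does.

```python
import os


def tesseract_lang(codes):
    """Join Tesseract language codes, e.g. ["hin", "eng"] -> "hin+eng"."""
    return "+".join(codes)


def extract_text(image_path, langs=("hin", "eng")):
    """Hypothetical helper: denoise, binarize, then OCR an image file."""
    if not os.path.isfile(image_path):
        raise FileNotFoundError(image_path)

    # Imported here so the module still loads where the OCR stack is absent.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise ValueError(f"not a readable image: {image_path}")

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Non-local means denoising; a strength of 10 is an assumed value.
    denoised = cv2.fastNlMeansDenoising(gray, None, 10)
    # Otsu's method chooses the binarization threshold automatically.
    _, binary = cv2.threshold(
        denoised, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU
    )

    # Tesseract runs several languages at once when codes are joined with "+".
    return pytesseract.image_to_string(binary, lang=tesseract_lang(langs))
```

Calling `extract_text("static/sample.png")` (a hypothetical file) would return the recognized text as a string, provided the Hindi language data is installed.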

Prerequisites

Before running this application, ensure you have the following installed:

  • Python 3.7+
  • Tesseract OCR engine
  • Hindi language data for Tesseract

Installing Tesseract OCR

Windows:

  1. Download the Tesseract installer from the UB-Mannheim/tesseract project
  2. Run the installer
  3. Add Tesseract to your system PATH
  4. Download Hindi language data (hin.traineddata) and place it in the Tesseract tessdata directory

macOS:

brew install tesseract
brew install tesseract-lang

Linux (Ubuntu/Debian):

sudo apt install tesseract-ocr
sudo apt install tesseract-ocr-hin
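After installing, it can help to confirm that Tesseract is on the PATH and that the Hindi data is visible to it. The sketch below is one possible check, not part of this repository. It relies on the `tesseract --list-langs` command, and the function names are made up for illustration.

```python
import shutil
import subprocess


def parse_langs(output):
    """Parse the text printed by `tesseract --list-langs` into a set of codes."""
    lines = [line.strip() for line in output.splitlines()]
    # The first line is a header such as "List of available languages (3):".
    return {
        line for line in lines
        if line and not line.lower().startswith("list of")
    }


def missing_langs(required=("eng", "hin")):
    """Return the required language codes that Tesseract does not report."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True, text=True, check=True,
    )
    # Some older releases print the list to stderr, so read both streams.
    installed = parse_langs(result.stdout + "\n" + result.stderr)
    return sorted(set(required) - installed)
```

An empty list from `missing_langs()` means both English and Hindi are available. If `hin` is in the list, the Hindi language data still needs to be installed.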

Installation

Clone this repository:

git clone <your-repo-url>
cd <repository-directory>

Create a virtual environment and activate it:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install required dependencies:

pip install -r requirements.txt

Update the Tesseract path in the code if necessary:

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'  # Update this path
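Instead of hard-coding the path, the executable could be located at startup. The following is a minimal sketch of that idea. The `TESSERACT_CMD` environment variable and the `resolve_tesseract_cmd` helper are hypothetical names introduced here, not something the application is known to support.

```python
import os
import shutil

# Install location used in the Windows example above.
WINDOWS_DEFAULT = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def resolve_tesseract_cmd(environ=os.environ, which=shutil.which):
    """Pick the executable: env override, then PATH lookup, then Windows default."""
    override = environ.get("TESSERACT_CMD")  # hypothetical variable name
    if override:
        return override
    found = which("tesseract")
    if found:
        return found
    return WINDOWS_DEFAULT


# Possible use in app.py, assuming pytesseract is installed:
# import pytesseract
# pytesseract.pytesseract.tesseract_cmd = resolve_tesseract_cmd()
```

With this approach, macOS and Linux installs that put `tesseract` on the PATH need no code change, and Windows users can set the variable rather than edit the source.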

Usage

Start the Flask application:

python app.py

Open your web browser and navigate to http://localhost:5000

Upload an image containing text (Hindi, English, or both).

View the extracted text on the results page.

Project Structure

├── app.py                 # Main Flask application
├── templates/
│   ├── index.html        # Home page with upload form
│   └── result.html       # Results display page
├── static/               # Directory for uploaded images
├── requirements.txt      # Python dependencies
└── README.md            # This file
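Because uploads are written into `static/`, the filename supplied by the browser should never be trusted as a path. A Flask application would typically use Werkzeug's `secure_filename` for this. The standard-library sketch below only illustrates the underlying idea and is not the application's code. The allowed extensions and the `safe_upload_path` name are assumptions.

```python
import re
from pathlib import Path

ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg"}  # assumed set of image types


def safe_upload_path(upload_dir, filename):
    """Return a path inside upload_dir for filename, or raise ValueError."""
    # Keep only the final component, whichever separator the client used.
    name = filename.replace("\\", "/").split("/")[-1]
    # Replace characters outside a conservative set and drop leading dots.
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name).lstrip(".")
    if not name or Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file name: {filename!r}")

    base = Path(upload_dir).resolve()
    target = (base / name).resolve()
    # Final check: the resolved target must still sit inside upload_dir.
    if base not in target.parents:
        raise ValueError(f"path escapes upload directory: {filename!r}")
    return target
```

For example, a submitted name of `../scan.png` resolves to `scan.png` inside the upload directory, while `../../etc/passwd` is rejected because it has no allowed image extension.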
