Taskmaster-1/Historical-Text-Recognition

Historical Text Recognition (RenAIssance Project)

This project implements an OCR (optical character recognition) system for historical texts, with a focus on early modern printed sources. It combines several deep learning components to transcribe historical documents while handling challenges such as layout variations and text embellishments.

Project Structure

.
├── data/               # Dataset storage
│   ├── raw/           # Original PDF scans
│   ├── processed/     # Processed images and annotations
│   └── transcriptions/# Reference transcriptions
├── notebooks/         # Jupyter notebooks for analysis
├── src/              # Source code
│   ├── data/         # Data processing utilities
│   ├── models/       # Model architectures
│   ├── training/     # Training scripts
│   └── utils/        # Utility functions
├── outputs/          # Model outputs and results
└── requirements.txt  # Project dependencies

Setup

  1. Create a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  2. Install dependencies:
pip install -r requirements.txt

Model Architecture

The OCR system uses a hybrid architecture combining:

  • A CNN backbone for feature extraction
  • A Transformer encoder for sequence modeling
  • A CTC decoder for text recognition
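
As a rough illustration of the last stage, the sketch below shows greedy (best-path) CTC decoding in plain Python: take the highest-scoring class at each time step, merge consecutive repeats, then drop the blank symbol. This is not taken from the repository. The function name ctc_greedy_decode, the alphabet layout, and the choice of index 0 as the blank are assumptions made for the example, and the project may well use a different decoding strategy such as beam search.

```python
from itertools import groupby


def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Best-path CTC decoding (hypothetical helper, not the project's API).

    frame_scores: one list of per-class scores for each time step.
    alphabet:     maps a class index to its character; alphabet[blank]
                  is a placeholder that is never emitted.
    """
    # 1. Pick the highest-scoring class index at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Merge runs of identical consecutive labels into a single label.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Drop blanks; a blank between two equal labels keeps both of them.
    return "".join(alphabet[label] for label in collapsed if label != blank)


if __name__ == "__main__":
    alphabet = ["<blank>", "t", "h", "e"]  # index 0 assumed to be the blank
    path = [1, 1, 0, 2, 3, 3, 0, 3]        # per-frame argmax labels
    scores = [[1.0 if j == i else 0.0 for j in range(4)] for i in path]
    print(ctc_greedy_decode(scores, alphabet))
```

The blank symbol is what lets CTC output doubled letters: without the blank between the last two frames labelled "e", the repeats would be merged into one character.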

Key features:

  • Layout-aware text detection
  • Robust handling of historical fonts and styles
  • Support for early modern English text variations

Evaluation Metrics

The model is evaluated using:

  • Character Error Rate (CER)
  • Word Error Rate (WER)
  • Layout detection accuracy
  • Processing speed
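
For reference, CER and WER are commonly defined as the Levenshtein edit distance between the reference and the predicted transcription, divided by the reference length (in characters for CER, in whitespace-separated words for WER). The sketch below is a minimal pure-Python version under that common definition. The function names are illustrative, and the repository's own evaluation code may normalize text differently, for example in how it treats case or punctuation.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character edits / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate: word edits / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    print(cer("vertue", "virtue"))                # one substitution in six characters
    print(wer("the olde booke", "the old booke")) # one wrong word in three
```

Both rates can exceed 1.0 when the hypothesis is much longer than the reference, since insertions count as errors.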

Usage

  1. Data Preparation:
python src/data/prepare_data.py
  2. Training:
python src/training/train.py
  3. Inference:
python src/inference.py

Results

Model performance metrics and visualizations are stored in the outputs/ directory.

License

This project is licensed under the MIT License - see the LICENSE file for details.
