Deploy Tesseract to AWS Elastic Beanstalk.

Introduction

This repo gives the steps needed to set up the Tesseract OCR engine (version 3.04.01) on an AWS EC2 virtual machine. Alternatively, copy the tess-deploy.sh script to the instance and run it once: sudo bash tess-deploy.sh 😁

[1] SSH to your EC2 instance

ssh <environment_name>
sudo yum update

[2] Dependencies

sudo yum install autoconf automake
sudo yum install libtool
sudo yum install libjpeg-devel libpng-devel libtiff-devel zlib-devel
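Before building, it can help to confirm that the build tools are actually on the PATH. The following is a minimal sketch, not part of the original scripts: missing_tools is a hypothetical helper, and the list of commands (including gcc, g++ and make, which the steps above do not install explicitly) is an assumption about what the builds need.

```shell
# Print the name of each command in "$@" that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed tool list: aclocal ships with automake, libtoolize with libtool.
missing=$(missing_tools gcc g++ make autoconf aclocal automake libtoolize)
if [ -n "$missing" ]; then
    echo "Missing build tools:" $missing
else
    echo "All build tools found"
fi
```

If anything is reported missing, install the package that provides it before moving on to the Leptonica build.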

[3] Install Leptonica

mkdir -p ~/libs && cd ~/libs
mkdir leptonica && cd leptonica
wget http://www.leptonica.com/source/leptonica-1.73.tar.gz
tar -zxvf leptonica-1.73.tar.gz
rm leptonica-1.73.tar.gz
cd leptonica-1.73
./configure
make
sudo make install

[4] Install Tesseract

cd ~
mkdir tesseract && cd tesseract
wget https://github.com/tesseract-ocr/tesseract/archive/3.04.01.tar.gz
tar -zxvf 3.04.01.tar.gz
rm 3.04.01.tar.gz
cd tesseract-3.04.01
./autogen.sh
./configure
make
sudo make install
sudo ldconfig
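Both libraries install into /usr/local/lib by default, and on some distributions that directory is not on the dynamic linker's search path, in which case sudo ldconfig alone will not make them visible. The sketch below checks for that; lib_dir_registered is a hypothetical helper, and the config file locations and the suggested usr-local.conf file name are assumptions based on a typical Linux layout.

```shell
# Return 0 if directory $1 is listed, on a line of its own, in any of the
# linker config files given as the remaining arguments.
lib_dir_registered() {
    dir=$1
    shift
    for conf in "$@"; do
        [ -f "$conf" ] || continue
        if grep -qxF -- "$dir" "$conf"; then
            return 0
        fi
    done
    return 1
}

if ! lib_dir_registered /usr/local/lib /etc/ld.so.conf /etc/ld.so.conf.d/*.conf; then
    echo "/usr/local/lib is not listed in the linker configuration."
    echo "Consider: echo /usr/local/lib | sudo tee /etc/ld.so.conf.d/usr-local.conf && sudo ldconfig"
fi
```

The check only prints a suggestion; it does not change any system files on its own.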

[5] Tesseract Training Data.

cd /usr/local/share/tessdata
sudo wget http://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.eng.tar.gz
sudo tar xvf tesseract-ocr-3.02.eng.tar.gz
sudo rm tesseract-ocr-3.02.eng.tar.gz
export TESSDATA_PREFIX=/usr/local/share/
sudo mv tesseract-ocr/tessdata/* .
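After the move, the English data file should sit directly inside the tessdata directory under TESSDATA_PREFIX. A small sketch to confirm this follows; has_traineddata is a hypothetical helper, and the fallback prefix simply mirrors the path used above.

```shell
# Return 0 if <prefix>/tessdata/<lang>.traineddata exists.
# $1 is the TESSDATA_PREFIX-style parent directory, $2 the language code.
has_traineddata() {
    prefix=${1%/}   # tolerate a trailing slash
    lang=$2
    [ -f "$prefix/tessdata/$lang.traineddata" ]
}

if has_traineddata "${TESSDATA_PREFIX:-/usr/local/share/}" eng; then
    echo "eng.traineddata found"
else
    echo "eng.traineddata missing; re-check the download and move steps"
fi
```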

[6] Set TESSDATA_PREFIX

nano ~/.bash_profile

Then copy this line to the end of the file:

export TESSDATA_PREFIX=/usr/local/share/
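If you would rather not edit the file by hand, for example when scripting the setup, the same line can be appended without creating duplicates on repeated runs. This is a minimal sketch; append_once is a hypothetical helper and is not taken from tess-deploy.sh.

```shell
# Append line $2 to file $1 unless an identical line is already present.
append_once() {
    file=$1
    line=$2
    touch "$file"
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (uncomment to apply it to your own profile):
# append_once ~/.bash_profile 'export TESSDATA_PREFIX=/usr/local/share/'
```

Run source ~/.bash_profile afterwards, or open a new shell, so that the variable takes effect.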

[7] Verify

tesseract
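Running tesseract with no arguments only prints the usage text. For a slightly more informative check, the sketch below prints the version and the installed languages; smoke_test is a hypothetical helper, and it assumes the --version and --list-langs options of the tesseract command line.

```shell
# Print the version and installed languages, or explain what is missing.
# $1 is the binary name (defaults to tesseract).
smoke_test() {
    bin=${1:-tesseract}
    if ! command -v "$bin" >/dev/null 2>&1; then
        echo "$bin not found on PATH" >&2
        return 1
    fi
    "$bin" --version && "$bin" --list-langs
}

smoke_test || echo "Tesseract check failed; review the install steps above"
```

If eng does not appear in the language list, re-check the training data step and the TESSDATA_PREFIX value.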

Notes

(1) - Use grab-train-langs.sh to obtain all language training files, or customize it to your needs.

Credits

Alan Gunning, author of the original blog post.
shantanusingh, author of Tesseract-Amazon-AMI gist.
Abdullah Barrak, upgrade and shell scripts.

nishisahlot/tesseract-on-aws
