Tesseract.jl

This Julia package provides support for performing OCR on scanned images using the Tesseract C library. Tesseract.jl aims to map the Tesseract API directly to Julia, with additional functionality added to fit better into the Julia ecosystem.

using Tesseract

# Generate some pages to load.
write("page01.tiff", sample_tiff())
write("page02.tiff", sample_tiff())
write("page03.tiff", sample_tiff())

# Download the Tesseract English data files
download_languages("eng")

# Initialize the library to generate a text file.
instance = TessInst("eng")
pipeline = TessPipeline(instance)

tess_pipeline_text(pipeline, "My Book.txt")

# Process all the pages in the book.
tess_run_pipeline(pipeline, "My First Book") do add
    add(pix_read("page01.tiff"), 72)
    add(pix_read("page02.tiff"), 72)
    add(pix_read("page03.tiff"), 72)
end

# The results will be saved in "My Book.txt".
println("My Book.txt: $(filesize("My Book.txt")) bytes.")

# output

My Book.txt: 123 bytes.
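
When a book has more than a handful of pages, the repeated `add` calls can be replaced with a loop. The following is a minimal sketch that reuses only the calls shown in the example above. The helper name `ocr_pages_to_text` and its keyword arguments are hypothetical and not part of Tesseract.jl, and the second argument to `add` is assumed to be a resolution because the example passes 72 for every page.

```julia
using Tesseract

# Hypothetical helper (not part of Tesseract.jl): OCR a list of image files
# into a single text file using the pipeline calls from the example above.
function ocr_pages_to_text(pages, output; title = "Untitled", res = 72)
    # Assumes download_languages("eng") has already been run.
    instance = TessInst("eng")
    pipeline = TessPipeline(instance)

    # Register the text output file before running the pipeline.
    tess_pipeline_text(pipeline, output)

    tess_run_pipeline(pipeline, title) do add
        for page in pages
            # `res` is assumed to be a resolution; 72 is the value
            # used in the example above.
            add(pix_read(page), res)
        end
    end
    return output
end

# Only pass along the page files that actually exist on disk.
pages = filter(isfile, ["page01.tiff", "page02.tiff", "page03.tiff"])
ocr_pages_to_text(pages, "My Book.txt"; title = "My First Book")
```

Wrapping the setup in a function keeps the instance and pipeline local to one call, so the same helper can be reused for several books with different page lists.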
