Add OCR to your PDF books and documents easily using Google Colab. This lets you search the text content of PDFs made from scanned images, or copy text from them.
- Set variables in the first cell:
  - Make sure `original_pdf` matches the PDF's file name.
  - Set the correct `lang_code` (ara = Arabic, eng = English, jpn = Japanese, etc.); more codes here.
  - (Optional) Set the first and last pages to OCR only a range, such as a single chapter.
- Upload the PDF, or uncomment `# !wget` in the second cell and set the correct URL. `wget` will make sure the downloaded PDF gets the correct name.
- Done! Run all cells (`Runtime > Run all`), or run them one by one without skipping any.
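As a rough illustration of the variables set in the first cell, here is a minimal Python sketch. Only `original_pdf` and `lang_code` are names taken from this README; `OcrSettings`, `first_page`, `last_page` and `pages` are hypothetical names made up for the example, and the notebook's actual cell may look different.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OcrSettings:
    original_pdf: str                  # must match the uploaded PDF's file name
    lang_code: str = "eng"             # e.g. "ara", "eng", "jpn"
    first_page: Optional[int] = None   # hypothetical name; None = start at page 1
    last_page: Optional[int] = None    # hypothetical name; None = go to the last page

    def pages(self, total_pages: int) -> range:
        """Return the 1-based page numbers to OCR, validating the range."""
        first = 1 if self.first_page is None else self.first_page
        last = total_pages if self.last_page is None else self.last_page
        if not 1 <= first <= last <= total_pages:
            raise ValueError(
                f"Invalid page range {first}-{last} for a {total_pages}-page PDF"
            )
        return range(first, last + 1)


# Example: OCR only pages 5 to 7 of an Arabic book.
settings = OcrSettings("book.pdf", lang_code="ara", first_page=5, last_page=7)
print(list(settings.pages(total_pages=20)))  # [5, 6, 7]
```

Leaving both page settings as `None` selects the whole document, which matches the page range being optional.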
- Easy setup.
- Basic error checking.
- Shows progress.
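To give an idea of what basic error checking and a progress display could look like, here is a small Python sketch. It is not the notebook's actual code: `check_inputs` and `progress_line` are hypothetical helpers, and the specific checks (a `.pdf` extension, an existing file, a non-empty language code) are assumptions.

```python
from pathlib import Path


def check_inputs(pdf_name: str, lang_code: str) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    path = Path(pdf_name)
    if path.suffix.lower() != ".pdf":
        problems.append(f"{pdf_name!r} does not end in .pdf")
    elif not path.is_file():
        problems.append(f"{pdf_name!r} was not found; upload it or check the name")
    if not lang_code.strip():
        problems.append("lang_code is empty")
    return problems


def progress_line(done: int, total: int, width: int = 20) -> str:
    """Format a simple text progress bar (assumes total > 0)."""
    filled = width * done // total
    return f"[{'#' * filled}{'-' * (width - filled)}] {done}/{total} pages"


# Report problems before starting, then print progress while processing.
print(check_inputs("missing_book.pdf", "eng"))
print(progress_line(3, 12))  # [#####---------------] 3/12 pages
```

Collecting every problem in a list, instead of stopping at the first one, lets you fix the file name and the language code in a single pass.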