Tesseract 4.0 hangs when processing a particular image #2288

Open
@lewislun

Description

Environment

  • Tesseract Version: tesseract 4.0.0-beta.1
    leptonica-1.75.3
    libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
  • Platform: Ubuntu 18.04.1 LTS

Current Behavior:

Tesseract hangs when running the following command:
tesseract failed-image.jpeg output.txt

Output message:

Tesseract Open Source OCR Engine v4.0.0-beta.1 with Leptonica
Warning. Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 207

Tesseract neither exits nor prints any further message after that.
Other images work fine; I only have trouble processing this particular image.
I also noticed that the image after preprocessing by Tesseract (or Leptonica?), tessinput.tif linked below, looks weird. I don't know whether that is related.

failed-image.jpeg: https://drive.google.com/open?id=1HsgCbtuNpgf_XxzjkekXU9-uuiWDsV0H
tessinput.tif: https://drive.google.com/open?id=1sE8Nn5rykSWPT6PMF3nFSonPMT9y-H61

Expected Behavior:

Tesseract should either give an error message or finish OCR on the image, even if the image quality is bad.
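
Until that happens, the hang can at least be bounded from the calling side. Below is a minimal sketch, assuming Python 3 and a `tesseract` binary on PATH. The helper name `run_with_timeout` and the 60 second default are my own choices for illustration, not anything Tesseract provides, and this only works around the hang rather than fixing it.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=60):
    """Run cmd and return (status, combined stdout/stderr text).

    status is "ok" (exit code 0), "error" (non-zero exit code),
    or "timeout" (the process was killed after timeout_s seconds).
    The name and the default timeout are hypothetical, chosen for this sketch.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,   # Tesseract prints its messages on stderr
            universal_newlines=True,
            timeout=timeout_s,          # subprocess.run kills the child on expiry
        )
    except subprocess.TimeoutExpired as exc:
        # Partial output may be missing or still bytes, depending on the Python version.
        partial = exc.output or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return "timeout", partial
    status = "ok" if proc.returncode == 0 else "error"
    return status, proc.stdout
```

For the command from this report, the call would be `run_with_timeout(["tesseract", "failed-image.jpeg", "output.txt"])`. A `"timeout"` status then stands in for the error message that Tesseract itself does not give.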
