tesseract ocr big size pic dump #3885

Open
@wuyang-dl

Description

Hi,
The API function void TessBaseAPI::SetImage(Pix *pix) crashes (core dump) when it handles a very large image and the system does not have enough memory.

void TessBaseAPI::SetImage(Pix *pix) {
  if (InternalSetImage()) {
    if (pixGetSpp(pix) == 4 && pixGetInputFormat(pix) == IFF_PNG) {
      // remove alpha channel from png
      Pix *p1 = pixRemoveAlpha(pix);
      pixSetSpp(p1, 3);
      (void)pixCopy(pix, p1);  // <---- bug: return value is ignored
      pixDestroy(&p1);
    }
    thresholder_->SetImage(pix);
    SetInputImage(thresholder_->GetPixRect());
  }
}

In Leptonica, pixCopy(pix, p1) returns pixd, or NULL on error,
so the return value of pixCopy needs to be checked.
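
To illustrate the pattern being asked for, here is a minimal, self-contained sketch. It does not use Leptonica: Image, CopyInto and ReplaceWithRgb are hypothetical stand-ins invented for this example, and the max_pixels parameter only simulates running out of memory. The point is just that a copy routine which reports failure through a NULL return should be checked before the caller keeps using the destination image.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical stand-in for an image type; this is NOT Leptonica's Pix.
struct Image {
  int width = 0;
  int height = 0;
  int spp = 0;  // samples per pixel
  std::vector<std::uint32_t> data;
};

// Hypothetical copy routine with pixCopy-like return semantics:
// returns dst on success, nullptr on error. max_pixels is an assumption
// of this sketch and simulates a memory limit. dst is left untouched on error.
Image *CopyInto(Image *dst, const Image *src, std::size_t max_pixels) {
  if (dst == nullptr || src == nullptr) {
    return nullptr;
  }
  if (src->data.size() > max_pixels) {
    return nullptr;  // simulated allocation failure
  }
  std::vector<std::uint32_t> copy;
  try {
    copy = src->data;  // may throw std::bad_alloc for a real huge image
  } catch (const std::bad_alloc &) {
    return nullptr;
  }
  dst->data.swap(copy);
  dst->width = src->width;
  dst->height = src->height;
  dst->spp = src->spp;
  return dst;
}

// Mirrors the shape of the alpha-removal branch: report failure to the
// caller instead of silently continuing when the copy did not happen.
bool ReplaceWithRgb(Image *image, const Image &rgb, std::size_t max_pixels) {
  if (CopyInto(image, &rgb, max_pixels) == nullptr) {
    return false;  // caller must not assume the image was converted
  }
  return true;
}
```

A caller would test the bool result and stop before handing the image to later stages, which is the same decision the pixCopy check would make inside SetImage.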

Environment

  • Tesseract Version: 5.2.0
  • Commit Number:
  • Platform: Windows 10 32-bit (I think other platforms have the same problem)

Current Behavior:

Tesseract crashes (core dump).

Expected Behavior:

Tesseract OCR works without crashing.

Suggested Fix:

Possible fix:
void TessBaseAPI::SetImage(Pix *pix) {
  if (InternalSetImage()) {
    if (pixGetSpp(pix) == 4 && pixGetInputFormat(pix) == IFF_PNG) {
      // remove alpha channel from png
      Pix *p1 = pixRemoveAlpha(pix);
      pixSetSpp(p1, 3);

      // fix-begin
      if (pixCopy(pix, p1) == NULL) {
        pixDestroy(&p1);
        recognition_done_ = false;  // maybe
        return;
      }
      // fix-end

      pixDestroy(&p1);
    }
    thresholder_->SetImage(pix);
    SetInputImage(thresholder_->GetPixRect());
  }
}

Thanks
