Closed
Description of the bug
Converting the pixmaps of some PDFs fails. The cause is how the colorspace is determined in pil_save:
    cspace = self.colorspace
    if cspace is None:
        mode = "L"
    elif cspace.n == 1:
        mode = "L" if self.alpha == 0 else "LA"
    elif cspace.n == 3:
        mode = "RGB" if self.alpha == 0 else "RGBA"
    else:
        mode = "CMYK"
The extracted pixmaps have the following colorspace definition:
Changing the first condition to
    if cspace is None or cspace.name == "None":
seems to fix the problem.
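For illustration, here is a minimal sketch of the mode selection with that change applied, written as a standalone function so it runs without PyMuPDF. The function name pil_mode and the SimpleNamespace stand-ins are hypothetical, and n=0 for the colorspace named "None" is an assumption, since the report does not show its component count.

```python
from types import SimpleNamespace


def pil_mode(cspace, alpha):
    """Pick a PIL mode string from a colorspace-like object and an alpha flag."""
    # Treat a missing colorspace and one whose name is the string "None" alike.
    if cspace is None or getattr(cspace, "name", None) == "None":
        return "L"
    if cspace.n == 1:
        return "L" if alpha == 0 else "LA"
    if cspace.n == 3:
        return "RGB" if alpha == 0 else "RGBA"
    # Everything else is treated as CMYK, as in the snippet above.
    return "CMYK"


# Hypothetical stand-ins for colorspace objects (not real PyMuPDF classes).
gray = SimpleNamespace(n=1, name="DeviceGray")
rgb = SimpleNamespace(n=3, name="DeviceRGB")
odd = SimpleNamespace(n=0, name="None")  # assumed shape of the problematic case

for cs, alpha in [(None, 0), (odd, 0), (gray, 1), (rgb, 0)]:
    print(pil_mode(cs, alpha))
```

Without the extra name check, the assumed `odd` object would fall through to the final branch and be treated as CMYK, which is one plausible way the mode could end up not matching the pixel data.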
How to reproduce the bug
Unfortunately, I'm not able to provide a sample PDF.
The pixmaps were extracted using:
    import fitz

    img_list = page.get_images()
    ocr_results = []
    for img in img_list:
        xref = img[0]  # first item of each entry is the image xref
        pix = fitz.Pixmap(doc, xref)
        image = self._convert_to_image(pix)
PyMuPDF version
1.25.1
Operating system
Linux
Python version
3.12