Description
Describe the bug (mandatory)
Starting with version 1.22.0, I'm seeing the following exception when calling page.get_drawings() on one of our PDF files.
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x90 in position 0: invalid start byte

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<...>/pdf_test.py", line 67, in <module>
    main()
  File "<...>/pdf_test.py", line 60, in main
    page.get_cdrawings()
  File "<...>/lib/python3.9/site-packages/fitz/fitz.py", line 6612, in get_cdrawings
    val = _fitz.Page_get_cdrawings(self, extended, callback, method)
SystemError: <built-in function Page_get_cdrawings> returned a result with an error set
But I do not get any error with previous versions such as 1.21.1.
To Reproduce (mandatory)
I'm a bit stuck here as unfortunately I cannot share the PDF in question because it's sensitive, and I've been struggling to create a new PDF that would mimic the issue.
Is there any chance you could provide some guidance on how to isolate the drawing issue?
So far I have tried copying the failing drawing content stream to a new PDF using version 1.21.1, so that I could potentially post it here, but the newly created PDF shows no issue with 1.22.0+.
Here is my script for copying the stream:
import fitz

doc = fitz.open(fp)  # fp: path to the failing PDF
page = doc[0]
xref_content = page.get_contents()
# >> in this case = [4]
stream = doc.xref_stream(xref_content[0])
# >> returning bytes: b' BT /F2 11.000 Tf ET\n1.000 g\n0.000 G\n/GS1 gs\n0.567 w\n<...>'
# the problem is with b'\xac' which can't be decoded with utf-8
page.get_cdrawings()
print(stream)
new_doc = fitz.open()
new_page = new_doc.new_page(width=page.rect.width, height=page.rect.height)
# create a dummy drawing to overwrite with the failing one
shape = new_page.new_shape()
shape.draw_line((10, 10), (15, 15))
shape.finish()
shape.commit()
# overwrite the dummy drawing with the failing one
new_xref = new_page.get_contents()[0]
new_doc.update_stream(new_xref, stream, compress=True)
new_doc.save("new_doc.pdf")
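In case it helps narrow things down without sharing the file, here is a minimal sketch in plain Python that lists the bytes in a stream that are not valid UTF-8, together with a few surrounding bytes. `find_non_utf8_bytes` is a hypothetical helper name, not part of PyMuPDF, and it rests on the unconfirmed assumption that the undecodable bytes in the content stream are what trigger the error.

```python
def find_non_utf8_bytes(data: bytes, context: int = 8):
    """Return (offset, byte_value, surrounding_bytes) for each byte
    at which strict UTF-8 decoding of `data` fails."""
    hits = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the remainder decodes cleanly
        except UnicodeDecodeError as exc:
            bad = pos + exc.start  # absolute offset of the failing byte
            window = data[max(0, bad - context): bad + context + 1]
            hits.append((bad, data[bad], window))
            pos = bad + 1  # resume right after the failing byte
    return hits


# Made-up sample bytes for illustration only, not taken from the real PDF.
sample = b"0.567 w\n\x90 q\n\xac Q\n"
for offset, value, window in find_non_utf8_bytes(sample):
    print(f"offset {offset}: 0x{value:02x} in {window!r}")
```

The same function could be run on the bytes returned by doc.xref_stream(...) in the script above, so that only the reported offsets and their short surrounding windows need to be posted rather than the whole stream.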
Expected behavior (optional)
Since getting the drawings works in versions prior to 1.22.0, I would expect it to work in newer versions as well.
Screenshots (optional)
Not sure if that can help, but here is a cropped screenshot of the drawing stream bytes:

Your configuration (mandatory)
- Operating system, potentially version and bitness
- Python version, bitness
- PyMuPDF version, installation method (wheel or generated from source).
For example, the output of print(sys.version, "\n", sys.platform, "\n", fitz.__doc__) would be sufficient (for the first two bullets).
3.9.13 (main, Sep 8 2022, 09:21:48)
[GCC 9.4.0]
linux
PyMuPDF 1.22.0: Python bindings for the MuPDF 1.22.0 library.
Version date: 2023-04-14 00:00:01.
Built for Python 3.9 on linux (64-bit).
Installed via pip install pymupdf==1.22.0