pymupdf.mupdf.FzErrorFormat: code=7: cannot find object in xref error encountered after version 1.25.3 #4423

Closed
@bhaskarbharat

Description of the bug

Hi Team. I have written a script that, based on some criteria, removes text, overlapping images, and vector graphics from a PDF.
Two days ago, we upgraded PyMuPDF on our server from 1.25.3 to 1.25.4. Today, the following exception was raised while processing a PDF file:

[ERROR] 2025-04-01 14:58:43 Example.pdf - sample Traceback (most recent call last):
  File "/Users/user/Documents/sample.py", line 45, in sample_func
    doc.ez_save(dst_pdf)
  File "/Users/user/miniconda3/envs/test4_env/lib/python3.10/site-packages/pymupdf/__init__.py", line 4223, in ez_save
    return self.save(
  File "/Users/user/miniconda3/envs/test4_env/lib/python3.10/site-packages/pymupdf/__init__.py", line 5584, in save
    mupdf.pdf_write_document(pdf, out, opts)
  File "/Users/user/miniconda3/envs/test4_env/lib/python3.10/site-packages/pymupdf/mupdf.py", line 53942, in pdf_write_document
    return _mupdf.pdf_write_document(doc, out, opts)
pymupdf.mupdf.FzErrorFormat: code=7: cannot find object in xref (21 0 R)

Today I saw that a new version of PyMuPDF, 1.25.5, has been released. I upgraded the server to that version to see whether the error goes away, but it persisted. I also tried different save parameters and values (one example below), with the same result.

# doc.ez_save(dst_pdf)
doc.save(dst_pdf, garbage=4, clean=True, deflate=True, use_objstms=1)
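
A minimal sketch of how such experiments could be automated: try several save option sets in order, each on a fresh buffer, and record which ones fail. The helper name `save_with_attempts` and the `SAVE_ATTEMPTS` list are hypothetical; the keyword names are the `Document.save()` parameters already used above. It only assumes that `doc` has a `save(target, **options)` method, and it does not address the underlying xref error.

```python
from io import BytesIO

# Hypothetical option sets to try in order; the keyword names are
# parameters of PyMuPDF's Document.save().
SAVE_ATTEMPTS = [
    {"garbage": 3, "deflate": True},
    {"garbage": 4, "clean": True, "deflate": True, "use_objstms": 1},
    {},  # plain save with default options
]


def save_with_attempts(doc, attempts=SAVE_ATTEMPTS):
    """Try each option set on a fresh buffer; return (pdf_bytes, errors).

    pdf_bytes is None if every attempt failed. errors lists the
    (options, exception repr) pairs of the attempts that raised.
    """
    errors = []
    for opts in attempts:
        buffer = BytesIO()  # fresh buffer so a failed attempt leaves no partial output
        try:
            doc.save(buffer, **opts)
        except Exception as exc:  # e.g. pymupdf.mupdf.FzErrorFormat
            errors.append((opts, repr(exc)))
            continue
        return buffer.getvalue(), errors
    return None, errors
```

With a PyMuPDF document this would be called as `pdf_bytes, errors = save_with_attempts(doc)`; the `errors` list then shows which option sets triggered the exception.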

I then downgraded to 1.25.3. With that version, the script printed the following error message, but no exception was raised and the file was saved successfully:

MuPDF error: format error: cannot find object in xref (21 0 R)

How to reproduce the bug

Below is my example script for reference. I have modified it from the original but kept the overall logic the same. It contains all the PyMuPDF methods used in the original script.

import logging
from io import BytesIO
import fitz

logger = logging.getLogger(__name__)

TARGET_TEXT = "xyz"

def sample_func(src_pdf):
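    if isinstance(src_pdf, BytesIO):
        # src_pdf is an in-memory BytesIO object
        src_pdf.seek(0)
        doc = fitz.open(stream=src_pdf, filetype="pdf")
    elif isinstance(src_pdf, str):
        # src_pdf is a file path
        doc = fitz.open(src_pdf)
    else:
        # without this branch, doc would be undefined below
        raise TypeError(f"unsupported src_pdf type: {type(src_pdf).__name__}")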

    for page_num in range(len(doc)):
        # Load the page
        page = doc.load_page(page_num)
        logger.info(f"page_num: {page_num + 1}")

        text_blocks = page.get_text("dict")["blocks"]
        for block in text_blocks:
            if block["type"] == 0:  # text block
                for line in block["lines"]:
                    for span in line["spans"]:
                        text_rect = fitz.Rect(span['bbox'])
                        logger.debug(f"span: {span}")

                        # Extract text within the specified rectangle
                        text = page.get_text("text", clip=text_rect).strip()

                        if text == TARGET_TEXT:
                            # Create redaction annotation
                            redact_annot = page.add_redact_annot(text_rect)

                            # images=2 blanks out overlapping pixels
                            # graphics=2 removes any overlapping vector graphics
                            # text=0 removes all characters whose boundary box overlaps any redaction rectangle
                            page.apply_redactions(images=2, graphics=2, text=0)


    # Save the modified document
    dst_pdf = BytesIO()
    doc.ez_save(dst_pdf)
    
    doc.close()

    dst_pdf.seek(0)
    return dst_pdf.read()
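
For comparison, a sketch of a variant that first marks all matching spans and then applies the redactions once per page, instead of calling `apply_redactions` inside the span loop. Whether this avoids the exception is untested and only an assumption. The helper name `redact_matching_spans` is hypothetical, it compares the `text` field of each span rather than re-extracting text with `clip`, and it passes the span's `bbox` tuple directly to `add_redact_annot`.

```python
def redact_matching_spans(page, target_text):
    """Mark every span whose text equals target_text, then redact once.

    page is expected to behave like a PyMuPDF Page. Returns the number
    of spans that were marked for redaction.
    """
    marked = 0
    for block in page.get_text("dict")["blocks"]:
        if block["type"] != 0:  # skip non-text (image) blocks
            continue
        for line in block["lines"]:
            for span in line["spans"]:
                if span["text"].strip() == target_text:
                    # bbox is an (x0, y0, x1, y1) tuple
                    page.add_redact_annot(span["bbox"])
                    marked += 1
    if marked:
        # same options as in the script above, applied once per page
        page.apply_redactions(images=2, graphics=2, text=0)
    return marked
```

In the script above, this would replace the body of the page loop: `redact_matching_spans(page, TARGET_TEXT)`.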

Running the above script raises the same exception as in the traceback shown in the description above: pymupdf.mupdf.FzErrorFormat: code=7: cannot find object in xref (21 0 R)

PyMuPDF version

1.25.5

Operating system

Linux

Python version

3.10
