page.get_image_info() returns outdated cached results after replacing image #4303

Closed
@stijnvermeeren-swisstopo

Description

Description of the bug

When I call page.get_image_info(xrefs=True), apply page.replace_image() to one of the images on the page, and then call page.get_image_info(xrefs=True) again on the same page object, the details of the old (replaced) image are still returned, and the value of xref is 0.

The reason seems to be that the result of the page.get_image_info() call is cached on the Page object (page._image_info = imginfo), and this cache is not automatically cleared when calling page.replace_image().

I would expect the second call to page.get_image_info(xrefs=True) to return the details of the new image (including the correct xref), or at least for this caching behavior to be clearly described in the documentation.
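
To make the suspected mechanism concrete, here is a minimal toy model of the caching pattern described above. It is only a sketch: ToyPage, its fields, and the invalidate flag are hypothetical names invented for illustration and are not PyMuPDF's actual implementation. It also does not model the xref lookup, so it reproduces the stale width/height but not the xref of 0.

```python
class ToyPage:
    """Hypothetical stand-in for a page object; not PyMuPDF's real Page class."""

    def __init__(self, images):
        self._images = dict(images)  # xref -> (width, height)
        self._image_info = None      # cached result of get_image_info()

    def get_image_info(self):
        # Build the list once, then keep returning the cached copy.
        if self._image_info is None:
            self._image_info = [
                {"xref": xref, "width": w, "height": h}
                for xref, (w, h) in self._images.items()
            ]
        return self._image_info

    def replace_image(self, xref, size, invalidate=False):
        # Swap the image data; the cache is only cleared when asked to.
        self._images[xref] = size
        if invalidate:
            self._image_info = None


page = ToyPage({5: (1, 1)})
page.get_image_info()                      # fills the cache with the 1x1 image

page.replace_image(5, (2, 2))              # cache left untouched
stale = page.get_image_info()[0]["width"]  # still the old width

page.replace_image(5, (2, 2), invalidate=True)
fresh = page.get_image_info()[0]["width"]  # cache rebuilt, new width
```

In this model, clearing the cached list inside replace_image() is enough to give the behavior I would expect; whether the same change is appropriate in PyMuPDF itself is for the maintainers to judge.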

How to reproduce the bug

import pymupdf

doc = pymupdf.Document()
page = doc.new_page()
image1x1 = pymupdf.Pixmap(pymupdf.Colorspace(pymupdf.CS_RGB), pymupdf.IRect(0, 0, 1, 1))
image2x2 = pymupdf.Pixmap(pymupdf.Colorspace(pymupdf.CS_RGB), pymupdf.IRect(0, 0, 2, 2))
xref = page.insert_image(page.rect, pixmap=image1x1)

print("Before replace_image")
for image_info in page.get_image_info(xrefs=True):
    print(image_info)
    # {..., 'width': 1, 'height': 1, ..., 'size': 59, 'digest': b'&9-[t\ro\x84y\xba\x80\xfeA?\xb3\xe9', ..., 'xref': 5}

page.replace_image(xref, pixmap=image2x2)

print("After replace_image, same page object")
for image_info in page.get_image_info(xrefs=True):
    print(image_info)
    # {..., 'width': 1, 'height': 1, ..., 'size': 59, 'digest': b'&9-[t\ro\x84y\xba\x80\xfeA?\xb3\xe9', ..., 'xref': 0}

print("After replace_image, new page object")
for image_info in doc[page.number].get_image_info(xrefs=True):
    print(image_info)
    # {..., 'width': 2, 'height': 2, ..., 'size': 68, 'digest': b"'\x87\x15P,\xd9\x01ze\x08\x01\xa4\x1e\xda\xd9\xce", ..., 'xref': 7}

PyMuPDF version

1.25.3

Operating system

Linux

Python version

3.10
