Invalid label retrieval when /Kids is an array of multiple /Nums #4341

Closed
@naquad

Description of the bug

Page.get_label fails with the following error:

Traceback (most recent call last):
  File "/home/naquad/bin/./pdfannots.py", line 445, in <module>
    main()
    ~~~~^^
  File "/home/naquad/bin/./pdfannots.py", line 442, in main
    extract_annotations_to_html(TEMPLATES[args.template], args.pdf_path, output_path, args.pages)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/naquad/bin/./pdfannots.py", line 316, in extract_annotations_to_html
    "title": page.get_label(),
             ~~~~~~~~~~~~~~^^
  File "/home/naquad/.local/lib/python3.13/site-packages/pymupdf/utils.py", line 5297, in get_label
    labels = page.parent._get_page_labels()
  File "/home/naquad/.local/lib/python3.13/site-packages/pymupdf/__init__.py", line 3420, in _get_page_labels
    mupdf.pdf_dict_get( mupdf.pdf_array_get(kids, i)),
    ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: pdf_dict_get() missing 1 required positional argument: 'key'

I've traced the issue to this invocation: https://github.com/pymupdf/PyMuPDF/blob/main/src/__init__.py#L3419

            nums = mupdf.pdf_resolve_indirect(
                    mupdf.pdf_dict_get( mupdf.pdf_array_get(kids, i)),
                    PDF_NAME('Nums'),
                    )

Note where PDF_NAME('Nums') is passed: it should be the second argument to pdf_dict_get, but it is passed to pdf_resolve_indirect instead. As a result, pdf_dict_get is called with only one argument, which raises the TypeError above.
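To illustrate the argument placement without a PDF at hand, here is a minimal pure-Python sketch. The names `dict_get`, `resolve_indirect` and `collect_nums` are hypothetical stand-ins, not PyMuPDF's API; PDF dictionaries are modelled as plain Python dicts and indirect references are not modelled at all. It only shows that the key belongs in the inner call, and that the misplaced form fails with the same kind of TypeError.

```python
# Hypothetical stand-ins for the MuPDF calls named in the traceback.
def dict_get(obj, key):
    """Stand-in for pdf_dict_get(obj, key): both arguments are required."""
    return obj.get(key)


def resolve_indirect(obj):
    """Stand-in for pdf_resolve_indirect(obj): takes the object only."""
    return obj


def collect_nums(page_labels):
    """Gather /Nums entries from the root node or from each of its /Kids."""
    nums = dict_get(page_labels, "Nums")
    if nums is not None:
        return list(nums)
    collected = []
    for kid in dict_get(page_labels, "Kids") or []:
        # The key goes to dict_get; only its result is resolved.
        kid_nums = resolve_indirect(dict_get(kid, "Nums"))
        collected.extend(kid_nums or [])
    return collected


# A made-up label tree with two child nodes, each carrying its own /Nums.
labels = {"Kids": [{"Nums": [0, {"S": "r"}]}, {"Nums": [4, {"S": "D"}]}]}
result = collect_nums(labels)

# The misplaced form from the report: the key is handed to the outer call,
# so the inner call is missing its 'key' argument.
try:
    resolve_indirect(dict_get(labels["Kids"][0]), "Nums")
except TypeError as exc:
    buggy_error = str(exc)
```

In this model, `result` holds the entries of both child nodes in order, and `buggy_error` carries the "missing 1 required positional argument" message.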

How to reproduce the bug

Open a PDF whose page-label tree has a /Kids array containing multiple /Nums nodes, then call Page.get_label() on a page.

P.S. Sorry, I can't share the PDF :(

PyMuPDF version

1.25.3

Operating system

Linux

Python version

3.13
