Description of the bug
When the attached file is processed, calling page.get_links() leads to an IndexError for page 14.
How to reproduce the bug
This was traced down to the following lines in src/__init__.py:
for i, v in enumerate(array.replace("null", "0").split()[1:]):
    t[i] = float(v)
For page 14, the array variable contains:
/XYZ 116.00001 745.92 0 34 0 R/XYZ 116.00001 745.92 0 40 0 R/XYZ 116.00001 745.92 0 47 0 R/XYZ 116.00001 745.92 0 56 0 R/XYZ 116.00001 745.92 0 64 0 R/XYZ 116.00001 745.92 0
leading to the following array being enumerated in the loop:
['116.00001', '745.92', '0', '34', '0', 'R/XYZ', '116.00001', '745.92', '0', '40', '0', 'R/XYZ', '116.00001', '745.92', '0', '47', '0', 'R/XYZ', '116.00001', '745.92', '0', '56', '0', 'R/XYZ', '116.00001', '745.92', '0', '64', '0', 'R/XYZ', '116.00001', '745.92', '0']
which leads to the IndexError.
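To make the failure easier to see outside of PyMuPDF, here is a minimal standalone sketch. It is not the library's code or a proposed patch: the buffer length `size=3` in `naive_fill` is an assumption (the real length of `t` is not shown above), and `parse_first_xyz` is a hypothetical helper showing one possible guard, namely reading only the operands of the first /XYZ destination. The input string is the reported value shortened to its first two destinations.

```python
# Value of `array` from the report, shortened to its first two destinations.
array = "/XYZ 116.00001 745.92 0 34 0 R/XYZ 116.00001 745.92 0"


def naive_fill(array: str, size: int = 3) -> list[float]:
    """Mimic the reported loop. `size` is an assumption, not taken from PyMuPDF."""
    t = [0.0] * size
    for i, v in enumerate(array.replace("null", "0").split()[1:]):
        t[i] = float(v)  # fails as soon as there are more tokens than slots
    return t


def parse_first_xyz(array: str) -> list[float]:
    """Hypothetical guard: read only the first /XYZ destination's operands."""
    tokens = array.replace("null", "0").split()
    if not tokens or tokens[0] != "/XYZ":
        raise ValueError("expected a string starting with /XYZ")
    values = [float(tok) for tok in tokens[1:4]]
    # Pad with zeros if fewer than three operands are present.
    return values + [0.0] * (3 - len(values))


try:
    naive_fill(array)
except IndexError as exc:
    print("naive loop fails:", exc)

print(parse_first_xyz(array))
```

Whether the first destination is the right one to use for each link on page 14 is a separate question; the sketch only shows that the concatenated string cannot be consumed by a loop that expects a single destination.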
PyMuPDF version
Built from source
Operating system
Linux
Python version
3.12