Description
An IndexError occurs in the PDF writing process when calling PdfWriter.append().
Environment
Which environment were you using when you encountered the problem?
> python -m platform
Windows-10-10.0.22631-SP0
> python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==4.3.1, crypt_provider=('cryptography', '41.0.5'), PIL=10.4.0
Code + PDF
This is a minimal, complete example that shows the issue:
from pypdf import PdfWriter, PdfReader


def test_write_pdf():
    filepath = r"C:\test.pdf"
    with open(filepath, "rb") as f:
        pdf_writer = PdfWriter()
        # The second positional argument of PdfReader is strict
        pdf_reader = PdfReader(f, strict=True)
        print(pdf_reader.metadata)
        print(pdf_reader.named_destinations)
        pdf_writer.append(pdf_reader)
Sorry, we are unable to provide the PDF at this time.
We are checking whether we can create a PDF that reproduces the issue and can be published without any problems.
Traceback
This is the complete traceback I see:
venv\venv\Lib\site-packages\pypdf\_writer.py:2365: in append
    self.merge(
venv\venv\Lib\site-packages\pypdf\_writer.py:2474: in merge
    p = reader.pages[dest["/Page"]]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <pypdf._page._VirtualList object at 0x0000025E7C033650>, index = 1

    def __getitem__(
        self, index: Union[int, slice]
    ) -> Union[PageObject, Sequence[PageObject]]:
        if isinstance(index, slice):
            indices = range(*index.indices(len(self)))
            cls = type(self)
            return cls(indices.__len__, lambda idx: self[indices[idx]])
        if not isinstance(index, int):
            raise TypeError("sequence indices must be integers")
        len_self = len(self)
        if index < 0:
            # support negative indexes
            index = len_self + index
        if index < 0 or index >= len_self:
>           raise IndexError("sequence index out of range")
E           IndexError: sequence index out of range
Perhaps the following is causing the problem: the catalog (30 0 obj, shown below) references 46 0 obj and 20 0 obj, but these objects do not exist in the file.
I tried to fix them in Adobe Acrobat, but could not figure out how to turn them off.
30 0 obj
<</AcroForm 46 0 R/Dests 20 0 R/Extensions<</ADBE<</BaseVersion/1.7/ExtensionLevel 8>>>>/Metadata 5 0 R/Names 47 0 R/OCProperties<</D<</OFF[]/Order[]/RBGroups[]>>/OCGs[48 0 R 49 0 R 50 0 R]>>/Pages 18 0 R/StructTreeRoot 14 0 R/Type/Catalog>>
endobj
When the file is read with PdfReader, named_destinations contains the following (the PDF has only one page):
{'/__WKANCHOR_2': {'/Title': '/__WKANCHOR_2', '/Page': 0, '/Type': '/XYZ', '/Left': 36, '/Top': 754, '/Zoom': 0.0}, '/__WKANCHOR_4': {'/Title': '/__WKANCHOR_4', '/Page': 0, '/Type': '/XYZ', '/Left': 305, '/Top': 754, '/Zoom': 0.0}, '/__WKANCHOR_6': {'/Title': '/__WKANCHOR_6', '/Page': 0, '/Type': '/XYZ', '/Left': 36, '/Top': 454, '/Zoom': 0.0}, '/__WKANCHOR_8': {'/Title': '/__WKANCHOR_8', '/Page': 1, '/Type': '/XYZ', '/Left': 61, '/Top': 802, '/Zoom': 0.0}, '/__WKANCHOR_a': {'/Title': '/__WKANCHOR_a', '/Page': 1, '/Type': '/XYZ', '/Left': 36, '/Top': 425, '/Zoom': 0.0}, '/__WKANCHOR_c': {'/Title': '/__WKANCHOR_c', '/Page': 2, '/Type': '/XYZ', '/Left': 36, '/Top': 814, '/Zoom': 0.0}, '/__WKANCHOR_e': {'/Title': '/__WKANCHOR_e', '/Page': 2, '/Type': '/XYZ', '/Left': 36, '/Top': 703, '/Zoom': 0.0}}
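The mismatch can be checked without running the merge. The following is a minimal sketch, assuming the destinations are available as a plain mapping like the one printed above; out_of_range_destinations is a hypothetical helper, not a pypdf function, and the dictionary is an abbreviated copy that keeps only the "/Page" values.

```python
# Hypothetical helper, not part of pypdf: list the named destinations whose
# integer "/Page" value does not address an existing page.
def out_of_range_destinations(named_destinations, page_count):
    bad = []
    for name, dest in named_destinations.items():
        page = dest.get("/Page")
        # Only integer page numbers are checked; any other kind of value
        # (for example an indirect reference) is left alone.
        if isinstance(page, int) and not 0 <= page < page_count:
            bad.append(name)
    return bad


# Abbreviated copy of the dictionary printed above (only "/Page" is kept).
dests = {
    "/__WKANCHOR_2": {"/Page": 0},
    "/__WKANCHOR_4": {"/Page": 0},
    "/__WKANCHOR_6": {"/Page": 0},
    "/__WKANCHOR_8": {"/Page": 1},
    "/__WKANCHOR_a": {"/Page": 1},
    "/__WKANCHOR_c": {"/Page": 2},
    "/__WKANCHOR_e": {"/Page": 2},
}

# The PDF has a single page, so only page index 0 is valid.
print(out_of_range_destinations(dests, page_count=1))
```

With one page, the four destinations that use page 1 or page 2 are reported, and the first of them (index = 1) is the lookup that fails in the traceback.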
I was able to avoid the error by adding the check if len(reader.pages) > dest["/Page"]: on the PdfWriter side.
Lines 2471 to 2482 in 8f62120
elif isinstance(dest["/Page"], int):
    # the page reference is a page number normally not a PDF Reference
    # page numbers as int are normally accepted only in external goto
    p = reader.pages[dest["/Page"]]
    assert p.indirect_reference is not None
    try:
        arr[NumberObject(0)] = NumberObject(
            srcpages[p.indirect_reference.idnum].page_number
        )
        self.add_named_destination_array(dest["/Title"], arr)
    except KeyError:
        pass
elif isinstance(dest["/Page"], int):
    # the page reference is a page number normally not a PDF Reference
    # page numbers as int are normally accepted only in external goto
    if len(reader.pages) > dest["/Page"]:
        p = reader.pages[dest["/Page"]]
        assert p.indirect_reference is not None
        try:
            arr[NumberObject(0)] = NumberObject(
                srcpages[p.indirect_reference.idnum].page_number
            )
            self.add_named_destination_array(dest["/Title"], arr)
        except KeyError:
            pass
A Dests reference is created in the resulting PDF as follows.
15 0 obj
<<
/Dests 16 0 R
>>
endobj
16 0 obj
<<
/Names [ (\057\137\137WKANCHOR\1372) [ 0 /XYZ 36 754 0.0 ] (\057\137\137WKANCHOR\1374) [ 0 /XYZ 305 754 0.0 ] (\057\137\137WKANCHOR\1376) [ 0 /XYZ 36 454 0.0 ] ]
>>
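The names in the /Names array above are PDF literal strings in which some characters are written as octal escapes (a backslash followed by up to three octal digits). As a rough sketch for reading them, the snippet below decodes only that escape form; decode_octal_escapes is a hypothetical helper written for this illustration and does not handle the other PDF string escapes such as \n or \(.

```python
import re


# Hypothetical helper: decode only the octal escapes of a PDF literal string.
def decode_octal_escapes(literal):
    # Replace each backslash followed by one to three octal digits
    # with the character that has that code.
    return re.sub(
        r"\\([0-7]{1,3})",
        lambda m: chr(int(m.group(1), 8)),
        literal,
    )


# The three names written to the output file, copied from the listing above.
names = [
    r"\057\137\137WKANCHOR\1372",
    r"\057\137\137WKANCHOR\1374",
    r"\057\137\137WKANCHOR\1376",
]
for name in names:
    print(decode_octal_escapes(name))
```

The decoded names are the three destinations that point to page 0, which suggests that with the added check only the in-range destinations are copied to the output.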