Labels: PdfWriter (the PdfWriter component is affected)
Description
Environment
$ python -m platform
Linux-6.8.0-100039-tuxedo-x86_64-with-glibc2.35
$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==4.3.1, crypt_provider=('local_crypt_fallback', '0.0.0'), PIL=10.3.0
This is effectively the latest code on main.
Code + PDF
This is a minimal, complete example that shows the issue:
>>> from pypdf import PdfWriter
>>> writer = PdfWriter(clone_from='ISO_32000-2-2020_sponsored.pdf')
Using PdfReader and iterating over the pages to extract the text does not fail.
I cannot share the document (1003 pages) here, as it is the non-public copy of the PDF 2.0 specification, which is available for free at https://pdfa.org/sponsored-standards/
Traceback
This is the complete traceback I see:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/_writer.py", line 233, in __init__
self.clone_document_from_reader(clone_from)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/_writer.py", line 1150, in clone_document_from_reader
self.clone_reader_document_root(reader)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/_writer.py", line 1119, in clone_reader_document_root
self._root_object = reader.root_object.clone(self)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_data_structures.py", line 258, in clone
d__._clone(self, pdf_dest, force_duplicate, ignore_fields, visited)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_data_structures.py", line 369, in _clone
v.clone(pdf_dest, force_duplicate, ignore_fields)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_base.py", line 274, in clone
obj.clone(pdf_dest, force_duplicate, ignore_fields)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_data_structures.py", line 258, in clone
d__._clone(self, pdf_dest, force_duplicate, ignore_fields, visited)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_data_structures.py", line 369, in _clone
v.clone(pdf_dest, force_duplicate, ignore_fields)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_base.py", line 274, in clone
obj.clone(pdf_dest, force_duplicate, ignore_fields)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_data_structures.py", line 258, in clone
d__._clone(self, pdf_dest, force_duplicate, ignore_fields, visited)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_data_structures.py", line 369, in _clone
v.clone(pdf_dest, force_duplicate, ignore_fields)
[...]
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_data_structures.py", line 129, in clone
arr.append(data.clone(pdf_dest, force_duplicate, ignore_fields))
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_base.py", line 274, in clone
obj.clone(pdf_dest, force_duplicate, ignore_fields)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_data_structures.py", line 258, in clone
d__._clone(self, pdf_dest, force_duplicate, ignore_fields, visited)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_data_structures.py", line 369, in _clone
v.clone(pdf_dest, force_duplicate, ignore_fields)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_data_structures.py", line 129, in clone
arr.append(data.clone(pdf_dest, force_duplicate, ignore_fields))
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_base.py", line 274, in clone
obj.clone(pdf_dest, force_duplicate, ignore_fields)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_data_structures.py", line 258, in clone
d__._clone(self, pdf_dest, force_duplicate, ignore_fields, visited)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_data_structures.py", line 369, in _clone
v.clone(pdf_dest, force_duplicate, ignore_fields)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_data_structures.py", line 129, in clone
arr.append(data.clone(pdf_dest, force_duplicate, ignore_fields))
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_base.py", line 266, in clone
obj = self.get_object()
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_base.py", line 286, in get_object
return self.pdf.get_object(self)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/_reader.py", line 381, in get_object
retval = self._get_object_from_stream(indirect_reference) # type: ignore
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/_reader.py", line 315, in _get_object_from_stream
obj_stm: EncodedStreamObject = IndirectObject(stmnum, 0, self).get_object() # type: ignore
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_base.py", line 286, in get_object
return self.pdf.get_object(self)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/_reader.py", line 442, in get_object
retval = read_object(self.stream, self) # type: ignore
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_data_structures.py", line 1305, in read_object
return DictionaryObject.read_from_stream(stream, pdf, forced_encoding)
File "/home/stefan/tmp/pypdf/pypdf_upstream/pypdf/generic/_data_structures.py", line 562, in read_from_stream
if isinstance(length, IndirectObject):
File "/usr/lib/python3.10/typing.py", line 1503, in __instancecheck__
issubclass(instance.__class__, cls)):
RecursionError: maximum recursion depth exceeded
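To illustrate the failure mode in isolation, here is a minimal standard-library sketch. It does not use pypdf, and `build_nested`, `clone_recursive`, `recursion_limit` and `measure_depth` are hypothetical helpers, not pypdf APIs. It assumes the object tree is merely very deep rather than cyclic, which the traceback alone does not confirm. A naive recursive copy of deeply nested dictionaries and arrays hits the interpreter's recursion limit, and temporarily raising the limit with `sys.setrecursionlimit` lets the same copy finish.

```python
import sys
from contextlib import contextmanager


def build_nested(depth):
    """Build dicts and lists nested `depth` levels deep (a stand-in for a deep object tree)."""
    root = {}
    node = root
    for _ in range(depth):
        child = {}
        node["Kids"] = [child]
        node = child
    return root


def clone_recursive(obj):
    """Naive recursive copy: each nesting level costs at least two Python frames."""
    if isinstance(obj, dict):
        return {key: clone_recursive(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [clone_recursive(item) for item in obj]
    return obj


@contextmanager
def recursion_limit(limit):
    """Temporarily raise the interpreter's recursion limit, restoring it afterwards."""
    previous = sys.getrecursionlimit()
    sys.setrecursionlimit(max(previous, limit))
    try:
        yield
    finally:
        sys.setrecursionlimit(previous)


def measure_depth(obj):
    """Count nesting levels iteratively, so the check itself cannot overflow."""
    depth = 0
    while isinstance(obj, dict) and "Kids" in obj:
        obj = obj["Kids"][0]
        depth += 1
    return depth


if __name__ == "__main__":
    # One nesting level per unit of the current limit guarantees an overflow.
    tree = build_nested(sys.getrecursionlimit())
    try:
        clone_recursive(tree)
    except RecursionError as exc:
        print("default limit:", exc)
    with recursion_limit(10 * sys.getrecursionlimit()):
        copy = clone_recursive(tree)
    print("cloned levels:", measure_depth(copy))
```

Whether wrapping `PdfWriter(clone_from=...)` in such a context manager helps with the document above is untested. If the structure contains a reference cycle that the clone logic does not detect, a higher limit would only delay the error.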