Unable to find 'endstream' marker for obj starting at 13367. #2326

@shafeeralip

Hello,

When executing this piece of code:

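from pypdf import PdfReader, PdfWriter
import traceback

# dwnld_filepath (input PDF) and file_path (output PDF) are defined elsewhere in the script.
try:
    input_pdf = PdfReader(dwnld_filepath)
    output_pdf = PdfWriter()
    image = input_pdf.pages[0]  # first page of the input document
    output_pdf.add_page(image)  # the PdfReadError is raised here
    output_pdf.write(file_path)
except Exception:
    traceback.print_exc()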

Here is the PDF file that causes the issue:

EveBest.pdf

Traceback

This is the complete traceback I see:

Traceback (most recent call last):
  File "/Users/shafeerali/Documents/Nanonets/avanto/API/test.py", line 58, in <module>
    output_pdf.add_page(image)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pypdf/_writer.py", line 418, in add_page
    return self._add_page(page, list.append, excluded_keys)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pypdf/_writer.py", line 331, in _add_page
    page = cast("PageObject", page_org.clone(self, False, excluded_keys))
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pypdf/generic/_data_structures.py", line 199, in clone
    d__._clone(self, pdf_dest, force_duplicate, ignore_fields, visited)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pypdf/generic/_data_structures.py", line 310, in _clone
    v.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pypdf/generic/_data_structures.py", line 199, in clone
    d__._clone(self, pdf_dest, force_duplicate, ignore_fields, visited)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pypdf/generic/_data_structures.py", line 310, in _clone
    v.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pypdf/generic/_data_structures.py", line 199, in clone
    d__._clone(self, pdf_dest, force_duplicate, ignore_fields, visited)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pypdf/generic/_data_structures.py", line 310, in _clone
    v.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pypdf/generic/_base.py", line 300, in clone
    obj.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pypdf/generic/_data_structures.py", line 116, in clone
    arr.append(data.clone(pdf_dest, force_duplicate, ignore_fields))
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pypdf/generic/_base.py", line 292, in clone
    obj = self.get_object()
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pypdf/generic/_base.py", line 312, in get_object
    obj = self.pdf.get_object(self)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pypdf/_reader.py", line 1401, in get_object
    retval = read_object(self.stream, self)  # type: ignore
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pypdf/generic/_data_structures.py", line 1280, in read_object
    return DictionaryObject.read_from_stream(stream, pdf, forced_encoding)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pypdf/generic/_data_structures.py", line 538, in read_from_stream
    data["__streamdata__"] = read_unsized_from_steam(stream, pdf)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pypdf/generic/_data_structures.py", line 432, in read_unsized_from_steam
    raise PdfReadError(
pypdf.errors.PdfReadError: Unable to find 'endstream' marker for obj starting at 13367.
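
The error message says that no 'endstream' keyword was found for the object starting at 13367. A minimal sketch of how one could check that by hand on the raw file bytes, using only the standard library, is shown below. The helper name find_endstream is hypothetical and is not part of pypdf, and the sketch assumes that the number in the message is a byte offset from the start of the file. It is a naive byte search, not a PDF parser, so it may be fooled by stream data that happens to contain these keywords.

```python
from typing import Optional


def find_endstream(data: bytes, obj_start: int) -> Optional[int]:
    """Return the offset of the first 'endstream' keyword after the stream
    opened by the object at obj_start, or None if either keyword is missing.

    Hypothetical diagnostic helper, not a pypdf API.
    """
    # Find the 'stream' keyword that opens the stream data of this object.
    stream_pos = data.find(b"stream", obj_start)
    if stream_pos == -1:
        return None
    # Search for the closing keyword only after the opening one.
    body_start = stream_pos + len(b"stream")
    end_pos = data.find(b"endstream", body_start)
    return None if end_pos == -1 else end_pos


# Tiny synthetic examples: one well-formed stream object, one truncated.
good = b"1 0 obj\n<< /Length 5 >>\nstream\nhello\nendstream\nendobj\n"
truncated = b"1 0 obj\n<< /Length 5 >>\nstream\nhello"

print(find_endstream(good, 0))
print(find_endstream(truncated, 0))
```

To try this on the attached file, one could read it with open("EveBest.pdf", "rb").read() and call find_endstream(data, 13367). A None result would suggest that the stream really is unterminated in the file rather than misread by the library.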

Labels

Has MCVE: A minimal, complete and verifiable example helps a lot to debug / understand feature requests
is-robustness-issue: From a user's perspective, this is about robustness
