KeyError: '/Root' due to invalid start of xref #1756

Closed
@owurman

Description

I was trying to get the pages of the attached PDF but received a KeyError: '/Root'. The file looks encrypted to me, but reader.is_encrypted is False.

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
macOS-10.16-x86_64-i386-64bit

$ python -c "import pypdf;print(pypdf.__version__)"
3.7.0

Code + PDF

import pypdf

reader = pypdf.PdfReader("641-Attachment-B-Pediatric-Cardiac-Arrest-8-1-2019.pdf")
assert not reader.is_encrypted  # passes: pypdf reports the file as unencrypted
len(reader.pages)  # raises KeyError: '/Root'

Share here the PDF file(s) that cause the issue. The smaller they are, the
better. Let us know if we may add them to our tests!
641-Attachment-B-Pediatric-Cardiac-Arrest-8-1-2019.pdf

It's a public document so it should be fine to add to your tests.

Traceback

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pypdf/_page.py", line 2155, in __len__
    return self.length_function()
           ^^^^^^^^^^^^^^^^^^^^^^
  File "pypdf/_reader.py", line 452, in _get_num_pages
    self._flatten()
  File "pypdf/_reader.py", line 1186, in _flatten
    catalog = self.trailer[TK.ROOT].get_object()
              ~~~~~~~~~~~~^^^^^^^^^
  File "pypdf/generic/_data_structures.py", line 291, in __getitem__
    return dict.__getitem__(self, key).get_object()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: '/Root'
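
The title attributes the error to an invalid start of xref. One way to check that independently of pypdf is to look at the byte offset that follows the last `startxref` keyword and see what sits there. The sketch below is only an illustration under assumptions: `inspect_startxref` is a hypothetical helper, not part of pypdf, and it does not reproduce pypdf's own parsing or recovery logic. It takes the raw file bytes and reports whether the offset lands on a classic `xref` table, on an object header (as used by cross-reference streams), or on neither.

```python
import re


def inspect_startxref(data: bytes) -> dict:
    """Report the offset given after the last `startxref` and what is found there.

    Hypothetical diagnostic helper; not part of pypdf.
    """
    # The last `startxref` keyword is followed by the byte offset
    # of the cross-reference section.
    matches = list(re.finditer(rb"startxref\s+(\d+)", data))
    if not matches:
        return {"offset": None, "kind": "missing"}

    offset = int(matches[-1].group(1))
    target = data[offset:offset + 32]

    if target.startswith(b"xref"):
        kind = "table"    # classic cross-reference table
    elif re.match(rb"\d+\s+\d+\s+obj", target):
        kind = "stream"   # object header, e.g. a cross-reference stream
    else:
        kind = "invalid"  # offset does not point at an xref section
    return {"offset": offset, "kind": kind}
```

To try it on the attached file, pass the file contents read in binary mode, for example `inspect_startxref(open(path, "rb").read())`. A result of "invalid" would be consistent with a trailer that was never located, and therefore with the missing '/Root' key in the traceback.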

Labels

is-robustness-issue: From a user's perspective, this is about robustness
key-error: Could be a bug, but also a robustness issue
