PdfWalkError(f"Cannot set {parent} as parent of {self}, parent is already {self.parent}") #18

Open
macdeport opened this issue Sep 30, 2024 · 2 comments

@macdeport

When analyzing this PDF in version 1.15.1 I encounter the following exception:

PdfWalkError-240930.pdf.zip

Traceback (most recent call last):
  File "/Users/alain/Library/Python/3.11/bin/pdfalyze", line 8, in <module>
    sys.exit(pdfalyze())
             ^^^^^^^^^^
  File "/Users/alain/Library/Python/3.11/lib/python/site-packages/pdfalyzer/__init__.py", line 43, in pdfalyze
    pdfalyzer = Pdfalyzer(args.file_to_scan_path)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alain/Library/Python/3.11/lib/python/site-packages/pdfalyzer/pdfalyzer.py", line 57, in __init__
    self.walk_node(self.pdf_tree)
  File "/Users/alain/Library/Python/3.11/lib/python/site-packages/pdfalyzer/pdfalyzer.py", line 82, in walk_node
    self.walk_node(next_node)
  File "/Users/alain/Library/Python/3.11/lib/python/site-packages/pdfalyzer/pdfalyzer.py", line 82, in walk_node
    self.walk_node(next_node)
  File "/Users/alain/Library/Python/3.11/lib/python/site-packages/pdfalyzer/pdfalyzer.py", line 82, in walk_node
    self.walk_node(next_node)
  [Previous line repeated 7 more times]
  File "/Users/alain/Library/Python/3.11/lib/python/site-packages/pdfalyzer/pdfalyzer.py", line 77, in walk_node
    nodes_to_walk_next = [self._add_relationship_to_pdf_tree(r) for r in node.references_to_other_nodes()]
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alain/Library/Python/3.11/lib/python/site-packages/pdfalyzer/pdfalyzer.py", line 77, in <listcomp>
    nodes_to_walk_next = [self._add_relationship_to_pdf_tree(r) for r in node.references_to_other_nodes()]
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alain/Library/Python/3.11/lib/python/site-packages/pdfalyzer/pdfalyzer.py", line 133, in _add_relationship_to_pdf_tree
    from_node.set_parent(to_node)
  File "/Users/alain/Library/Python/3.11/lib/python/site-packages/pdfalyzer/decorators/pdf_tree_node.py", line 70, in set_parent
    raise PdfWalkError(f"Cannot set {parent} as parent of {self}, parent is already {self.parent}")
pdfalyzer.util.exceptions.PdfWalkError: Cannot set <7:StructElem:FACTURE_SIMM_NOUVEAU_FACO(Dictionary)> as parent of <98:StructElem:001_SIMM_FACTURE_VERSO(Dictionary)>, parent is already <8:K(Array)>
@michelcrypt4d4mus
Owner

Thanks for the bug report. A StructElem as the parent of another StructElem may be a data structure I've never encountered before. I'm not sure I'll be able to fix this in the near future, but if you want to take a look, it's failing here.
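
To make the failure mode concrete, here is a minimal sketch of a single-parent tree node that raises the same message as the traceback. `Node`, `other_references`, and the `strict` flag are hypothetical names for illustration only; this is not pdfalyzer's actual class. One plausible reading of the error (an assumption, not confirmed against the attached file) is that object 98 was first reached through the `/K` array (object 8) and then referenced object 7 directly, for example through the `/P` entry that structure elements carry.

```python
class PdfWalkError(RuntimeError):
    """Stand-in for pdfalyzer.util.exceptions.PdfWalkError."""


class Node:
    """Hypothetical, simplified tree node (not pdfalyzer's real implementation)."""

    def __init__(self, idnum, label):
        self.idnum = idnum
        self.label = label
        self.parent = None
        # Hypothetical: extra links that do not fit a strict single-parent tree.
        self.other_references = []

    def __repr__(self):
        return f"<{self.idnum}:{self.label}>"

    def set_parent(self, parent, strict=True):
        # First assignment, or re-assignment of the same parent, is always fine.
        if self.parent is None or self.parent is parent:
            self.parent = parent
            return
        if strict:
            # Same message format as the one shown in the traceback.
            raise PdfWalkError(
                f"Cannot set {parent} as parent of {self}, parent is already {self.parent}"
            )
        # Tolerant mode (an assumption): keep the first parent, record the rest.
        self.other_references.append(parent)


k_array = Node(8, "K(Array)")
struct_root = Node(7, "StructElem:FACTURE_SIMM_NOUVEAU_FACO(Dictionary)")
child = Node(98, "StructElem:001_SIMM_FACTURE_VERSO(Dictionary)")

child.set_parent(k_array)          # reached first via the /K array
try:
    child.set_parent(struct_root)  # second, conflicting parent
except PdfWalkError as error:
    message = str(error)

child.set_parent(struct_root, strict=False)  # tolerant mode records the link instead
```

In the tolerant mode the node keeps its first parent and stores the second reference as a non-tree link. Whether that is the right fix for pdfalyzer is an open question; it is only one possible direction.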

@macdeport
Author

macdeport commented Oct 1, 2024

@michelcrypt4d4mus

not sure i'll be able to fix this in the near future

This is also my current situation ;-)

I filed this bug report mainly to inform you and to provide the corresponding file, which may be useful when the time comes.

Additional information:

The exception occurs when accessing the '/Title' metadata.

pikepdf lists the document's metadata dictionary as follows:

  {'/Author': 'Registered to: EDF     ',
   '/CreationDate': 'D:20240904235439Z',
   '/Creator': 'OpenText Exstream Version 16.4.0 64-bit',
   '/Title': 'FACTURE_SIMM_NOUVEAU_FACO'
  }
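
For anyone who wants to reproduce that listing, a minimal sketch using pikepdf's `Pdf.docinfo` follows. `read_docinfo` and `clean_docinfo` are hypothetical helper names, and stripping surrounding whitespace (note the trailing spaces in '/Author') is my own assumption about what is convenient, not something pikepdf does.

```python
def clean_docinfo(raw):
    """Convert a docinfo-like mapping to plain strings, stripping whitespace."""
    return {str(key): str(value).strip() for key, value in raw.items()}


def read_docinfo(path):
    """Return the document information dictionary of the PDF at `path`."""
    import pikepdf  # third-party; imported here so the helper above works without it

    with pikepdf.open(path) as pdf:
        # Build the plain dict while the file is still open.
        return clean_docinfo(pdf.docinfo)


# The values reported above, used as sample input for the pure helper.
sample = {
    "/Author": "Registered to: EDF     ",
    "/CreationDate": "D:20240904235439Z",
    "/Creator": "OpenText Exstream Version 16.4.0 64-bit",
    "/Title": "FACTURE_SIMM_NOUVEAU_FACO",
}
cleaned = clean_docinfo(sample)
```

Calling `read_docinfo` on the attached file should give a comparable dictionary, although I have not verified that against the attachment.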

I've already found pdfalyzer useful and thank you for this excellent tool.

Repository owner deleted a comment Oct 23, 2024