Better support for signature verifications. #305

Open
lvella opened this issue Aug 13, 2024 · 2 comments

Comments

@lvella

lvella commented Aug 13, 2024

This is what I am trying to do:

Alice creates a PDF via some mechanism (e.g. export to PDF from LibreOffice), then sends it to Bob. Bob signs it and sends it back. Alice verifies Bob's signature and now knows Bob has agreed to the terms of the document.

I am trying to write the verification program, but there are a number of problems that are hard to solve with lopdf. They are:

  • It is hard to know that what Bob has signed is actually what Alice sent. It is not enough to verify that the signed file includes the original file as a prefix, because Bob might have added an incremental update with PDF elements that change how the text in the document is rendered, and then signed it. The solution I came up with is to whitelist what an incremental update may contain for Alice to accept the signature as valid: only the minimal set of required objects may be added, no object removals or alterations are allowed, and the visible annotation Rect must fall within whitelisted positions and not overlap other signatures that have been validated the same way. It would help greatly if there were an easy way to compare object modifications across different incremental updates of the same document.
  • It is hard to tell what region is actually being signed. The signature contains a /ByteRange into the file bytes, but there is no easy way to tell whether only the /Contents of that signature itself was excluded, as required by conforming validators.
  • It is hard to tell the final byte position of each incremental update. If the file has multiple signatures, each one must cover everything from byte 0 to the %%EOF following that signature.
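To make the third point concrete, here is a minimal sketch that scans the raw file bytes for `%%EOF` markers and reports the offset just past each one. It uses only the standard library; `revision_ends` is a hypothetical helper, not an lopdf function. It assumes every `%%EOF` in the file really ends a revision, which is not guaranteed (the byte sequence could also occur inside a stream or string), so a robust version would follow the `startxref` and `/Prev` chain instead.

```rust
/// Returns the offset just past each `%%EOF` marker, in file order.
/// One end-of-line sequence (CR, LF, or CR LF) directly after the marker
/// is counted as part of the revision.
/// Assumption: every `%%EOF` found by this naive scan ends a revision.
fn revision_ends(bytes: &[u8]) -> Vec<usize> {
    const MARKER: &[u8] = b"%%EOF";
    let mut ends = Vec::new();
    let mut i = 0;
    while i + MARKER.len() <= bytes.len() {
        if &bytes[i..i + MARKER.len()] == MARKER {
            let mut end = i + MARKER.len();
            // Swallow an optional CR, then an optional LF.
            if bytes.get(end) == Some(&b'\r') {
                end += 1;
            }
            if bytes.get(end) == Some(&b'\n') {
                end += 1;
            }
            ends.push(end);
            i = end;
        } else {
            i += 1;
        }
    }
    ends
}
```

With the offsets in hand, a signature placed in the n-th revision would be expected to cover bytes from 0 up to the n-th returned offset.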

It would be great if lopdf had support for these cases.

@Heinenen
Collaborator

Disclaimer: I am no expert in signatures, nor incremental updates. I am also not saying that the situation in lopdf can't be improved upon.

The most relevant parts of the PDF 1.7 spec on this are probably "12.8 Digital signatures", "7.5.6 Incremental Updates", and "H.7 Updating Example".

The way I understand what needs to be done to verify the signatures:

  • Bob will use an incremental update to append an "approval signature". Its byte range includes everything up to the second %%EOF.
  • Bob sends the signed PDF back to Alice.
  • Alice verifies that the PDF wasn't modified, except for the incremental update: (pseudocode) signed_pdf.bytes[0..original_pdf.len()] == original_pdf.bytes[0..]
  • Alice reads in the signed PDF with (something similar to) IncrementalDocument::load.
  • Alice has access to the incremental xref table (if I understand the code correctly, this part is currently not possible with lopdf)
  • From this, it is relatively easy to check which objects changed, cf. "7.5.6 Incremental Updates" and "H.7 Updating Example". (your first point)
  • Alice checks that the byte ranges are correct (your second and third points)
  • Alice verifies the signature with Bob's public key
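The prefix check and the "which objects changed" step from the list above could be sketched roughly as follows. This is standard-library Rust only and does not call lopdf. The two xref maps are hypothetical inputs (object number to byte offset, with `None` standing for a free entry), since per-revision xref tables are exactly what the thread says lopdf does not expose yet. Generation numbers are ignored to keep the example short.

```rust
use std::collections::BTreeMap;

/// True if `signed` starts with the exact bytes of `original` and appends
/// at least one byte after it.
fn extends_original(original: &[u8], signed: &[u8]) -> bool {
    signed.len() > original.len() && signed.starts_with(original)
}

#[derive(Debug, PartialEq)]
enum Change {
    Added,     // object number did not exist before the update
    Redefined, // object number existed and now points to new content
    Freed,     // the update marks the object as free (deleted)
}

/// Classifies every entry of an incremental update's xref section against
/// the objects known before the update.
/// Assumption: `before` maps object number -> offset, and `update` uses
/// `None` for free ("f") entries; both maps are hypothetical inputs.
fn classify_update(
    before: &BTreeMap<u32, u64>,
    update: &BTreeMap<u32, Option<u64>>,
) -> BTreeMap<u32, Change> {
    update
        .iter()
        .map(|(&id, &entry)| {
            let change = match (entry, before.contains_key(&id)) {
                (None, _) => Change::Freed,
                (Some(_), true) => Change::Redefined,
                (Some(_), false) => Change::Added,
            };
            (id, change)
        })
        .collect()
}
```

A whitelist policy like the one described in the original post could then be written as a pass over the returned map, rejecting any `Freed` entry and inspecting each `Redefined` object more closely.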

Your third point can probably be addressed together with implementing the possibility to access the different xref tables.

Your second point should be pretty easy to do without support from lopdf: there should always be exactly two byte ranges, one from the start of the file to the start of the signature, and one from the end of the signature to the end of the file (or rather, to the correct EOF marker, but that's already addressed in your third point). The thing in between must be equal to the signature that is extracted by lopdf.
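As a rough illustration of that check, the sketch below validates the shape of a /ByteRange and decodes the gap between the two ranges so it can be compared with the extracted /Contents. Both function names are hypothetical, and nothing here calls lopdf. It assumes the gap is exactly the /Contents hex string including its angle brackets, which is the usual convention, and that the caller already knows the offset just past the relevant `%%EOF`.

```rust
use std::ops::Range;

/// Checks that `byte_range` = [offset1, len1, offset2, len2] starts at byte 0,
/// leaves a single gap, and ends exactly at `revision_end`.
/// Returns the gap, i.e. the bytes excluded from the signed data.
fn signed_gap(byte_range: [usize; 4], revision_end: usize) -> Option<Range<usize>> {
    let [off1, len1, off2, len2] = byte_range;
    let gap_start = off1.checked_add(len1)?;
    let signed_end = off2.checked_add(len2)?;
    if off1 == 0 && gap_start < off2 && signed_end == revision_end {
        Some(gap_start..off2)
    } else {
        None
    }
}

/// Decodes a PDF hex string such as `<0aFF>` into bytes.
/// Returns `None` if the angle brackets are missing or a digit is invalid.
/// ASCII whitespace between digits is skipped, and an odd number of digits
/// is padded with a trailing 0.
fn decode_hex_string(gap: &[u8]) -> Option<Vec<u8>> {
    let hex = gap.strip_prefix(b"<")?.strip_suffix(b">")?;
    let mut digits = Vec::new();
    for &b in hex {
        if b.is_ascii_whitespace() {
            continue;
        }
        digits.push((b as char).to_digit(16)? as u8);
    }
    if digits.len() % 2 == 1 {
        digits.push(0);
    }
    Some(digits.chunks(2).map(|pair| (pair[0] << 4) | pair[1]).collect())
}
```

If `signed_gap` returns a range, the decoded bytes of `file[range]` can be compared with the signature bytes obtained from the parsed document; any mismatch means something other than /Contents was left out of the signed data.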


As I said, I'm not an expert in this topic, so please correct me if I missed something/made a mistake.

@lvella
Author

lvella commented Sep 21, 2024

I wrote a library using lopdf to try to handle this scenario: https://github.com/lvella/trust_pdf

I think the only thing that would be better handled by lopdf itself is access to all the previous xref tables in a document. Currently I have to re-parse the file up to the previous incremental update I am analyzing.
