DOC: Working with annotations (#764)
See #107
MartinThoma authored Apr 16, 2022
1 parent cf20f92 commit 87aafd6
Showing 3 changed files with 93 additions and 7 deletions.
17 changes: 10 additions & 7 deletions docs/index.rst
@@ -25,13 +25,8 @@ You can contribute to `PyPDF2 on Github <https://github.com/py-pdf/PyPDF2>`_.
   user/merging-pdfs
   user/cropping-and-transforming
   user/add-watermark


.. toctree::
   :caption: Scripts
   :maxdepth: 1

   user/pdfcat
   user/reading-pdf-annotations
   user/adding-pdf-annotations


.. toctree::
Expand Down Expand Up @@ -59,6 +54,14 @@ You can contribute to `PyPDF2 on Github <https://github.com/py-pdf/PyPDF2>`_.
   user/faq


.. toctree::
   :caption: Scripts
   :maxdepth: 1

   user/pdfcat




Indices and tables
==================
16 changes: 16 additions & 0 deletions docs/user/adding-pdf-annotations.md
@@ -0,0 +1,16 @@
# Adding PDF Annotations

## Attachments

```python
from PyPDF2 import PdfFileWriter

writer = PdfFileWriter()
writer.addBlankPage(width=200, height=200)

data = b"any bytes - typically read from a file"
writer.addAttachment("smile.png", data)

with open("output.pdf", "wb") as output_stream:
    writer.write(output_stream)
```
67 changes: 67 additions & 0 deletions docs/user/reading-pdf-annotations.md
@@ -0,0 +1,67 @@
# Reading PDF Annotations

PDF 1.7 defines 25 different annotation types:

* Text
* Link
* FreeText
* Line, Square, Circle, Polygon, PolyLine, Highlight, Underline, Squiggly, StrikeOut
* Stamp, Caret, Ink
* Popup
* FileAttachment
* Sound, Movie
* Widget, Screen
* PrinterMark
* TrapNet
* Watermark
* 3D

The sections on this page show how to read the most common ones.

## Text

```python
from PyPDF2 import PdfFileReader

reader = PdfFileReader("example.pdf")

for page in reader.pages:
    if "/Annots" in page:
        for annot in page["/Annots"]:
            subtype = annot.getObject()["/Subtype"]
            if subtype == "/Text":
                print(annot.getObject()["/Contents"])
```
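
The same page and annotation loop recurs for every annotation type, so it can be factored into a small generator. The following is a minimal sketch, not part of PyPDF2: the name `iter_annotations` is hypothetical, and it assumes only what the example above already relies on, namely that each page supports `"/Annots" in page` and that each entry offers `getObject()`. It reads `/Subtype` with `.get()` so that a malformed annotation without that key is skipped instead of raising `KeyError`.

```python
def iter_annotations(pages, subtype):
    """Yield resolved annotation dictionaries whose /Subtype matches.

    `pages` is any iterable of page dictionaries (for example
    `reader.pages`); `subtype` is a name such as "/Text".
    This helper is illustrative and not a PyPDF2 function.
    """
    for page in pages:
        if "/Annots" not in page:
            continue  # page has no annotations at all
        for ref in page["/Annots"]:
            annot = ref.getObject()  # resolve the indirect reference once
            if annot.get("/Subtype") == subtype:
                yield annot
```

With such a helper, the loop above would reduce to iterating over `iter_annotations(reader.pages, "/Text")` and printing `annot.get("/Contents")` for each result.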

## Highlights

```python
from PyPDF2 import PdfFileReader

reader = PdfFileReader("commented.pdf")

for page in reader.pages:
    if "/Annots" in page:
        for annot in page["/Annots"]:
            subtype = annot.getObject()["/Subtype"]
            if subtype == "/Highlight":
                coords = annot.getObject()["/QuadPoints"]
                x1, y1, x2, y2, x3, y3, x4, y4 = coords
```
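
In the PDF specification, `/QuadPoints` holds 8 × n numbers describing n quadrilaterals, so a highlight that spans several lines of text carries more than eight values and the direct unpacking above would raise `ValueError`. The sketch below shows one way to handle that case by splitting the flat array into groups of eight and reducing each group to a bounding box. The function name `quad_points_to_rects` is hypothetical, and returning axis-aligned boxes is a simplification that discards any rotation of the quadrilateral.

```python
def quad_points_to_rects(quad_points):
    """Convert a flat /QuadPoints array into bounding boxes.

    Returns a list of (left, bottom, right, top) tuples, one per
    quadrilateral. Illustrative helper, not part of PyPDF2.
    """
    values = [float(v) for v in quad_points]
    if len(values) % 8 != 0:
        raise ValueError("expected a multiple of 8 numbers in /QuadPoints")

    rects = []
    for start in range(0, len(values), 8):
        quad = values[start:start + 8]
        xs = quad[0::2]  # x1, x2, x3, x4
        ys = quad[1::2]  # y1, y2, y3, y4
        rects.append((min(xs), min(ys), max(xs), max(ys)))
    return rects
```

Taking the minimum and maximum of the coordinates avoids depending on the order in which a PDF producer lists the four corners.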

## Attachments

```python
from PyPDF2 import PdfFileReader

reader = PdfFileReader("example.pdf")

attachments = {}
for page in reader.pages:
    if "/Annots" in page:
        for annotation in page["/Annots"]:
            # Resolve the indirect reference once and reuse the result
            annotobj = annotation.getObject()
            subtype = annotobj["/Subtype"]
            if subtype == "/FileAttachment":
                fileobj = annotobj["/FS"]
                attachments[fileobj["/F"]] = fileobj["/EF"]["/F"].getData()
```
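
Once the `attachments` dictionary is filled, its contents can be written to disk. The following is a rough sketch under stated assumptions: `save_attachments` and the fallback name `attachment.bin` are hypothetical, and the dictionary is assumed to map file names to `bytes` as in the example above. Because the file name comes from the PDF itself, the sketch keeps only the final path component so that a name like `../evil.txt` cannot escape the output directory.

```python
from pathlib import Path


def save_attachments(attachments, out_dir):
    """Write each attachment into out_dir and return the written paths.

    `attachments` maps file names (from the PDF) to bytes.
    Illustrative helper, not part of PyPDF2.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for name, data in attachments.items():
        # Drop directory components; the name is untrusted input.
        safe_name = Path(str(name)).name
        if safe_name in ("", ".", ".."):
            safe_name = "attachment.bin"  # hypothetical fallback name
        target = out / safe_name
        target.write_bytes(data)
        written.append(target)
    return written
```

Two attachments that share a file name would overwrite each other in this sketch; a real script might add a numeric suffix in that case.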
