From 724a90246b7c99cba1b33ea95f03caaecb74c42a Mon Sep 17 00:00:00 2001
From: Martin Thoma
Date: Fri, 22 Apr 2022 08:01:05 +0200
Subject: [PATCH] DOC: Add compression example

---
 docs/index.rst           |  1 +
 docs/user/comparisons.md |  7 ++++++-
 docs/user/file-size.md   | 38 ++++++++++++++++++++++++++++++++++++++
 docs/user/robustness.md  |  3 ++-
 4 files changed, 47 insertions(+), 2 deletions(-)
 create mode 100644 docs/user/file-size.md

diff --git a/docs/index.rst b/docs/index.rst
index f92aa5ef4..bbb332b06 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -30,6 +30,7 @@ You can contribute to `PyPDF2 on Github `_.
    user/adding-pdf-annotations
    user/forms
    user/streaming-data
+   user/file-size
 
 
 .. toctree::
diff --git a/docs/user/comparisons.md b/docs/user/comparisons.md
index a75a7be5d..a4e6748da 100644
--- a/docs/user/comparisons.md
+++ b/docs/user/comparisons.md
@@ -63,4 +63,9 @@ a large community behind it.
 
 And there are more:
 
-* [`pyfpdf`](https://github.com/reingart/pyfpdf)
+* [`pdfplumber`](https://pypi.org/project/pdfplumber/)
+
+## Document Generation
+
+There are (Python) [tools to generate PDF documents](https://github.com/py-pdf/awesome-pdf#generators).
+PyPDF2 is not one of them.
diff --git a/docs/user/file-size.md b/docs/user/file-size.md
new file mode 100644
index 000000000..2bcb942f4
--- /dev/null
+++ b/docs/user/file-size.md
@@ -0,0 +1,38 @@
+# Reduce PDF Size
+
+There are multiple ways to reduce the size of a given PDF file. The easiest
+one is to remove content (e.g. images) or pages.
+
+## Remove images
+
+
+```python
+import PyPDF2
+
+reader = PyPDF2.PdfFileReader("example.pdf")
+writer = PyPDF2.PdfFileWriter()
+
+for page in reader.pages:
+    writer.addPage(page)
+
+writer.removeImages()
+
+with open("out.pdf", "wb") as f:
+    writer.write(f)
+```
+
+## Compression
+
+```python
+import PyPDF2
+
+reader = PyPDF2.PdfFileReader("example.pdf")
+writer = PyPDF2.PdfFileWriter()
+
+for page in reader.pages:
+    page.compressContentStreams()
+    writer.addPage(page)
+
+with open("out.pdf", "wb") as f:
+    writer.write(f)
+```
diff --git a/docs/user/robustness.md b/docs/user/robustness.md
index a516d70af..89e780407 100644
--- a/docs/user/robustness.md
+++ b/docs/user/robustness.md
@@ -37,4 +37,5 @@ Choosing `strict=True` means that PyPDF2 will raise an exception if a PDF
 does not follow the specification.
 
 Choosing `strict=False` means that PyPDF2 will try to be forgiving and do
-something reasonable, but it will log a warning message.
+something reasonable, but it will log a warning message. It is a best-effort
+approach.
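
The Compression example in `docs/user/file-size.md` relies on `page.compressContentStreams()`, which, as far as I understand, re-encodes a page's content stream with the Flate (zlib/deflate) filter. The sketch below is not PyPDF2 code: it uses only the standard-library `zlib` module on a hand-written, hypothetical content stream to illustrate why repetitive drawing operators tend to shrink well. The helper name `deflate_ratio` and the sample bytes are assumptions made for illustration.

```python
import zlib


def deflate_ratio(stream: bytes, level: int = 9) -> float:
    """Return compressed size divided by original size for a byte stream."""
    if not stream:
        raise ValueError("stream must not be empty")
    compressed = zlib.compress(stream, level)
    # Deflate is lossless: decompressing must give back the original bytes.
    assert zlib.decompress(compressed) == stream
    return len(compressed) / len(stream)


# Hypothetical content stream: 200 identical text-drawing operations.
# Real pages are less repetitive, so real savings will usually be smaller.
sample_stream = b"BT /F1 12 Tf 72 720 Td (Hello World) Tj ET\n" * 200

ratio = deflate_ratio(sample_stream)
print(f"compressed to {ratio:.1%} of the original size")
```

Streams that are already compressed, and embedded images in particular, will likely gain little from this step, which is presumably why the new page lists image removal as a separate option.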