TST: Use external repository for larger/more PDFs for testing (#820)
* Use submodule so that the connection is clear. Ensure that Flake8 issues of the submodule don't show up here
* As a first step, just try to get the number of pages from the non-encrypted PDFs
* Create an "external" pytest marker which allows people to deactivate tests that need the submodule
MartinThoma authored Apr 26, 2022
1 parent 39ffc1d commit 963b251
Showing 6 changed files with 51 additions and 2 deletions.
4 changes: 3 additions & 1 deletion .github/workflows/github-ci.yaml
@@ -26,6 +26,8 @@ jobs:
     steps:
     - name: Checkout Code
       uses: actions/checkout@v3
+      with:
+        submodules: 'recursive'
     - name: Setup Python
       uses: actions/setup-python@v3
       with:
@@ -46,7 +48,7 @@ jobs:
         pip install .
     - name: Test with flake8
       run: |
-        flake8 . --ignore=E203,W503,W504,E,F403,F405 --exclude build
+        flake8 . --ignore=E203,W503,W504,E,F403,F405 --exclude build,sample-files
       if: matrix.python-version != '2.7'
     - name: Test with pytest
       run: |
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
+[submodule "sample-files"]
+	path = sample-files
+	url = https://github.com/py-pdf/sample-files
25 changes: 25 additions & 0 deletions Tests/test_page.py
@@ -1,4 +1,5 @@
 import os
+import json

 import pytest

@@ -7,6 +8,30 @@
 TESTS_ROOT = os.path.abspath(os.path.dirname(__file__))
 PROJECT_ROOT = os.path.dirname(TESTS_ROOT)
 RESOURCE_ROOT = os.path.join(PROJECT_ROOT, "Resources")
+EXTERNAL_ROOT = os.path.join(PROJECT_ROOT, "sample-files")
+
+
+def get_all_sample_files():
+    with open(os.path.join(EXTERNAL_ROOT, "files.json")) as fp:
+        data = fp.read()
+    meta = json.loads(data)
+    return meta
+
+
+all_files_meta = get_all_sample_files()
+
+
+@pytest.mark.external
+@pytest.mark.parametrize(
+    "meta",
+    [m for m in all_files_meta["data"] if not m["encrypted"]],
+    ids=[m["path"] for m in all_files_meta["data"] if not m["encrypted"]],
+)
+def test_read(meta):
+    pdf_path = os.path.join(EXTERNAL_ROOT, meta["path"])
+    reader = PdfFileReader(pdf_path)
+    reader.pages[0]
+    assert len(reader.pages) == meta["pages"]


 @pytest.mark.parametrize(
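
The new `test_read` is parametrized with every `files.json` entry whose `encrypted` flag is false, and uses each entry's `path` as the test id. The following is a minimal sketch of that selection, not code from the commit. The index content is hypothetical: the three entries and their values are invented for illustration, and only the keys `data`, `path`, `encrypted`, and `pages` are taken from the diff. The helper name `select_unencrypted` is also an assumption.

```python
import json

# Hypothetical excerpt of files.json; only the keys used by test_read appear.
SAMPLE_INDEX = """
{
  "data": [
    {"path": "001-example/plain.pdf", "encrypted": false, "pages": 1},
    {"path": "002-example/locked.pdf", "encrypted": true, "pages": 3},
    {"path": "003-example/long.pdf", "encrypted": false, "pages": 12}
  ]
}
"""


def select_unencrypted(meta):
    """Return the entries that would be passed to test_read (hypothetical helper)."""
    return [m for m in meta["data"] if not m["encrypted"]]


meta = json.loads(SAMPLE_INDEX)
selected = select_unencrypted(meta)

# The test ids mirror the `ids=` argument in the diff: one id per file path.
ids = [m["path"] for m in selected]
print(ids)
```

Computing the filtered list once and deriving the ids from it would avoid repeating the list comprehension, as the diff does for `parametrize` and `ids=`.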
17 changes: 17 additions & 0 deletions docs/dev/intro.md
@@ -15,6 +15,23 @@ pip install -r requirements/dev.txt
 pytest .
 ```

+We have the following pytest markers defined:
+
+* `no_py27`: Flag for tests that fail under Python 2.7 only
+* `external`: Tests which use files from [the `sample-files` git submodule](https://github.com/py-pdf/sample-files)
+
+You can locally choose not to run those via `pytest -m "not external"`.
+
+## The sample-files git submodule
+The reason for having the submodule `sample-files` is that we want to keep
+the size of the PyPDF2 repository small while we also want to have an extensive
+test suite. Those two goals contradict each other.
+
+The `Resources` folder should contain a select set of core examples that cover
+most cases we typically want to test for. The `sample-files` submodule can cover
+a lot more: additional edge cases, the behavior we get when file sizes get
+bigger, and different PDF producers.
+
 ## Tools: git and pre-commit

 Git is a command line application for version control. If you don't know it,
3 changes: 2 additions & 1 deletion pytest.ini
@@ -1,3 +1,4 @@
 [pytest]
 markers =
-    no_py27: Flag for tests that fail under Python 2.7 only
+    no_py27: Flag for tests that fail under Python 2.7 only
+    external: Tests which use files from https://github.com/py-pdf/sample-files
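
With the `external` marker registered, one could go a step further and skip those tests automatically when the submodule has not been checked out. The commit does not do this; what follows is a sketch of a hypothetical `conftest.py` placed next to `pytest.ini`. The helper name `submodule_is_available` and the skip reason text are assumptions; the hook name, `item.keywords`, `item.add_marker`, and `pytest.mark.skip` are standard pytest.

```python
import os


def submodule_is_available(project_root):
    """Return True if sample-files/files.json exists below project_root."""
    index = os.path.join(project_root, "sample-files", "files.json")
    return os.path.isfile(index)


def pytest_collection_modifyitems(config, items):
    """Skip tests marked `external` when the submodule is not checked out."""
    # config.rootdir is the directory that contains pytest.ini.
    if submodule_is_available(str(config.rootdir)):
        return

    # Imported here so the helper above stays usable without pytest installed.
    import pytest

    skip_external = pytest.mark.skip(
        reason="sample-files submodule not checked out"
    )
    for item in items:
        if "external" in item.keywords:
            item.add_marker(skip_external)
```

Note that `Tests/test_page.py` as committed reads `files.json` at import time, so a missing submodule would already fail during collection. A hook like this would only help if that module-level load also tolerated a missing file.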
1 change: 1 addition & 0 deletions sample-files
Submodule sample-files added at 6e3a1b
