MAINT: Use typing.IO for file streams #1498
Merged
Many developers (like myself) like to keep virtual environments inside the current project. These virtual environments are local development constructs and should not be checked into source control. This commit adds two common virtual environment directory names to the .gitignore to prevent accidental commits by future developers.
The current contribution instructions in `docs/dev/intro.md` direct new code contributors to install the `dev` requirements. After following that instruction, the minimal test suite fails when run with the following commands:

```
python -m venv .venv
source .venv/bin/activate
pip install -r requirements/dev.txt
pytest -m "not external" -m "not samples" -m "not slow"
```

The resulting errors:

=========================== short test summary info ===========================
FAILED tests/test_reader.py::test_get_images[pdflatex-outline.pdf-expected_images0] - ModuleNotFoundError: No module named 'PIL'
FAILED tests/test_reader.py::test_get_images[crazyones.pdf-expected_images1] - ModuleNotFoundError: No module named 'PIL'
FAILED tests/test_reader.py::test_get_images[git.pdf-expected_images2] - ModuleNotFoundError: No module named 'PIL'
FAILED tests/test_reader.py::test_get_images[imagemagick-CCITTFaxDecode.pdf-expected_images5] - ModuleNotFoundError: No module named 'PIL'
FAILED tests/test_reader.py::test_get_images[src6-expected_images6] - ModuleNotFoundError: No module named 'PIL'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/994/994636.pdf-tika-994636.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/952/952133.pdf-tika-952133.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/914/914568.pdf-tika-914568.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/952/952016.pdf-tika-952016.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/965/965118.pdf-tika-952016.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/959/959184.pdf-tika-959184.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/958/958496.pdf-tika-958496.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/972/972174.pdf-tika-972174.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/972/972243.pdf-tika-972243.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://corpora.tika.apache.org/base/docs/govdocs1/969/969502.pdf-tika-969502.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction[https://arxiv.org/pdf/2201.00214.pdf-arxiv-2201.00214.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction_strict - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
FAILED tests/test_workflows.py::test_image_extraction2[https://corpora.tika.apache.org/base/docs/govdocs1/977/977609.pdf-tika-977609.pdf] - ImportError: pillow is required to do image extraction. It can be installed via 'pip install PyPDF2[image]'
==== 18 failed, 536 passed, 5 skipped, 53 deselected, 5 xfailed in 146.94s (0:02:26) ====

This commit adds `pillow` to `requirements/dev.in` so that the minimal test suite passes on the first try, letting new code contributors start implementing improvements with confidence.
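For readers wondering why a missing `pillow` surfaces as an `ImportError` that names an extra, the pattern can be sketched roughly as below. This is an illustrative sketch only, not PyPDF2's actual code: the helper name `require_optional` and its parameters are hypothetical, and only the shape of the error message is taken from the test output above.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str, package: str = "PyPDF2") -> ModuleType:
    """Import an optional module, or raise an ImportError naming the extra to install.

    Hypothetical helper for illustration; not part of PyPDF2.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
        raise ImportError(
            f"{module_name} is required for this feature. "
            f"It can be installed via 'pip install {package}[{extra}]'"
        ) from exc


# A module that is always available imports normally.
json_module = require_optional("json", extra="unused")
```

With a guard like this, installing the optional dependency in the development environment (as this commit does through `requirements/dev.in`) is what makes the image-related tests runnable.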
The Python standard library provides the `IO` type for file streams. (Source: https://docs.python.org/3/library/typing.html#typing.IO) This commit replaces the complex Union type of the `IO` implementations with the official `IO` type. This will improve the accuracy of type checking in users' IDEs.
The CI system flagged some additional conflicts with the `IO` type in the writer classes. This commit changes the writer classes to use the standard `IO` type instead of the union of IO implementations.
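As a rough illustration of the change described above, the sketch below contrasts a union of concrete stream classes with `typing.IO`. The alias `ConcreteStream` and the function `read_header` are hypothetical names chosen for this example, and the union members are an assumption rather than a copy of the alias PyPDF2 used.

```python
import io
import tempfile
from typing import IO, Union

# Assumed, illustrative union of concrete stream classes (not copied from PyPDF2).
ConcreteStream = Union[io.BytesIO, io.BufferedReader, io.BufferedWriter, io.FileIO]


def read_header(stream: IO[bytes], size: int = 5) -> bytes:
    """Return the first `size` bytes of a binary stream and restore its position."""
    start = stream.tell()
    stream.seek(0)
    header = stream.read(size)
    stream.seek(start)
    return header


# An in-memory stream satisfies IO[bytes].
memory_header = read_header(io.BytesIO(b"%PDF-1.7\nrest of file"))

# So does a temporary file object, without being listed in a union.
with tempfile.TemporaryFile() as fh:
    fh.write(b"%PDF-1.4\n")
    file_header = read_header(fh)
```

Annotating with `IO[bytes]` describes the interface a function relies on (`read`, `seek`, `tell`) rather than enumerating implementations, which is why type checkers tend to accept a wider range of file-like arguments.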
Thank you for the contribution! This time it might take a little bit longer until the release happens as I want to make the 3.0.0 release :-)
If you want, I'll add you to the list of contributors in the PyPDF2 documentation.
Go for it. :D Thanks!
Done 🤗
MartinThoma added a commit that referenced this pull request on Dec 22, 2022
BREAKING CHANGES:
- Deprecate features with PyPDF2==3.0.0 (#1489)
- Refactor Fit / Zoom parameters (#1437)

New Features (ENH):
- Add Cloning (#1371)
- Allow int for indirect_reference in PdfWriter.get_object (#1490)

Documentation (DOC):
- How to read PDFs from S3 (#1509)
- Make MyST parse all links as simple hyperlinks (#1506)
- Changed 'latest' for 'stable' generated docs (#1495)
- Adjust deprecation procedure (#1487)

Maintenance (MAINT):
- Use typing.IO for file streams (#1498)

[Full Changelog](2.12.1...3.0.0)
The Python standard library provides the `IO` type for file streams. [source] This commit replaces the complex Union type of the various IO implementations with the official `IO` type. This will improve the accuracy of type checking in users' IDEs. There are also a few quality-of-life improvements for new code contributors like myself.