See #1269 for further details; this reports another issue I've come across.
Which environment were you using when you encountered the problem?
    $ python -m platform
    Linux-5.4.0-122-generic-x86_64-with-glibc2.29

    $ python -c "import PyPDF2;print(PyPDF2.__version__)"
    2.10.3
This is a minimal, complete example that shows the issue:
    import PyPDF2

    with open("Segmentation & Activation Lab.pdf", "rb") as f:
        pdfreader = PyPDF2.PdfFileReader(f, strict=False)
        full_content = " ".join([page.extractText() for page in pdfreader.pages])
PDF used above: Segmentation & Activation Lab.pdf
This is the complete Traceback I see:
    Traceback (most recent call last):
      File "/usr/local/lib/python3.8/dist-packages/PyPDF2/_page.py", line 1538, in extractText
        return self.extract_text()
      File "/usr/local/lib/python3.8/dist-packages/PyPDF2/_page.py", line 1510, in extract_text
        return self._extract_text(
      File "/usr/local/lib/python3.8/dist-packages/PyPDF2/_page.py", line 1146, in _extract_text
        cmaps[f] = build_char_map(f, space_width, obj)
      File "/usr/local/lib/python3.8/dist-packages/PyPDF2/_cmap.py", line 21, in build_char_map
        encoding, space_code = parse_encoding(ft, space_code)
      File "/usr/local/lib/python3.8/dist-packages/PyPDF2/_cmap.py", line 124, in parse_encoding
        enc: Union(str, DictionaryObject) = ft["/Encoding"].get_object()  # type: ignore
    AttributeError: 'NoneType' object has no attribute 'get_object'
The PDF can be read using a normal PDF viewer and the PDF even comes from Adobe.
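To narrow down which pages trigger the `AttributeError`, text can be extracted page by page instead of in a single `join`. The following is only a diagnostic sketch, not a fix: `extract_text_per_page` and `report_failing_pages` are hypothetical helper names, and the only PyPDF2 calls used are the ones already in the example above (`PdfFileReader`, `pages`, `extractText`).

```python
from typing import Iterable, List, Tuple


def extract_text_per_page(pages: Iterable) -> Tuple[List[str], List[Tuple[int, Exception]]]:
    """Extract text page by page, recording failures instead of aborting.

    Returns the text of each page (empty string for a failing page) and
    a list of (page_index, exception) pairs for the pages that raised.
    """
    texts: List[str] = []
    failures: List[Tuple[int, Exception]] = []
    for index, page in enumerate(pages):
        try:
            texts.append(page.extractText())
        except AttributeError as exc:  # the error type shown in the traceback
            failures.append((index, exc))
            texts.append("")
    return texts, failures


def report_failing_pages(path: str) -> None:
    """Hypothetical wrapper: print which pages of the PDF at `path` fail."""
    import PyPDF2  # imported here so the helper above works without PyPDF2

    with open(path, "rb") as f:
        pdfreader = PyPDF2.PdfFileReader(f, strict=False)
        _, failures = extract_text_per_page(pdfreader.pages)
    for index, exc in failures:
        print(f"page {index}: {exc}")
```

Calling `report_failing_pages("Segmentation & Activation Lab.pdf")` would list the zero-based indices of the affected pages, which may help when comparing this file against others.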
Another example:
This file uses a /XRefStm entry, which does not seem to be processed currently. Work in progress.
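For context, `/XRefStm` is the trailer key used by hybrid-reference PDF files: its integer value is the byte offset of a cross-reference stream that supplements the classic xref table. A rough way to check whether a given file carries such an entry is to scan its raw bytes. This is a heuristic sketch only and not how PyPDF2 handles the entry; `find_xrefstm_offsets` is a hypothetical name, and the scan could report a false positive if the same bytes happen to occur inside unrelated stream data.

```python
import re
from typing import List

# Matches "/XRefStm <integer>" as it appears in a trailer dictionary.
_XREFSTM_PATTERN = re.compile(rb"/XRefStm\s+(\d+)")


def find_xrefstm_offsets(data: bytes) -> List[int]:
    """Return every byte offset named by a /XRefStm entry in the raw PDF data."""
    return [int(match.group(1)) for match in _XREFSTM_PATTERN.finditer(data)]


def file_uses_xrefstm(path: str) -> bool:
    """Hypothetical convenience wrapper: True if the file contains a /XRefStm entry."""
    with open(path, "rb") as f:
        return bool(find_xrefstm_offsets(f.read()))
```

If `file_uses_xrefstm("Segmentation & Activation Lab.pdf")` returns `True`, the file is probably affected by the unprocessed `/XRefStm` entry described above.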
ENH : Process XRefStm
2dc76c0
fixes py-pdf#1295 includes test file adjustment
1252a49