Reading a pdf from file like object or data not working in python 3 with bytesIO #38
Comments
At a glance, I think my preferred solution would be to take the lines in
and change them to always execute
and then to modify the Python 3 (unicode not found) version of the
Since |
Thanks, I tried that solution and it seems to work fine and looks like a good solution. Only thing I would add is to |
The seek would certainly be convenient for some code, but I think I prefer not to have PdfReader do that for the simple reason that right now, code that needs the seek can do it before calling PdfReader, but if PdfReader always seeks to zero, then it will not be able to process a PDF embedded in a larger stream. Thanks for the bug report and testing the solution! |
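The trade-off described in that comment can be sketched with the standard library alone. This is a hedged illustration, not pdfrw code: the container layout and both reader functions are hypothetical names invented for the example, and they only stand in for "a reader that leaves the stream position alone" versus "a reader that always seeks to zero".

```python
import io

# Hypothetical container: some header bytes followed by an embedded PDF.
prefix = b"CONTAINER-HEADER\n"
pdf_bytes = b"%PDF-1.4\n%%EOF\n"
stream = io.BytesIO(prefix + pdf_bytes)


def read_from_current_position(f):
    """Hypothetical reader: consumes from wherever the caller left the stream."""
    return f.read()


def read_after_rewinding(f):
    """Hypothetical reader: always seeks to zero before reading."""
    f.seek(0)
    return f.read()


# The caller positions the stream at the start of the embedded PDF.
stream.seek(len(prefix))
embedded = read_from_current_position(stream)  # only the PDF bytes

# A reader that rewinds discards the caller's positioning and also
# picks up the container header.
stream.seek(len(prefix))
rewound = read_after_rewinding(stream)
```

With the first reader, a caller who wants the whole stream can still call `stream.seek(0)` first; with the second, a caller has no way to hand over only the embedded part.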
Very happy to find this little conversation here, Cheers! |
Yes, that's why the issue is still open :-) |
I have tested it and it seems to work. If I can contribute in any way, please let me know ;-) |
Pull requests are good -- do you think you could add a test to the current test suite? It's not documented terribly well, but there is some getting-started stuff in the readme. Thanks, |
OK! I'll start with the readme file! |
Make read of in-memory PDF work with 3.x
- This closes issue #38 with code discussed there plus a regression test.
- Also add Python version 3.5 to regression tests
- Also add OSX filetypes to .gitignore
I have noticed that it is possible to make a PdfReader either by specifying a filename or file-like object, or by giving the data directly with the fdata argument. This is great; however, it doesn't work if I give it a BytesIO object, since the various functions in the following code only work with strings. For example, fdata.startswith('%PDF-') is called rather than fdata.startswith(b'%PDF-').

I can't immediately see an elegant way to solve this. Directly converting the data with str() produces assertion errors such as

'File "/usr/lib/python3.4/site-packages/pdfrw/pdfreader.py", line 319, in findxref assert tok == 'startxref' # (We just checked this...)'

with the files I have tried.
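The str/bytes mismatch reported here can be reproduced without pdfrw. The sketch below uses only the standard library, and the placeholder bytes are made up for illustration rather than taken from a real PDF. The Latin-1 decoding at the end is one possible way to get text from bytes without loss; it is shown as an assumption, not as the change that was applied in this thread.

```python
import io

# Stand-in for a PDF held in memory; real PDF data would be longer.
pdf_bytes = b"%PDF-1.4\n% placeholder body\nstartxref\n0\n%%EOF\n"
stream = io.BytesIO(pdf_bytes)
data = stream.read()  # bytes under Python 3

# Checking bytes against a str prefix raises TypeError in Python 3.
try:
    data.startswith("%PDF-")
    str_prefix_ok = True
except TypeError:
    str_prefix_ok = False

# A bytes prefix works.
bytes_prefix_ok = data.startswith(b"%PDF-")

# str() on bytes returns the repr ("b'...'"), not the decoded text, so
# later comparisons against tokens such as 'startxref' see different content.
as_repr = str(data)

# Assumption for illustration: Latin-1 maps each byte to exactly one code
# point, so decoding and re-encoding round-trips without loss.
as_text = data.decode("latin-1")
```

This is consistent with the assertion failure quoted above: after `str()`, the text begins with `b'` and contains escaped newlines, so a tokenizer would not find the tokens where it expects them.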