Hi Virantha,
I really enjoy what you've built. Thank you very much. I'm reporting an issue I run into when I copy more than one PDF to the watch folder while using a config.yaml file. Here is the complete output from my command line, including the command. ImageMagick is installed: even though the warning says it could not execute "identify", I can run "identify" from the command line without any issues. Also, if I don't use a watch folder, I can run pypdfocr on a single file without any issue.
Starting to watch for new pdfs in C:\Users\tringger006\Documents\ARCA\2014\OCR
Starting conversion of C:\Users\tringger006\Documents\ARCA\2014\OCR\Document1.pdf
WARNING: Could not execute identify to calculate DPI (try installing imagemagick?), so defaulting to 300dpi
Completed conversion successfully to C:\Users\tringger006\Documents\ARCA\2014\OCR\Document1_ocr.pdf
Traceback (most recent call last):
  File "C:\Python27\Scripts\pypdfocr-script.py", line 9, in <module>
    load_entry_point('pypdfocr==0.7.4', 'console_scripts', 'pypdfocr')()
  File "c:\Python27\lib\site-packages\pypdfocr\pypdfocr.py", line 411, in main
    script.go(sys.argv[1:])
  File "c:\Python27\lib\site-packages\pypdfocr\pypdfocr.py", line 396, in go
    filing = self.file_converted_file(ocr_pdffilename, pdf_filename)
  File "c:\Python27\lib\site-packages\pypdfocr\pypdfocr.py", line 331, in file_converted_file
    filed_path = self.pdf_filer.move_to_matching_folder(ocr_pdffilename)
  File "c:\Python27\lib\site-packages\pypdfocr\pypdfocr_pdffiler.py", line 65, in move_to_matching_folder
    for page_text in self.iter_pdf_page_text(filename):
  File "c:\Python27\lib\site-packages\pypdfocr\pypdfocr_pdffiler.py", line 42, in iter_pdf_page_text
    reader = PdfFileReader(filename)
  File "c:\Python27\lib\site-packages\PyPDF2\pdf.py", line 800, in __init__
    self.read(stream)
  File "c:\Python27\lib\site-packages\PyPDF2\pdf.py", line 1242, in read
    stream.seek(-1, 2)
AttributeError: 'unicode' object has no attribute 'seek'
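For what it's worth, the last frames of the traceback suggest that the filename reaches PdfFileReader as a unicode string and is then treated as if it were an open file, since seek() is called on it directly. That is only my reading of the traceback, not something I have checked against the PyPDF2 source. A minimal sketch of a possible workaround, assuming the reader also accepts an already opened binary stream, would be to open the path before handing it over. The helper name as_binary_stream is hypothetical and is not part of pypdfocr or PyPDF2:

```python
import os
import tempfile

try:
    string_types = (str, unicode)  # Python 2: byte strings and unicode strings
except NameError:
    string_types = (str,)  # Python 3: str is already unicode


def as_binary_stream(path_or_stream):
    """Hypothetical helper: return something that supports seek() and read().

    A path given as a str or unicode string is opened in binary mode;
    anything else is assumed to be an open stream and is returned unchanged.
    """
    if isinstance(path_or_stream, string_types):
        return open(path_or_stream, "rb")
    return path_or_stream


# Small demonstration with a throwaway file standing in for a real PDF.
fd, demo_path = tempfile.mkstemp(suffix=".pdf")
with os.fdopen(fd, "wb") as handle:
    handle.write(b"%PDF-1.4\n%%EOF")

stream = as_binary_stream(u"" + demo_path)  # unicode path, as in the traceback
stream.seek(-1, 2)  # the same call that fails on a bare unicode string
last_byte = stream.read(1)
stream.close()
os.remove(demo_path)
```

In iter_pdf_page_text the call would then presumably become PdfFileReader(as_binary_stream(filename)), with the stream closed once the pages have been read. I have not tested this inside pypdfocr itself.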
Here are the contents of my config.yaml file, stored in the watch directory:
Thanks so much, again.