AttributeError: 'unicode' object has no attribute 'seek' #14

tringger · 2014-04-22T12:24:47Z

Hi Virantha,

I really enjoy what you've built. Thank you very much. I'm reporting an issue I run into when I copy more than one PDF to the watch folder while using a config.yaml file. Below is the complete output from my command line, including the command. ImageMagick is installed: although the warning says it can't execute "identify", I can run "identify" from the command line without any issues. If I don't use a watch folder, pypdfocr processes a single file without any problem.

C:\Users\tringger006\Documents\ARCA\2014\OCR>pypdfocr -w C:\Users\tringger006\Documents\ARCA\2014\OCR -f -c config.yaml -d

Filing of PDFs is enabled

2 target filing folders

2 keywords

Starting to watch for new pdfs in C:\Users\tringger006\Documents\ARCA\2014\OCR
Starting conversion of C:\Users\tringger006\Documents\ARCA\2014\OCR\Document1.pdf

WARNING: Could not execute identify to calculate DPI (try installing imagemagick?), so defaulting to >300dpi
Completed conversion successfully to >C:\Users\tringger006\Documents\ARCA\2014\OCR\Document1_ocr.pdf
Traceback (most recent call last):
  File "C:\Python27\Scripts\pypdfocr-script.py", line 9, in <module>
    load_entry_point('pypdfocr==0.7.4', 'console_scripts', 'pypdfocr')()
  File "c:\Python27\lib\site-packages\pypdfocr\pypdfocr.py", line 411, in main
    script.go(sys.argv[1:])
  File "c:\Python27\lib\site-packages\pypdfocr\pypdfocr.py", line 396, in go
    filing = self.file_converted_file(ocr_pdffilename, pdf_filename)
  File "c:\Python27\lib\site-packages\pypdfocr\pypdfocr.py", line 331, in file_converted_file
    filed_path = self.pdf_filer.move_to_matching_folder(ocr_pdffilename)
  File "c:\Python27\lib\site-packages\pypdfocr\pypdfocr_pdffiler.py", line 65, in move_to_matching_folder
    for page_text in self.iter_pdf_page_text(filename):
  File "c:\Python27\lib\site-packages\pypdfocr\pypdfocr_pdffiler.py", line 42, in iter_pdf_page_text
    reader = PdfFileReader(filename)
  File "c:\Python27\lib\site-packages\PyPDF2\pdf.py", line 800, in __init__
    self.read(stream)
  File "c:\Python27\lib\site-packages\PyPDF2\pdf.py", line 1242, in read
    stream.seek(-1, 2)
AttributeError: 'unicode' object has no attribute 'seek'

Here are the contents of my config.yaml file, stored in the watch directory:

target_folder: "docs/filed"
default_folder: "docs/filed/manual_sort"
original_move_folder: "docs/originals"

folders:
    default_fees:
        - appraisal
    other:
        - text

Thanks so much, again.

virantha · 2014-04-23T20:38:34Z

Thanks for reporting this! Let me take a look at it...

nextechinc · 2014-07-18T23:22:53Z

I'm having the exact same problem.

virantha · 2014-08-18T20:16:23Z

I'm unable to reproduce this on Mac or Linux. Could you send me the PDF you're working with?

virantha · 2014-08-18T20:17:03Z

Also, please make sure you're running version 0.7.5

virantha · 2014-09-02T02:29:09Z

Closing this issue since I can't reproduce it.

virantha self-assigned this Apr 23, 2014

virantha closed this as completed Sep 2, 2014
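
For anyone who hits the same traceback: it shows `PdfFileReader` receiving a `unicode` path and calling `seek` on it as if it were a file object. One plausible reading, not confirmed in this thread, is that the watch-folder code path hands over a `unicode` filename that this PyPDF2 version does not recognise as a path. Below is a minimal sketch of a possible workaround, which opens the file in binary mode and passes the stream instead. The helper names `open_pdf_stream` and `last_byte` are hypothetical and are not part of pypdfocr or PyPDF2; `last_byte` only imitates the failing `seek(-1, 2)` call so the sketch runs without PyPDF2 installed.

```python
import io
import os
import tempfile


def open_pdf_stream(path_or_stream):
    """Return a seekable binary stream for a path or an already-open stream.

    Hypothetical helper for illustration; not part of pypdfocr or PyPDF2.
    """
    if hasattr(path_or_stream, "seek"):
        # Already a file-like object, so hand it back unchanged.
        return path_or_stream
    # io.open accepts both str and unicode paths on Python 2 and Python 3.
    return io.open(path_or_stream, "rb")


def last_byte(stream):
    """Imitate the call that fails in the traceback: seek to the last byte."""
    stream.seek(-1, 2)  # offset -1 relative to the end of the stream
    return stream.read(1)


if __name__ == "__main__":
    # Write a tiny stand-in file so the sketch runs without a real PDF.
    fd, path = tempfile.mkstemp(suffix=".pdf")
    os.write(fd, b"%PDF-1.4\n%%EOF")
    os.close(fd)

    stream = open_pdf_stream(u"" + path)  # force a unicode path on Python 2
    try:
        print(repr(last_byte(stream)))
    finally:
        stream.close()
        os.remove(path)
```

Since `PdfFileReader` accepts a file-like object as well as a path, the call in `iter_pdf_page_text` could in principle become `PdfFileReader(open_pdf_stream(filename))`, with the stream kept open while pages are read and closed afterwards. Whether this resolves the problem reported here is untested.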
