Closed
Description
I have a PDF from which PDFFileReader is unable to extract the text. Instead, this is the output:
u'\n˘ˇˆ˘ˇ˙˝˛˛˚˜ !!"#$%&"˝˛˝˘˛˘˛˚˙˘ˇ˝˛˘˛$\'(˘%˘ˇ˘ˆ˘)_)˛\'+,-)"˛./0"0!123˛"4˙"5)46)!6"˙˘˘˘,˘ˇˆ˙˙ˆ˝˛˚˜ !˘ˇˆ˙˝"" ˜#˝$˛˚˜ ˆ˙˝"" ˜ %˛˚˜ !˛˚ˇ!"#$%˘ˇ&ˆ˙˝˛˝ˆ˙&˚˝\'˛˚&\'()_ˇ+˙˝"" ˜#˝$˜#( ˛˚(ˇ+,˘˘˘ˇˆˆˆˇ,ˆ--ˆˇˇ˙˝˝% ˜)˜#_#˝$$˜ ˙ ˝_˛˚ˆ-&ˆ!ˆˇ&˘+$ˆ(˙˝+˚˜,!˛˚./&0ˆˆ+$ˆ(˙˝-˛-,&˘˝ˆ. ˚%˝% ˜)˜#\* ˜!˛˚&ˆˇ%ˆ!&(12+3ˇ˙˝,˜ˆ/˛˚%#"+3("ˆˇ.!ˆˇ43ˇ(˙-,&53ˇ6ˆˇ,˝˝% ˜)˜#\* ˜!˛˚(77777777777˜#( 0123& ˜"" ˜ %˛˚˜ 77777777777˜#( _ˆ_˛ ,4+#(56˝% ˜)˜#\* ˜!7 56 _˜ˆ( %!_ˆ_˛ ˆ˙&˚˝\'586"ˇ+((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((&\'()_&\'(_&\'()˘536((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((&\'&\' &\'˜ ˜˙˚ˆ-",ˇˆˇ!ˆ-ˆ,ˆ&ˆ!ˆˇ&53ˇ6ˆˇ,(˙˚&ˆ!-ˇ!6ˆˇ,˘ 8-ˇˆ-˙˝˝% ˜)˜#_ ˜!7 ˛˚(˙˚9ˇˇˆ-6ˆˇ,:;ˇˇˆ-<ˆˆ-ˇ&\' ,,˘˘ˇˇˆ-(9ˆˇˆ-!˘ˇˆ9˘ˆˇ˘˘(\n\n'
This is the output after Extract Text, and it does not throw any error message.
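Because no error is raised, the failure only shows up when the returned string is inspected. Below is a minimal sketch of such a check, not part of any PDF library: the helper name `looks_garbled` and the 0.5 threshold are assumptions chosen for illustration. It flags text in which too few visible characters are letters or digits. It is only a heuristic, since garbled output like the above can still contain some digits and letters.

```python
import unicodedata


def looks_garbled(text, min_alnum_ratio=0.5):
    """Heuristic: True when too few visible characters are letters or digits.

    Hypothetical helper; the threshold is an assumption, not a tuned value.
    """
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        # An empty result also counts as a failed extraction here.
        return True

    def is_real_alnum(ch):
        # Modifier letters (category "Lm", e.g. U+02C6, U+02C7) pass
        # str.isalnum(), but they are typical of the garbage shown above,
        # so they are excluded.
        return ch.isalnum() and unicodedata.category(ch) != "Lm"

    alnum = sum(1 for ch in visible if is_real_alnum(ch))
    return alnum / len(visible) < min_alnum_ratio
```

A caller could run this on the extracted string and fall back to another extraction tool when it returns True.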
A similar issue has been posted here:
http://stackoverflow.com/questions/15583535/how-to-extract-text-from-a-pdf-file-in-python
I am using Windows, so the solution in the link is not helpful.