Scan fails on PDF file #56

@pombredanne

Scanning the file at https://www.broadcom.com/collateral/pg/5756M-PG101-R.pdf fails.
This is a bug in pdfminer: see euske/pdfminer#118. To reproduce:

wget https://www.broadcom.com/collateral/pg/5756M-PG101-R.pdf
python -c "from pdfminer.pdfparser import PDFParser;p=PDFParser(open('5756M-PG101-R.pdf','rb'));from pdfminer.pdfdocument import PDFDocument;PDFDocument(p)" 
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "[...]/local/lib/python2.7/site-packages/pdfminer/pdfdocument.py", line 575, in __init__
    self._initialize_password(password)
  File "[...]/local/lib/python2.7/site-packages/pdfminer/pdfdocument.py", line 598, in _initialize_password
    raise PDFEncryptionError('Unknown algorithm: param=%r' % param)
pdfminer.pdfdocument.PDFEncryptionError: Unknown algorithm: param={u'EncryptMetadata': False, u'CF': {u'StdCF': {u'Length': 16, u'CFM': /V2, u'AuthEvent': /DocOpen}}, u'O': '\xc6\xa4\xb4%\xed\xda\xe8\x7f&\xd2\x97\x840y\xc7\xbe!N\xdb\xfbw\x0f\x04\xb3iZTn\n\xc3\x93c', u'Filter': /Standard, u'P': -1324, u'Length': 128, u'R': 4, u'U': '\xf3\xa1\xeb\xa5\x19\x8a\x15%\x001\x13CenHO\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', u'V': 4, u'StmF': /StdCF, u'StrF': /StdCF}

Note that on Linux, running:

 wget https://www.broadcom.com/collateral/pg/5756M-PG101-R.pdf
 pdfseparate -f 1 -l 1 5756M-PG101-R.pdf  5756M-PG101-R-p1.pdf

creates a small, single-page PDF that fails in the same way as the full document.
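
For anyone triaging this, the encryption dictionary printed in the traceback can be read roughly as follows. This is only a sketch based on my reading of the PDF 1.7 encryption rules, not pdfminer code: `describe_encryption` and `CFM_ALGORITHMS` are hypothetical names, and PDF names such as /V2 are written as plain strings instead of pdfminer's literal objects.

```python
# Hypothetical helper, not part of pdfminer: describe the stream cipher
# implied by a PDF /Encrypt dictionary like the one in the traceback.

# Crypt filter methods (CFM) that can appear when V is 4.
CFM_ALGORITHMS = {
    "None": "no encryption (identity)",
    "V2": "RC4",
    "AESV2": "AES-128 (CBC)",
}


def describe_encryption(param):
    """Return a short description of the cipher used for streams."""
    v = param.get("V", 0)
    if v == 1:
        return "RC4, 40-bit key"
    if v == 2:
        # Length is the key length in bits and defaults to 40.
        return "RC4, %d-bit key" % param.get("Length", 40)
    if v == 4:
        # V=4 delegates to a named crypt filter listed under CF.
        filter_name = param.get("StmF", "Identity")
        if filter_name == "Identity":
            return "no encryption (identity)"
        crypt_filter = param.get("CF", {}).get(filter_name, {})
        method = crypt_filter.get("CFM", "None")
        return CFM_ALGORITHMS.get(method, "unknown method %s" % method)
    return "unsupported V=%r" % v


# The dictionary from the error message, minus the O and U password hashes.
ISSUE_PARAM = {
    "EncryptMetadata": False,
    "CF": {"StdCF": {"Length": 16, "CFM": "V2", "AuthEvent": "DocOpen"}},
    "Filter": "Standard",
    "P": -1324,
    "Length": 128,
    "R": 4,
    "V": 4,
    "StmF": "StdCF",
    "StrF": "StdCF",
}

print(describe_encryption(ISSUE_PARAM))
```

If this reading is right, the file uses V=4 with a StdCF crypt filter whose method is /V2, that is RC4, which is the combination the installed pdfminer rejects with "Unknown algorithm".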
