TypeError: can only concatenate list (not "str") to list #978

Closed
SaeedEY opened this issue Jun 12, 2022 · 5 comments
Labels
is-bug: From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF
needs-pdf: The issue needs a PDF file to show the problem
workflow-encryption: From a users perspective, encryption is the affected feature/workflow

Comments

@SaeedEY

SaeedEY commented Jun 12, 2022

I tried to read a password-protected PDF with the password 'D)445416D(}+587207(EIz|5994276' and write it back out as a decrypted file, and this error occurred.

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
Windows-10-***************

$ python -c "import PyPDF2;print(PyPDF2.__version__)"
2.1.0

Code

from PyPDF2 import PdfFileReader, PdfFileWriter

passw = 'D)445416D(}+587207(EIz|5994276'

def decrypt_pdf(input_path, output_path, password):
  with open(input_path, 'rb') as input_file, \
    open(output_path, 'wb') as output_file:
    reader = PdfFileReader(input_file)
    reader.decrypt(password)

    writer = PdfFileWriter()

    for i in range(reader.getNumPages()):
      writer.addPage(reader.getPage(i))

    writer.write(output_file)

if __name__ == '__main__':
  # example usage:
  decrypt_pdf('input_ecrypted.pdf', 'output_decrypted.pdf', passw)

Traceback (most recent call last):
  File "........\\test.py", line 34, in <module>
    decrypt_pdf('input_ecrypted.pdf', 'output_decrypted.pdf', passw)
  File "........\\test.py", line 23, in decrypt_pdf
    reader.decrypt(list(password))
  File "..............\Python310\lib\site-packages\PyPDF2\_reader.py", line 1617, in decrypt
    return self._decrypt(password)
  File "..............\Python310\lib\site-packages\PyPDF2\_reader.py", line 1661, in _decrypt
    user_password, key = self._authenticate_user_password(password)
  File "..............\Python310\lib\site-packages\PyPDF2\_reader.py", line 1714, in _authenticate_user_password
    U, key = _alg35(
  File "..............\Python310\lib\site-packages\PyPDF2\_security.py", line 194, in _alg35
    key = _alg32(password, rev, keylen, owner_entry, p_entry, id1_entry)
  File "..............\Python310\lib\site-packages\PyPDF2\_security.py", line 65, in _alg32
    password_bytes = b_((str_(password) + str_(_encryption_padding))[:32])
TypeError: can only concatenate list (not "str") to list
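
Note that the traceback shows `reader.decrypt(list(password))`, while the code posted above calls `reader.decrypt(password)`. The last frame concatenates the password with a padding string, and that concatenation is what fails when the password is a list. The following is a minimal sketch of that failure. `str_stand_in` and `PADDING` are hypothetical stand-ins written for illustration (assuming the helper passes non-bytes values through unchanged); they are not PyPDF2's actual code.

```python
def str_stand_in(value):
    """Hypothetical stand-in for the helper: decode bytes, pass the rest through."""
    if isinstance(value, bytes):
        return value.decode("latin-1")
    return value


# Placeholder padding, NOT the real PDF padding string.
PADDING = "\x00" * 32


def pad_password(password):
    # Same shape as the failing line: concatenate, then truncate to 32 characters.
    return (str_stand_in(password) + str_stand_in(PADDING))[:32]


def try_pad(password):
    """Return the padded password, or the error message if padding fails."""
    try:
        return pad_password(password)
    except TypeError as exc:
        return f"TypeError: {exc}"


result_str = try_pad("secret")          # a str works
result_list = try_pad(list("secret"))   # a list reproduces the reported error
print(result_list)
```

If this reading is right, passing the password as a plain `str` avoids this particular TypeError, although it says nothing about whether the file's encryption scheme is supported.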

PDF

I can't share the PDF, but here are the last five lines of the file:

<</Root 2051 0 R/ID [<0cf81d43b578d44309d457f38834559d><a5c2eb78d7b35c3781681ab80b4b0235>]/Encrypt 2057 0 R/Info 1 0 R/Size 2058>>
%b05a3-dc78d-4ea54-0ea50-5.4.3
startxref
1506230
%%EOF
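
The trailer dictionary above contains `/Encrypt 2057 0 R`, so the encryption dictionary is object 2057. As a rough sketch, an indirect reference like this can be pulled out of the trailer text with a regular expression. The helper name `find_indirect_ref` is made up for this example, and a regex is only a quick inspection aid, not a PDF parser.

```python
import re

# Trailer dictionary copied from the report above.
TRAILER = (
    "<</Root 2051 0 R/ID [<0cf81d43b578d44309d457f38834559d>"
    "<a5c2eb78d7b35c3781681ab80b4b0235>]/Encrypt 2057 0 R/Info 1 0 R/Size 2058>>"
)


def find_indirect_ref(trailer, key):
    """Return (object_number, generation) for '/key N G R', or None if absent."""
    pattern = r"/" + re.escape(key) + r"\s+(\d+)\s+(\d+)\s+R"
    match = re.search(pattern, trailer)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


encrypt_ref = find_indirect_ref(TRAILER, "Encrypt")
print(encrypt_ref)  # (2057, 0)
```

With that object number, searching the raw file for `2057 0 obj` is one way to locate the dictionary that describes the encryption scheme.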
@SaeedEY SaeedEY added the is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF label Jun 12, 2022
@MartinThoma
Member

Would you be so kind as to look for /CFM in the file?

I expect you will see something like this:

<< /CF << /StdCF << /AuthEvent /DocOpen /CFM /AESV3 /Length 32 >> >> /Filter /Standard /Length 256 /O <f8a1cd0f4a989b4114f4d1831d40eeede44885f7cd574b7d74b05ea74276253ad5d833ba2d2ddac3129cad731efcef60> /OE <583bce1f6132a4e73586f084652ff8214c099b779afde715aaf3e9daeb7f26c5> /P -4 /Perms <642a0991e390678578789948054269b6> /R 6 /StmF /StdCF /StrF /StdCF /U <6af059d739d8d9e68cef3e9439e45c02768a04d4f09707d91d83720c39271ccf6e91941dbd90f8af40c592db7800462f> /UE <8e2b29f2c6c540f25244286d2415e9aeaf2fb8eb16cb0ae9e8b66ca2f8bfc48f> /V 5 >>
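
For anyone inspecting such a dictionary by hand, the fields that identify the scheme are `/Filter`, `/V`, `/R`, and the `/CFM` of the crypt filter. Here is a small sketch that extracts them from the dictionary text with regular expressions. The function name `summarize_encrypt_dict` is hypothetical, the hex strings are left out of the sample for brevity, and this is an inspection aid rather than a real PDF parser.

```python
import re

# The example dictionary from above, with the long hex strings omitted.
ENCRYPT_DICT = (
    "<< /CF << /StdCF << /AuthEvent /DocOpen /CFM /AESV3 /Length 32 >> >> "
    "/Filter /Standard /Length 256 /P -4 /R 6 /StmF /StdCF /StrF /StdCF /V 5 >>"
)


def summarize_encrypt_dict(dict_text):
    """Pull out the entries that identify the encryption scheme."""

    def name_after(key):
        # Matches '/Key /Name' and returns 'Name'.
        match = re.search(r"/" + key + r"\s*/(\w+)", dict_text)
        return match.group(1) if match else None

    def int_after(key):
        # Matches '/Key 123' (or a negative number) and returns the integer.
        match = re.search(r"/" + key + r"\s+(-?\d+)", dict_text)
        return int(match.group(1)) if match else None

    return {
        "Filter": name_after("Filter"),
        "CFM": name_after("CFM"),
        "V": int_after("V"),
        "R": int_after("R"),
    }


summary = summarize_encrypt_dict(ENCRYPT_DICT)
print(summary)
```

A dictionary without a `/CF` entry simply yields `None` for `CFM`.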

There are a couple of encryption algorithms that we currently don't support. PR #749 is almost finished and will likely be merged this month. It will add more encryption/decryption support, but the latest algorithms will still be missing.

Alternatively, at least for the moment, you can remove the password with a tool that might support this encryption type: https://askubuntu.com/q/828720/10425 (QPDF would be my best guess, but I'm uncertain whether it supports the latest algorithms)
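
As a sketch of the QPDF route, the command can be built as an argument list and run from Python. The function names are made up for this example, and nothing in this thread confirms that QPDF can decrypt this particular file. Passing the arguments as a list avoids shell quoting problems with a password that contains characters such as `(`, `}` and `|`.

```python
import shutil
import subprocess


def build_qpdf_command(input_path, output_path, password):
    """Build the argument list for: qpdf --password=... --decrypt IN OUT."""
    return ["qpdf", f"--password={password}", "--decrypt", input_path, output_path]


def decrypt_with_qpdf(input_path, output_path, password):
    """Run qpdf if it is installed; raise if it is missing or fails."""
    if shutil.which("qpdf") is None:
        raise RuntimeError("qpdf was not found on PATH")
    # No shell is involved, so the password is passed through unmodified.
    subprocess.run(build_qpdf_command(input_path, output_path, password), check=True)


command = build_qpdf_command(
    "input_ecrypted.pdf",
    "output_decrypted.pdf",
    "D)445416D(}+587207(EIz|5994276",
)
print(command)
```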

@MartinThoma MartinThoma added workflow-encryption From a users perspective, encryption is the affected feature/workflow needs-change The PR/issue cannot be handled as issue and needs to be improved and removed is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF labels Jun 12, 2022
@SaeedEY
Author

SaeedEY commented Jun 12, 2022

Hi @MartinThoma, I could not find any of the keywords you mentioned, such as "CF", "StdCF", "AuthEvent", "DocOpen", or "AESV3", in this PDF file. However, I do know a few details about the encryption that may help you improve PyPDF2:

  • CipherMode: CBC
  • PaddingMode: PKCS7
  • BlockSize: 128
  • IV: 16 bytes
  • KeySize: 256

Feel free to ask me any further questions that would help track down the problem.
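
Of the parameters listed, PKCS7 padding is simple enough to sketch without a crypto library. This example assumes BlockSize is given in bits, so a block is 16 bytes (which matches the 16-byte IV). The function names are illustrative only, and the sketch covers padding alone, not the AES-CBC encryption itself.

```python
BLOCK_SIZE = 16  # assumption: BlockSize 128 means 128 bits, i.e. 16 bytes


def pkcs7_pad(data, block_size=BLOCK_SIZE):
    """Append N bytes of value N so the length is a multiple of block_size."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded, block_size=BLOCK_SIZE):
    """Strip PKCS7 padding, raising ValueError if the padding is malformed."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("input length is not a positive multiple of the block size")
    pad_len = padded[-1]
    if pad_len < 1 or pad_len > block_size:
        raise ValueError("invalid padding length byte")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("padding bytes do not match")
    return padded[:-pad_len]


padded = pkcs7_pad(b"abc")
print(len(padded), padded[-1])
```

An input that is already a whole number of blocks still gains a full block of padding, which is what lets the unpad step work without ambiguity.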

@exiledkingcc
Contributor

PR #749 can NOT deal with that.
This encryption algorithm is defined by the PDF 2.0 specification, but I can't find the specification document, so I left it unimplemented in PR #749.
Maybe I can figure it out later from the source code of other PDF tools.

@MartinThoma
Member

Ah, damn. I think I'll set up a GitHub organization funding page. If people / companies start supporting PyPDF2 financially, we could simply buy the PDF 2.0 standard :-/

Let's see.

@MartinThoma MartinThoma added is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF and removed needs-change The PR/issue cannot be handled as issue and needs to be improved labels Jun 26, 2022
@MartinThoma MartinThoma added the needs-pdf The issue needs a PDF file to show the problem label Jul 9, 2022
@exiledkingcc
Contributor

@MartinThoma this can be closed; it was fixed by #1015.
