Encrypt/Decrypt Mailbox urls #198

Open
wants to merge 25 commits into
base: master
Changes from 2 commits
bb5fba8
FIX - Django settings fix to unicode
lologf Jan 16, 2019
c706e48
Merge pull request #1 from invisiblebits/STC-668__create-tickets-from…
lologf Jan 16, 2019
724e804
FIX - Fix default charset
lologf Jan 16, 2019
fe1367c
Merge pull request #2 from invisiblebits/STC-668__create-ticket-from-…
lologf Jan 16, 2019
b797216
Update models.py
lologf Jan 17, 2019
6ba66ed
Merge pull request #3 from invisiblebits/FIX_unicode_subject_message
lologf Jan 17, 2019
3f22f48
Update models.py
lologf Jan 18, 2019
b390b28
Update models.py
lologf Jan 18, 2019
fd586e8
Merge pull request #4 from invisiblebits/FIX_remove-single-quotes-in-…
lologf Jan 18, 2019
72582ab
Update admin.py
lologf Apr 9, 2019
11a05a8
Encrypt, decrypt and padding in model methods
lologf Apr 9, 2019
850f86a
Pycryto added in requirements
lologf Apr 9, 2019
e073a8c
Remove URI from list_display
lologf Apr 9, 2019
cd1e4de
Help text in uri form
lologf Apr 9, 2019
6f23c59
Decrypt URI in _protocol_info() method
lologf Apr 9, 2019
66f1254
Merge pull request #5 from invisiblebits/encrypt-uri
lologf Apr 9, 2019
934c799
Update setup.py
lologf Apr 9, 2019
0d5f395
Fix decrypt_uri() and _protocol_info() methods
lologf Apr 12, 2019
6c6b23c
Update models.py
lologf May 24, 2019
edc27af
Merge pull request #6 from invisiblebits/lologf-patch-raw-unicode
lologf May 24, 2019
e713b5c
Update models.py
lologf May 24, 2019
6136dea
Update models.py
lologf May 24, 2019
204d67a
Update models.py
lologf May 24, 2019
625e166
Update __init__.py
lologf Jun 25, 2019
be3163e
Remove fix UTF-8
lologf Oct 31, 2019
2 changes: 1 addition & 1 deletion django_mailbox/models.py
@@ -356,7 +356,7 @@ def _process_message(self, message):
msg.subject = (
utils.convert_header_to_unicode(unicode(message['subject']).decode('utf-8'))[0:255]
Owner

This is a rather surprising change; could you elaborate on how this helps, exactly?

Author

@lologf lologf Jan 19, 2019

I am working on an app with Python 2.7, Django 1.11, and a production database that uses the 'utf8' charset, and I need django-mailbox to receive emails. If an email has an emoji in its subject, Django raises an OperationalError. I cannot change the production character set to 'utf8mb4'. This fix (I don't know another way to do it; perhaps in utils.convert_header_to_unicode?) allows receiving emails with emojis on Django 1.11 and Python 2.7 with the utf8 charset and collation.

Before this fix: Django raises an OperationalError.
After this fix: the email subject with Unicode emojis is stored as "Resume of your a\xc3\xb1o with \xf0\x9f\x9a\x80".
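
A minimal Python 3 sketch of why the emoji triggers the error, assuming the database is MySQL or MariaDB, whose legacy 'utf8' charset stores at most 3 bytes per character. The helper names `utf8_width` and `fits_mysql_utf8` are hypothetical and are not part of django-mailbox.

```python
def utf8_width(char):
    """Number of bytes needed to encode a single character in UTF-8."""
    return len(char.encode("utf-8"))


def fits_mysql_utf8(text):
    """True if every character needs at most 3 UTF-8 bytes.

    Assumption: the column uses the legacy 3-byte 'utf8' charset.
    """
    return all(utf8_width(ch) <= 3 for ch in text)


subject = "Resume of your a\u00f1o with \U0001F680"
print(utf8_width("\u00f1"))      # 2 (the n with tilde fits)
print(utf8_width("\U0001F680"))  # 4 (the rocket emoji does not)
print(fits_mysql_utf8(subject))  # False
```

A check like this only detects the problem; it does not decide what to store instead.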

Owner

Oh, I understand that you believe this fixes the issue you're encountering, but what I meant was: how, specifically, does the above change help? Consider this:

Either message['subject'] is a unicode object or it is bytes. Taking your example emoji of 🚀, the two cases look like this:

If it's bytes:

value = unicode('\xf0\x9f\x9a\x80')
# Will raise the following exception:
# Traceback (most recent call last):
#  File "<stdin>", line 1, in <module>
# UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 0: ordinal not in range(128)

If it's unicode:

value = unicode(u'\U0001f680')
# Now let's try running 'decode'
value.decode('utf-8')
# Will raise the following exception:
# Traceback (most recent call last):
#  File "<stdin>", line 1, in <module>
#  File "/var/www/envs/latestrevision/lib/python2.7/encodings/utf_8.py", line 16, in decode
#    return codecs.utf_8_decode(input, errors, True)
# UnicodeEncodeError: 'ascii' codec can't encode character u'\U0001f680' in position 0: ordinal not in range(128)

There are a couple things to be learned from the above:

  • Using unicode without supplying an encoding will attempt to interpret the provided string using your default encoding (sys.getdefaultencoding()). In most people's cases, that encoding is going to be ascii, and that is certainly not going to work for codepoints above 127.
  • decode is intended for converting bytes into unicode objects, not for converting unicode objects into anything at all. When you run decode on a unicode object, you're actually asking Python to first encode your object using your default encoding, and then to decode those bytes using the encoding you've selected. This is also not going to help you get the result you want, but it is one of the more common misunderstandings of how unicode and bytes objects work in Python.
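
The two points above carry over to Python 3, where the types are named bytes and str. A minimal sketch of decoding with an explicit encoding rather than relying on a default; the helper name `to_text` is hypothetical and is not part of django-mailbox.

```python
def to_text(value, encoding="utf-8"):
    """Return text, decoding only when the input is bytes."""
    if isinstance(value, bytes):
        # Always name the encoding; never rely on an implicit default.
        return value.decode(encoding)
    return value  # already text, so there is nothing to decode


rocket_bytes = b"\xf0\x9f\x9a\x80"

# Decoding as ASCII fails for any byte above 127, as in the first traceback.
try:
    rocket_bytes.decode("ascii")
except UnicodeDecodeError as exc:
    print("ascii failed:", exc.reason)

# Decoding with the right encoding works, and text passes through unchanged.
assert to_text(rocket_bytes) == "\U0001F680"
assert to_text("\U0001F680") == "\U0001F680"
```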

Author

@lologf lologf Jan 22, 2019

Okay. If I explain how I got to this point we can better understand the solution to the problem.

I use "(message['subject']).decode('utf-8')" to force the utf-8 encoding, which is the encoding configured by default in my Django app and in my production database.

I thought that the variable 'DJANGO_MAILBOX_default_charset' returned by utils.get_settings() could help me, but I saw that, because it is lowercase, Django does not detect it as a setting. I made a fix to capitalize it and force 'default_charset' to utf-8, but it still gave the same OperationalError.

I read several articles saying that I had to change all the tables and columns of the production database to 'utf8mb4', since emojis take 4 bytes to encode in UTF-8.

But I cannot change that encoding in my production database, and I do not mind if the emoji is represented as bytes in the subject.

My intention is to use django-mailbox to automate actions when receiving emails, and I do not mind if the emoji is not represented correctly. What I want is for Django not to raise an OperationalError when the encoding is not 'utf8mb4'.

I understand that this conversion from header to unicode should be done by the function utils.convert_header_to_unicode(), but I made the fix in models.Mailbox._process_message() as a workaround.

The decode returns a string "'=?Utf-8?Bxxxxxxxxxxxx ...'", which is a MIME-encoded header. This string is converted to a readable string with "email.header.decode_header(msg.subject)".
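
For reference, a minimal Python 3 sketch of decoding such a MIME encoded-word header with the standard library's email.header module. The sample subject is made up for illustration, and the thread itself concerns Python 2.7, where the string types differ.

```python
from email.header import Header, decode_header, make_header

# Build an RFC 2047 encoded-word subject so the example is self-contained.
raw_subject = Header("Resume \U0001F680", "utf-8").encode()

# decode_header() returns (value, charset) pairs; make_header() joins them
# back into a Header whose str() is the decoded text.
parts = decode_header(raw_subject)
subject = str(make_header(parts))

print(raw_subject)  # ASCII-only encoded word
print(parts)        # list of (bytes, charset) pairs
print(subject == "Resume \U0001F680")
```

Note that indexing `[0][0]` into the result of decode_header() keeps only the first fragment and discards its charset, so a header made of several encoded words would be truncated.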

So my question at this point is: is there any way to use django-mailbox without the 'utf8mb4' encoding in the production database when I receive an email containing an emoji? Thanks for everything.
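
One possible direction, sketched in Python 3 and not taken from django-mailbox: escape every character above U+FFFF before saving, so that only characters needing at most 3 UTF-8 bytes reach the database. The helper name `escape_non_bmp` is hypothetical, and where it would be called (for example just before msg.subject is assigned) is an assumption.

```python
import re

# Matches any code point above U+FFFF, i.e. anything needing 4 UTF-8 bytes.
_NON_BMP = re.compile("[\U00010000-\U0010FFFF]")


def escape_non_bmp(text):
    """Replace 4-byte characters with an ASCII escape such as \\U0001f680."""
    return _NON_BMP.sub(
        lambda m: m.group().encode("unicode_escape").decode("ascii"), text
    )


subject = "Resume of your a\u00f1o with \U0001F680"
safe = escape_non_bmp(subject)

# The accented letter survives; the emoji becomes a readable ASCII escape.
assert safe == "Resume of your a\u00f1o with \\U0001f680"
assert all(len(ch.encode("utf-8")) <= 3 for ch in safe)
```

Unlike wrapping the whole subject in repr(), this leaves characters such as ñ readable and only alters the characters the 3-byte charset cannot hold.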

)
- msg.subject = repr(email.header.decode_header(msg.subject)[0][0])
+ msg.subject = repr(email.header.decode_header(msg.subject)[0][0]).replace("'", "")
if 'message-id' in message:
msg.message_id = message['message-id'][0:255].strip()
if 'from' in message: