Queued Mail Delivery Sets Wrong Charset #41

Open
@treinhard

Sending an email via pyramid_mailer's send_to_queue() and then delivering it with the qp console script results in an incorrect charset of us-ascii in the Content-Type on Python 3.6.2.

The initial Message with Unicode content is correctly encoded as iso-8859-1, and the charset is set on the message:

>>> from pyramid_mailer.message import Message
>>> msg = Message(subject='Test', sender='test@seantis.ch', recipients=['test@seantis.ch'], body='Test französisches Email')
>>> msg = msg.to_message()
>>> msg.get_charset()
'iso-8859-1'
>>> msg.as_string()
'Content-Type: text/plain; charset="iso-8859-1"\nMIME-Version: 1.0\nContent-Transfer-Encoding: quoted-printable\nFrom: test@seantis.ch\nSubject: Test\nTo: test@seantis.ch\nContent-Disposition: inline\n\nTest=20franz=F6sisches=20Email'

The content is written to a file and later (during delivery) parsed in the QueueProcessor:

>>> from email.parser import Parser
>>> from io import StringIO
>>> msg = Parser().parse(StringIO(msg.as_string()))
>>> print(msg.get_charset())
None
>>> msg.get_content_charset()
'iso-8859-1'
>>> msg.as_string()
'Content-Type: text/plain; charset="iso-8859-1"\nMIME-Version: 1.0\nContent-Transfer-Encoding: quoted-printable\nFrom: test@seantis.ch\nSubject: Test\nTo: test@seantis.ch\nContent-Disposition: inline\n\nTest=20franz=F6sisches=20Email'

The message looks fine, except that msg.get_charset() now returns None. SMTPMailer.send() then runs this message through repoze.sendmail.encoding.cleanup_message(), which replaces the original message charset of iso-8859-1 with us-ascii:

>>> from repoze.sendmail.encoding import cleanup_message
>>> msg = cleanup_message(msg)
>>> msg.get_charset()
'us-ascii'
>>> msg.get_content_charset()
'us-ascii'
>>> msg.as_string()
'MIME-Version: 1.0\nContent-Transfer-Encoding: quoted-printable\nFrom: test@seantis.ch\nSubject: Test\nTo: test@seantis.ch\nContent-Disposition: inline\nContent-Type: text/plain; charset="us-ascii"\n\nTest=20franz=F6sisches=20Email'

The message.get_charset() call at https://github.com/repoze/repoze.sendmail/blob/master/repoze/sendmail/encoding.py#L74 returns None (because the charset is None after parsing the message from the file). The fallback on the following lines results in a us-ascii encoding because the message is already encoded.
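
The loss of the charset object on parsing can be reproduced with the standard library alone, without pyramid_mailer or repoze.sendmail. The following is a minimal sketch; the variable names are illustrative, and the message is built with email.message.Message directly rather than through pyramid_mailer:

```python
from email.message import Message
from email.parser import Parser
from io import StringIO

text = "Test französisches Email"

# Build an iso-8859-1 message; set_payload() with a charset calls set_charset().
original = Message()
original["Subject"] = "Test"
original.set_payload(text, "iso-8859-1")

# Round-trip through text, as happens when the queue file is written and re-read.
parsed = Parser().parse(StringIO(original.as_string()))

# The Charset object only exists after set_charset(); the parser never sets it.
charset_object = parsed.get_charset()

# The charset parameter of the Content-Type header survives the round trip.
header_charset = parsed.get_content_charset()

# The payload can still be decoded correctly using the header value.
decoded = parsed.get_payload(decode=True).decode(header_charset)

print(charset_object, header_charset, decoded)
```

After the round trip, only get_content_charset() still reports iso-8859-1, so any code that relies solely on get_charset() no longer sees the original charset.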

The result is a message with Content-Type: text/plain; charset="us-ascii" containing iso-8859-1 encoded content.
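
One possible direction for a fix is to fall back to the charset declared in the Content-Type header when get_charset() returns None. The sketch below only illustrates that idea; effective_charset is a hypothetical helper, not part of repoze.sendmail, and whether cleanup_message() should behave this way is an assumption:

```python
from email.message import Message
from email.parser import Parser
from io import StringIO


def effective_charset(message, default="us-ascii"):
    """Hypothetical helper: prefer the Charset object, then the header value."""
    charset = message.get_charset()
    if charset is not None:
        # Charset object set via set_charset(); report its output charset name.
        return charset.get_output_charset()
    # Parsed messages have no Charset object, so read the Content-Type parameter.
    return message.get_content_charset() or default


# A freshly built message and its parsed copy should report the same charset.
built = Message()
built.set_payload("Test französisches Email", "iso-8859-1")
reparsed = Parser().parse(StringIO(built.as_string()))

print(effective_charset(built))      # charset taken from the Charset object
print(effective_charset(reparsed))   # charset taken from the Content-Type header
print(effective_charset(Message()))  # no charset information at all
```

With a lookup like this, a message read back from the queue would keep its declared iso-8859-1 charset instead of being relabelled based on its already encoded, ASCII-only payload.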
