Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quoting issue on header Reply-To and other address headers #88803

Open
Abridbus mannequin opened this issue Jul 14, 2021 · 13 comments
Open

Quoting issue on header Reply-To and other address headers #88803

Abridbus mannequin opened this issue Jul 14, 2021 · 13 comments
Labels
3.9 only security fixes topic-email type-security A security issue

Comments

@Abridbus
Copy link
Mannequin

Abridbus mannequin commented Jul 14, 2021

BPO 44637
Nosy @warsaw, @bitdancer, @thehesiod, @Julien00859, @Abridbus
PRs
  • bpo-44637: Fix DBQuote mail header refold #29881
  • Files
  • Reply-To.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2021-07-14.12:44:57.370>
    labels = ['type-security', 'expert-email', '3.9']
    title = 'Quoting issue on header Reply-To and other address headers'
    updated_at = <Date 2022-01-24.16:37:33.769>
    user = 'https://github.com/Abridbus'

    bugs.python.org fields:

    activity = <Date 2022-01-24.16:37:33.769>
    actor = 'Julien Castiaux'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['email']
    creation = <Date 2021-07-14.12:44:57.370>
    creator = 'Abridbus'
    dependencies = []
    files = ['50149']
    hgrepos = []
    issue_num = 44637
    keywords = ['patch']
    message_count = 12.0
    messages = ['397478', '397480', '397529', '397545', '397546', '397555', '397563', '399390', '399393', '407545', '407901', '411492']
    nosy_count = 5.0
    nosy_names = ['barry', 'r.david.murray', 'thehesiod', 'Julien Castiaux', 'Abridbus']
    pr_nums = ['29881']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'security'
    url = 'https://bugs.python.org/issue44637'
    versions = ['Python 3.9']

    @Abridbus
    Copy link
    Mannequin Author

    Abridbus mannequin commented Jul 14, 2021

    Hello,

    When using as_string() on a Reply-To header like the following:
    msg['Reply-To'] = '"foo Research, Inc. Foofoo BarBar on Summer Special Friday: 0.50 days (2021-02-31)" <catchall@foobar.exchange.com>'

    The double quote disappear, which lead to wrong header value

    See attached file for example

    @Abridbus Abridbus mannequin added type-bug An unexpected behavior, bug, or error 3.9 only security fixes topic-email labels Jul 14, 2021
    @bitdancer
    Copy link
    Member

    There is definitely a problem here, though I see a different problem when I run it (AttributeError: 'Group' object has no attribute 'local_part', presumably because of the ':' not getting escaped correctly). I believe it applies to any address header, not just Reply-To. Unfortunately I don't have time to investigate the cause, at least right now. An interesting first step on diagnosing it might be to produce a minimal example: start deleting special characters from inside that quoted string until you find the one (or ones) that is triggering it.

    @Abridbus
    Copy link
    Mannequin Author

    Abridbus mannequin commented Jul 15, 2021

    Thanks David,

    Here is some other tests I ran
    Issuing:

    But:
    msg['Reply-To'] = '"foo Research Inc Foofoo BarBar on Summer Special Friday 050 days 20210231 " <catchall@foobar.exchange.com>'

    worked. It looks more related to the length of the name than the character used.

    @bitdancer
    Copy link
    Member

    Forget what I said about my different error, I made a mistake running the test script.

    Interesting. If it is related to the length of the name, then the problem is most likely in the folding algorithm, specifically in what happens when the "display-name" token is wrapped across lines. And indeed, if we clone the SMTP policy and set the max_line_len to 1000 in your sample script. it renders the header correctly.

    The problem here is that the surrounding quotation marks are added by the 'value' property of DisplayName, but that property isn't invoked when handling parts of the display name separately during mulit-line folding. I was always bothered by the handling of the quotation marks in the part of the parser and folder dealing with quoted strings, but I never hit on a better way to do it. This, unfortunately, is going to be non-trivial problem to solve. It is probably going to require an ugly hack in the folding code :(

    Really, the handling of quoted strings throughout the _header_value_parser code is...a hack :( There are probably other places where it breaks down during multi-line folding. If we are lucky the hack can just add special handling for the quoted-string token type in the folder. If we aren't it will get messier :(

    Glancing at the folder code (it's been a long time since I worked on it), one possible approach (not necessarily the best one) would be to mark the first and last sub-tokens in a quoted-string so that folder knows to put in a leading or trailing quote mark, respectively, during folding.

    @Julien00859
    Copy link
    Mannequin

    Julien00859 mannequin commented Jul 15, 2021

    Hello David,

    I'm working in the same company as Baptiste and I'm trying to solve the problem. The issue is indeed related to the folding algorithm, the DBQUOTE character is lost in the parse_tree AST thus when the folding algo split the children to find a sweat spot to split the line it doesn't re-introduce the DBQUOTE and instead inject the content of the BareQuotedString right away.

    I'm working on a fix which consist of adding two DBQUOTE, one at the beginning and one at the end, of the BareQuotedString token when it is created (_header_value_parser.py@get_bare_quoted_string()). I was inspired by how the angles < and > are injected around the AddrSpec token in a AngleAddr token.

    Right now my fix isn't correct, there are some unittest falling. I'm trying to get it working and hopefully get back to you with a nice pull-request :)

    Regards,
    Julien

    @Julien00859
    Copy link
    Mannequin

    Julien00859 mannequin commented Jul 15, 2021

    Update, it works fine with the compat32 policy

    @bitdancer
    Copy link
    Member

    Yes, compat32 uses a different parser and folder (the legacy ones), that have a lot of small bugs relative to the RFCs (which is why I rewrote it).

    @vstinner
    Copy link
    Member

    I change the issue type to security. The bug can be abused to send emails to the wrong email address.

    @vstinner vstinner added type-security A security issue and removed type-bug An unexpected behavior, bug, or error labels Aug 11, 2021
    @Julien00859
    Copy link
    Mannequin

    Julien00859 mannequin commented Aug 11, 2021

    Hello David, Victor,

    Thank you for the triage, it reminds me about this issue. David, the
    solution I tried last month was wrong, it was breaking (for good
    reasons) tons of unittests. It seems to me that there is indeed no other
    solution than to bloat the re-folding function a bit more and to fix the
    dbquotes there as your last email suggested.

    I agree with you that the code will be even messier, honestly I spent
    quite some time understanding the _refold_parse_tree function and I
    don't feel like patching it.

    Regards,

    On 11.08.21 14:57, STINNER Victor wrote:

    STINNER Victor <vstinner@python.org> added the comment:

    I change the issue type to security. The bug can be abused to send emails to the wrong email address.

    ----------
    nosy: +vstinner
    type: behavior -> security


    Python tracker <report@bugs.python.org>
    <https://bugs.python.org/issue44637\>


    @bitdancer bitdancer changed the title Quoting issue on header Reply-To Quoting issue on header Reply-To and other address headers Nov 30, 2021
    @bitdancer bitdancer changed the title Quoting issue on header Reply-To Quoting issue on header Reply-To and other address headers Nov 30, 2021
    @thehesiod
    Copy link
    Mannequin

    thehesiod mannequin commented Dec 2, 2021

    btw my work-around was to set maxheaderlen=sys.maxsize, worked for AWS SES at least

    @Julien00859
    Copy link
    Mannequin

    Julien00859 mannequin commented Dec 7, 2021

    Hello there,

    There is a pull-request on github, had to modify _refold_parse_tree but I could keep the diff quite small. It is properly tested and it is waiting a review :)

    We have a patch at work so it is *absolutely not* urgent, feel free to review it *anytime*. Since we are using the Ubuntu LTS version of python, we might be interested by a backport till 3.7, quite honestly I'm happy it was flag as a security issue :D

    @Julien00859
    Copy link
    Mannequin

    Julien00859 mannequin commented Jan 24, 2022

    Hello there,

    Friendly reminder that this issue is still open and that there is a pull request ready. We continue to face the issue in production and our customers are getting upset.

    Can you provide us a schedule when this issue will be addressed? So that we can decide either to wait our to start thinking about possible mitigations our side?

    Regards,
    Julien

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @Julien00859
    Copy link

    Because the proposed PR have been staled for over three months and that we continue to get angry emails from our customers complaining about this issue, we have deployed a fix in our own application. Our fix is to truncate headers so they fit on one line.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes topic-email type-security A security issue
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants