-
-
Notifications
You must be signed in to change notification settings - Fork 30.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Quoting issue on header Reply-To and other address headers #88803
Comments
Hello, When using as_string() on a Reply-To header like the following: The double quote disappear, which lead to wrong header value See attached file for example |
There is definitely a problem here, though I see a different problem when I run it (AttributeError: 'Group' object has no attribute 'local_part', presumably because of the ':' not getting escaped correctly). I believe it applies to any address header, not just Reply-To. Unfortunately I don't have time to investigate the cause, at least right now. An interesting first step on diagnosing it might be to produce a minimal example: start deleting special characters from inside that quoted string until you find the one (or ones) that is triggering it. |
Thanks David, Here is some other tests I ran
But: worked. It looks more related to the length of the name than the character used. |
Forget what I said about my different error, I made a mistake running the test script. Interesting. If it is related to the length of the name, then the problem is most likely in the folding algorithm, specifically in what happens when the "display-name" token is wrapped across lines. And indeed, if we clone the SMTP policy and set the max_line_len to 1000 in your sample script. it renders the header correctly. The problem here is that the surrounding quotation marks are added by the 'value' property of DisplayName, but that property isn't invoked when handling parts of the display name separately during mulit-line folding. I was always bothered by the handling of the quotation marks in the part of the parser and folder dealing with quoted strings, but I never hit on a better way to do it. This, unfortunately, is going to be non-trivial problem to solve. It is probably going to require an ugly hack in the folding code :( Really, the handling of quoted strings throughout the _header_value_parser code is...a hack :( There are probably other places where it breaks down during multi-line folding. If we are lucky the hack can just add special handling for the quoted-string token type in the folder. If we aren't it will get messier :( Glancing at the folder code (it's been a long time since I worked on it), one possible approach (not necessarily the best one) would be to mark the first and last sub-tokens in a quoted-string so that folder knows to put in a leading or trailing quote mark, respectively, during folding. |
Hello David, I'm working in the same company as Baptiste and I'm trying to solve the problem. The issue is indeed related to the folding algorithm, the DBQUOTE character is lost in the parse_tree AST thus when the folding algo split the children to find a sweat spot to split the line it doesn't re-introduce the DBQUOTE and instead inject the content of the BareQuotedString right away. I'm working on a fix which consist of adding two DBQUOTE, one at the beginning and one at the end, of the BareQuotedString token when it is created (_header_value_parser.py@get_bare_quoted_string()). I was inspired by how the angles < and > are injected around the AddrSpec token in a AngleAddr token. Right now my fix isn't correct, there are some unittest falling. I'm trying to get it working and hopefully get back to you with a nice pull-request :) Regards, |
Update, it works fine with the compat32 policy |
Yes, compat32 uses a different parser and folder (the legacy ones), that have a lot of small bugs relative to the RFCs (which is why I rewrote it). |
I change the issue type to security. The bug can be abused to send emails to the wrong email address. |
Hello David, Victor, Thank you for the triage, it reminds me about this issue. David, the I agree with you that the code will be even messier, honestly I spent Regards, On 11.08.21 14:57, STINNER Victor wrote:
|
btw my work-around was to set maxheaderlen=sys.maxsize, worked for AWS SES at least |
Hello there, There is a pull-request on github, had to modify We have a patch at work so it is *absolutely not* urgent, feel free to review it *anytime*. Since we are using the Ubuntu LTS version of python, we might be interested by a backport till 3.7, quite honestly I'm happy it was flag as a security issue :D |
Hello there, Friendly reminder that this issue is still open and that there is a pull request ready. We continue to face the issue in production and our customers are getting upset. Can you provide us a schedule when this issue will be addressed? So that we can decide either to wait our to start thinking about possible mitigations our side? Regards, |
Because the proposed PR have been staled for over three months and that we continue to get angry emails from our customers complaining about this issue, we have deployed a fix in our own application. Our fix is to truncate headers so they fit on one line. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: