Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-28002: Roundtrip f-strings with ast.unparse better #19612

Merged
merged 32 commits into from
Nov 20, 2020

Conversation

hauntsaninja
Copy link
Contributor

@hauntsaninja hauntsaninja commented Apr 20, 2020

By attempting to avoid backslashes in f-string expressions.
We also now proactively raise errors for some backslashes we can't
avoid while unparsing FormattedValues

https://bugs.python.org/issue28002

Automerge-Triggered-By: GH:isidentical

@hauntsaninja
Copy link
Contributor Author

hauntsaninja commented Apr 20, 2020

Some notes:

  • I'm not particularly familiar with unicode intricacies, so it's possible there is a simpler, more robust or more performant way to do this.
  • Let me know if there are more tests that should be added.
  • Maybe we can skip-news, since ast.unparse is new in Python 3.9 anyway?
  • Part of this patch is based on @simonpercivall's work in astunparse. If there is credit-giving to be given, please let me know how he can be credited for that.

cc @isidentical

@hauntsaninja
Copy link
Contributor Author

The test failure in test_concurrent_futures on Windows x86 seems unrelated to this PR

@isidentical isidentical requested a review from pablogsal April 20, 2020 10:20
@hauntsaninja
Copy link
Contributor Author

hauntsaninja commented Apr 21, 2020

One issue with the PR as it stands is the perhaps undesirable unparsing of the following, probably reasonably common f-string:

>>> import ast
>>> print(ast.unparse(ast.parse(r'''f"{x}\n" ''')))

f'''{x}
'''

For this to pass a src roundtrip, we'd need to process the buffer a little more intelligently when self.avoid_backslashes is False.

@csabella csabella requested a review from ericvsmith April 23, 2020 12:04
@isidentical
Copy link
Member

isidentical commented May 20, 2020

@hauntsaninja would you rebase and (if you liked it) apply the suggestions. So we can move forward, and try to improve it. I have a script that I'll plan to test this PR against top PyPI packages, so we might find more problems / or verify that it works well at the end.

hauntsaninja and others added 2 commits May 21, 2020 13:04
By attempting to avoid backslashes in f-string expressions.
We also now proactively raise errors for some backslashes we can't
avoid while unparsing FormattedValues
Co-authored-by: Batuhan Taskaya <isidentical@gmail.com>
@isidentical isidentical added the needs backport to 3.9 only security fixes label May 22, 2020
@isidentical
Copy link
Member

FYI, this patch fails on roundtripping this file https://github.com/numpy/numpy/blob/master/tools/refguide_check.py

@isidentical
Copy link
Member

What about not including the \n into the exceptions list in should_use_repr?

@hauntsaninja
Copy link
Contributor Author

Thanks, updated to include your two comments!

I think including \n into the exception list is reasonable, and would fix the common f"{x}\n" case. There would be a regression in being able to unparse some of the test cases I added in test_fstrings_complicated, but those are less common, so it would still be the pragmatic thing to do.

Let me see if I can come up with a fix over the weekend, otherwise we can go ahead and make the should_use_repr change.

@isidentical
Copy link
Member

A gentle ping @hauntsaninja

@hauntsaninja hauntsaninja marked this pull request as draft June 3, 2020 01:38
The previous cosmetic fix means we use triple quotes less often, so this
is less likely to occur in practice, but should be fixed anyway.
@hauntsaninja
Copy link
Contributor Author

Thanks for the reminder!

Made changes that should result in better cosmetic unparsing of cases like f"{x}\n" and fixed issues where we'd struggle because we might need to escape the final quote within triple quotes (basically more complicated versions of """\""""). These changes also don't regress against any existing test cases added in this PR.

@hauntsaninja
Copy link
Contributor Author

@isidentical Thanks for taking another look! I made the changes as suggested, except for the ones I replied to

Copy link
Member

@isidentical isidentical left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pablogsal would you mind checking out this?

@isidentical
Copy link
Member

Thanks for the continuous work @hauntsaninja! If you have time, please try the latest version on the most downloaded PyPI packages set, if not please let me know (so I can test it by myself).

@isidentical
Copy link
Member

Also, you should be able to collapse my reviews (mark them as resolved) so that it won't distract people when reading the history.

@hauntsaninja
Copy link
Contributor Author

hauntsaninja commented Oct 18, 2020

The CI failure here looks like a flake. I re-ran AST roundtripping over 4k packages and everything was clear (apart from some pre-existing RecursionErrors)

@isidentical
Copy link
Member

A gentle ping @ericvsmith. Do you plan to take a look?

@hauntsaninja
Copy link
Contributor Author

hauntsaninja commented Nov 20, 2020

I'd love for this to make 3.9.1, if possible! :-)

@isidentical
Copy link
Member

closing and re-opening for the CI

@isidentical isidentical reopened this Nov 20, 2020
Copy link
Member

@ericvsmith ericvsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. This looks reasonable enough to me, but someone who's more familiar with the AST and/or unparsing should take a look. Maybe @ambv?

@isidentical
Copy link
Member

re-triggering the CI (some irrelevant macos failure)

@isidentical isidentical reopened this Nov 20, 2020
@isidentical isidentical merged commit a993e90 into python:master Nov 20, 2020
@bedevere-bot
Copy link

@isidentical: Please replace # with GH- in the commit message next time. Thanks!

@miss-islington
Copy link
Contributor

Thanks @hauntsaninja for the PR, and @isidentical for merging it 🌮🎉.. I'm working now to backport this PR to: 3.9.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Nov 20, 2020
By attempting to avoid backslashes in f-string expressions.
We also now proactively raise errors for some backslashes we can't
avoid while unparsing FormattedValues

Co-authored-by: hauntsaninja <>
Co-authored-by: Shantanu <hauntsaninja@users.noreply.github.com>
Co-authored-by: Batuhan Taskaya <isidentical@gmail.com>
(cherry picked from commit a993e90)

Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
@bedevere-bot bedevere-bot removed the needs backport to 3.9 only security fixes label Nov 20, 2020
@bedevere-bot
Copy link

GH-23430 is a backport of this pull request to the 3.9 branch.

@hauntsaninja hauntsaninja deleted the astunparse branch November 20, 2020 21:21
isidentical pushed a commit that referenced this pull request Nov 20, 2020
…-23430)

By attempting to avoid backslashes in f-string expressions.
We also now proactively raise errors for some backslashes we can't
avoid while unparsing FormattedValues

Co-authored-by: hauntsaninja <>
Co-authored-by: Shantanu <hauntsaninja@users.noreply.github.com>
Co-authored-by: Batuhan Taskaya <isidentical@gmail.com>
(cherry picked from commit a993e90)

Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>

Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
@isidentical
Copy link
Member

Thank you so much for your continuous efforts on this issue @hauntsaninja.

@hauntsaninja
Copy link
Contributor Author

Thank you for making this PR happen! :-)

adorilson pushed a commit to adorilson/cpython that referenced this pull request Mar 13, 2021
By attempting to avoid backslashes in f-string expressions.
We also now proactively raise errors for some backslashes we can't
avoid while unparsing FormattedValues

Co-authored-by: hauntsaninja <>
Co-authored-by: Shantanu <hauntsaninja@users.noreply.github.com>
Co-authored-by: Batuhan Taskaya <isidentical@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants