-
-
Notifications
You must be signed in to change notification settings - Fork 31.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-32222: Fix pygettext skipping docstrings for funcs with arg typehints #4745
bpo-32222: Fix pygettext skipping docstrings for funcs with arg typehints #4745
Conversation
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Unfortunately our records indicate you have not signed the CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. Thanks again to your contribution and we look forward to looking at it! |
Tools/i18n/pygettext.py
Outdated
@@ -5,7 +5,7 @@ | |||
# Minimally patched to make it even more xgettext compatible | |||
# by Peter Funk <pf@artcom-gmbh.de> | |||
# | |||
# 2002-11-22 J�rgen Hermann <jh@web.de> | |||
# 2002-11-22 J�rgen Hermann <jh@web.de> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unwanted changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now reverted.
Tools/i18n/pygettext.py
Outdated
return | ||
if ttype == tokenize.NAME and tstring in opts.keywords: | ||
self.__state = self.__keywordseen | ||
|
||
def __funcseen(self, ttype, tstring, lineno): | ||
# ignore anything until we see the closing parenthesis | ||
if ttype == tokenize.OP and tstring == ')': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the following example?
def foo(bar=()):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a very good point. I will rethink my approach
Tools/i18n/pygettext.py
Outdated
@@ -5,7 +5,7 @@ | |||
# Minimally patched to make it even more xgettext compatible | |||
# by Peter Funk <pf@artcom-gmbh.de> | |||
# | |||
# 2002-11-22 J�rgen Hermann <jh@web.de> | |||
# 2002-11-22 Jürgen Hermann <jh@web.de> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that the change here is a re-encoding from Latin-1 to UTF-8. If that is so, please remove the encoding line at the top of the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is better to not include any unrelated changes. If you want to change the encoding of the file, it should be done in a separate issue. But non-default encoding may be a part of testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed it was a change of encoding, thanks for pointing that out; my text editor seems to save files in UTF-8 by default. I've reverted the encoding to ISO 8859-1.
I've made a change to account for the following example, provided by serhiy: def foo(bar=()): This involves counting the number of nested parentheses until we see the closing parenthesis for the outermost pair (i.e. the parameter list parens). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Could you please add a test?
Any chance to support return annotations?
@serhiy-storchaka I am more than happy to add some tests, I'd just like to note that I cannot see any existing tests for pygettext's actual parsing behaviour in Lib/test/test_tools/test_i18n.py, so I believe these will be the first tests for this behaviour. From my own tests, return annotations were (and are, following this PR) already supported. There are only two cases where return annotations would cause problems, albeit I think they are very uncommon: # returning a sliced list
def foo(bar) -> List[1:2]:
...
# returning a lambda function
def foo(bar) -> lambda x: x: In my opinion both of the above examples, despite being valid syntax, don't really have a use case. |
Currently pygettext testing is fragmentary. All existing test were added just as regression tests for fixed bugs. There is the same case. We can handle most cases by counting parenthesis, brackets and spaces. def foo(bar) -> List[1:2]:
def foo(bar) -> {1: 2}:
def foo(bar) -> T(lambda x: x): Only the case with a lambda at upper level is a problem. But there is a simple workaround for this -- parenthesis around a lambda. |
I have added some tests, and also changed up the logic a bit. Now, when skipping over a function declaration, it will look for the first colon which isn't surrounded by parentheses, square brackets or braces (I've called them 'enclosures' as an umbrella term here). |
I think this is too much and too small. If we are sure that the parenthesis are balanced (as it is in a syntactically correct Python), we can just count any opening parenthesis as +1 and any closing parenthesis as -1. If we want to check the balance, we should support a stack of parenthesis instead of a simple count (or several counts), and check that any closing parenthesis matches an opening parenthesis. I think the simpler solution is enough. Parenthesis should also be counted after class C(L[1:2], F({1: 2}), metaclass=M(lambda x: x)): |
@serhiy-storchaka I'm not sure I understand; this PR does the simpler solution as you say, assuming that the parentheses are balanced, counting +1 when opening and -1 when closing. The only difference in this solution is that it also counts square brackets and braces. Are you suggesting I remove those two extra counts? Thank you for pointing out the |
I suggest to use a single integer counter for all three kinds of parenthesis: round parenthesis, brackets and braces. |
Of course, now I understand. I don't know why I thought creating three different counters was a good idea in the first place. Thank you for pointing this out |
The requested change has been made; this PR's changes are actually much simpler now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, all LGTM now.
And the last: add please your name in Misc/ACKS and append "Patch by ." to the news entry.
@@ -1,2 +1,3 @@ | |||
Fix pygettext not extracting docstrings for functions with type annotated | |||
arguments. | |||
Patch by Tobotimus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use your real name. And don't forgot a period at the end.
This is ready to merge |
Thanks @Tobotimus for the PR, and @serhiy-storchaka for merging it 🌮🎉.. I'm working now to backport this PR to: 3.6, 3.7. |
…ints (pythonGH-4745) (cherry picked from commit eee72d4) Co-authored-by: Tobotimus <Tobotimus@users.noreply.github.com>
GH-5915 is a backport of this pull request to the 3.7 branch. |
GH-5916 is a backport of this pull request to the 3.6 branch. |
…ints (pythonGH-4745) (cherry picked from commit eee72d4) Co-authored-by: Tobotimus <Tobotimus@users.noreply.github.com>
This changes the behaviour of the pygettext TokenEater when it sees a function definition; instead of searching for the colon at the end of the definition, it skips over everything until it sees the closing parenthesis at the end of the parameter list, then looks for the final colon.
https://bugs.python.org/issue32222