bpo-32222: Fix pygettext skipping docstrings for funcs with arg typehints #4745

Tobotimus · 2017-12-07T09:31:53Z

This changes the behaviour of the pygettext TokenEater when it sees a function definition; instead of searching for the colon at the end of the definition, it skips over everything until it sees the closing parenthesis at the end of the parameter list, then looks for the final colon.
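To illustrate the underlying problem, here is a minimal sketch (not code from this PR) that lists where the `:` operator tokens fall in a function header. The helper name `colon_positions` is hypothetical; only the standard `tokenize` module is used.

```python
import io
import tokenize


def colon_positions(source):
    """Return the column of every ':' operator token in *source*."""
    readline = io.StringIO(source).readline
    return [
        tok.start[1]
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.OP and tok.string == ':'
    ]


src = 'def foo(bar: str):\n    """A docstring."""\n'
cols = colon_positions(src)
```

For this header the first colon (column 11) belongs to the annotation, not to the end of the definition (column 17), so a scanner that stops at the first colon starts looking for the docstring too early.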

https://bugs.python.org/issue32222

the-knights-who-say-ni · 2017-12-07T09:31:55Z

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately our records indicate you have not signed the CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

Thanks again for your contribution and we look forward to looking at it!

serhiy-storchaka · 2017-12-07T09:41:47Z

Tools/i18n/pygettext.py

@@ -5,7 +5,7 @@
 # Minimally patched to make it even more xgettext compatible
 # by Peter Funk <pf@artcom-gmbh.de>
 #
-# 2002-11-22 J�rgen Hermann <jh@web.de>
+# 2002-11-22 J�rgen Hermann <jh@web.de>


Unwanted changes.

Now reverted.

serhiy-storchaka · 2017-12-07T09:43:05Z

Tools/i18n/pygettext.py

                return
        if ttype == tokenize.NAME and tstring in opts.keywords:
            self.__state = self.__keywordseen

+    def __funcseen(self, ttype, tstring, lineno):
+        # ignore anything until we see the closing parenthesis
+        if ttype == tokenize.OP and tstring == ')':


What about the following example?

def foo(bar=()):

That's a very good point. I will rethink my approach.

merwok · 2017-12-07T16:29:32Z

Tools/i18n/pygettext.py

@@ -5,7 +5,7 @@
 # Minimally patched to make it even more xgettext compatible
 # by Peter Funk <pf@artcom-gmbh.de>
 #
-# 2002-11-22 J�rgen Hermann <jh@web.de>
+# 2002-11-22 Jürgen Hermann <jh@web.de>


I assume that the change here is a re-encoding from Latin-1 to UTF-8. If that is so, please remove the encoding line at the top of the file.

I think it is better to not include any unrelated changes. If you want to change the encoding of the file, it should be done in a separate issue. But non-default encoding may be a part of testing.

Indeed it was a change of encoding, thanks for pointing that out; my text editor seems to save files in UTF-8 by default. I've reverted the encoding to ISO 8859-1.

Tobotimus · 2017-12-07T22:13:16Z

I've made a change to account for the following example, provided by serhiy:

def foo(bar=()):

This involves counting the number of nested parentheses until we see the closing parenthesis for the outermost pair (i.e. the parameter list parens).
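A rough sketch of that nesting count, assuming balanced parentheses and a source string that starts with a `def` statement. The function name `param_list_end` is hypothetical and this is not the code in the PR.

```python
import io
import tokenize


def param_list_end(source):
    """Return the (row, col) of the ')' that closes the parameter list.

    Assumes *source* starts with a 'def' statement and that its
    parentheses are balanced.
    """
    depth = 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.OP:
            continue
        if tok.string == '(':
            depth += 1
        elif tok.string == ')':
            depth -= 1
            if depth == 0:
                # Back at the outermost level: this closes the parameter list.
                return tok.start
    return None


end = param_list_end('def foo(bar=()):\n    pass\n')
```

Here the inner `()` of the default value raises and lowers the depth without reaching zero, so only the outer closing parenthesis is reported.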

serhiy-storchaka

Nice. Could you please add a test?

Any chance to support return annotations?

Tobotimus · 2017-12-08T08:37:17Z

@serhiy-storchaka I am more than happy to add some tests, I'd just like to note that I cannot see any existing tests for pygettext's actual parsing behaviour in Lib/test/test_tools/test_i18n.py, so I believe these will be the first tests for this behaviour.

From my own tests, return annotations were (and are, following this PR) already supported. There are only two cases where return annotations would cause problems, although I think they are very uncommon:

# returning a sliced list
def foo(bar) -> List[1:2]:
    ...

# returning a lambda function
def foo(bar) -> lambda x: x:
    ...

In my opinion both of the above examples, despite being valid syntax, don't really have a use case.

serhiy-storchaka · 2017-12-08T09:22:00Z

Currently pygettext testing is fragmentary. All existing tests were added as regression tests for fixed bugs. This is the same case.

We can handle most cases by counting parentheses, brackets and braces.

def foo(bar) -> List[1:2]:
def foo(bar) -> {1: 2}:
def foo(bar) -> T(lambda x: x):

Only a lambda at the top level is a problem, and there is a simple workaround for it: put parentheses around the lambda.
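The bracket-counting idea can be sketched against these headers as follows. This is an illustration under the assumption of balanced brackets, with a hypothetical helper name `header_colon`; it is not taken from the PR.

```python
import io
import tokenize

OPENERS = {'(', '[', '{'}
CLOSERS = {')', ']', '}'}


def header_colon(source):
    """Return the column of the first ':' that is not inside brackets."""
    depth = 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.OP:
            continue
        if tok.string in OPENERS:
            depth += 1
        elif tok.string in CLOSERS:
            depth -= 1
        elif tok.string == ':' and depth == 0:
            return tok.start[1]
    return None


headers = [
    'def foo(bar) -> List[1:2]:',
    'def foo(bar) -> {1: 2}:',
    'def foo(bar) -> T(lambda x: x):',
    'def foo(bar) -> lambda x: x:',
    'def foo(bar) -> (lambda x: x):',
]
# True where the reported colon is the one that ends the header.
found_final_colon = [
    header_colon(header + '\n    pass\n') == len(header) - 1
    for header in headers
]
```

Only the bare top-level lambda is misread, because its own colon sits at depth zero; wrapping it in parentheses hides that colon from the scan.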

Tobotimus · 2017-12-13T12:57:26Z

I have added some tests, and also changed up the logic a bit.

Now, when skipping over a function declaration, it will look for the first colon which isn't surrounded by parentheses, square brackets or braces (I've called them 'enclosures' as an umbrella term here).

serhiy-storchaka · 2017-12-13T20:14:27Z

I think this is both too much and too little. If we are sure that the parentheses are balanced (as they are in syntactically correct Python), we can just count any opening parenthesis as +1 and any closing parenthesis as -1. If we want to check the balance, we should keep a stack of parentheses instead of a simple count (or several counts), and check that each closing parenthesis matches its opening parenthesis. I think the simpler solution is enough.

Parentheses should also be counted after class:

class C(L[1:2], F({1: 2}), metaclass=M(lambda x: x)):
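For comparison, the two options described in this comment could look roughly like the sketch below. Both helpers (`final_depth`, `brackets_match`) are hypothetical and work on a plain list of token strings rather than on pygettext's token stream.

```python
OPENERS = {'(', '[', '{'}
CLOSER_TO_OPENER = {')': '(', ']': '[', '}': '{'}


def final_depth(tokens):
    """Simple count: +1 for any opener, -1 for any closer."""
    depth = 0
    for tok in tokens:
        if tok in OPENERS:
            depth += 1
        elif tok in CLOSER_TO_OPENER:
            depth -= 1
    return depth


def brackets_match(tokens):
    """Stack check: every closer must match the most recent opener."""
    stack = []
    for tok in tokens:
        if tok in OPENERS:
            stack.append(tok)
        elif tok in CLOSER_TO_OPENER:
            if not stack or stack.pop() != CLOSER_TO_OPENER[tok]:
                return False
    return not stack


# The class header above, split into tokens by hand.
HEADER_TOKENS = (
    'class C ( L [ 1 : 2 ] , F ( { 1 : 2 } ) , '
    'metaclass = M ( lambda x : x ) ) :'
).split()
# Not valid Python: the closers are in the wrong order.
MISMATCHED = ['(', '[', ')', ']']
```

The simple count cannot tell `MISMATCHED` apart from a well-formed sequence, since it also ends at zero; the stack can. For input that is already known to be valid Python, the count is sufficient.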

Tobotimus · 2017-12-13T21:45:45Z

@serhiy-storchaka I'm not sure I understand; this PR does the simpler solution as you say, assuming that the parentheses are balanced, counting +1 when opening and -1 when closing. The only difference in this solution is that it also counts square brackets and braces. Are you suggesting I remove those two extra counts?

Thank you for pointing out the class situation, I will fix that up.

serhiy-storchaka · 2017-12-13T23:24:04Z

I suggest using a single integer counter for all three kinds of brackets: parentheses, square brackets and braces.
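A minimal sketch of a docstring scanner built around one such counter is shown below. It is a simplified stand-in for illustration, not pygettext's `TokenEater`; the function name `extract_docstrings`, the state names and the sample source are all assumptions.

```python
import ast
import io
import tokenize

OPENERS = {'(', '[', '{'}
CLOSERS = {')', ']', '}'}
SKIPPABLE = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
             tokenize.DEDENT, tokenize.COMMENT}


def extract_docstrings(source):
    """Collect docstrings of functions and classes in *source*.

    Assumes syntactically valid Python, so one counter is enough to
    tell whether a ':' is inside brackets.
    """
    docstrings = []
    state = 'waiting'
    depth = 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if state == 'suite':
            if tok.type in SKIPPABLE:
                continue
            state = 'waiting'
            if tok.type == tokenize.STRING:
                docstrings.append(ast.literal_eval(tok.string))
                continue
            # Not a docstring: fall through so a nested def/class is seen.
        if state == 'waiting':
            if tok.type == tokenize.NAME and tok.string in ('def', 'class'):
                state, depth = 'header', 0
        elif state == 'header' and tok.type == tokenize.OP:
            if tok.string in OPENERS:
                depth += 1
            elif tok.string in CLOSERS:
                depth -= 1
            elif tok.string == ':' and depth == 0:
                state = 'suite'
    return docstrings


SAMPLE = '''
def annotated(bar: str, baz=()) -> List[1:2]:
    """annotated doc"""

class C(L[1:2], F({1: 2}), metaclass=M(lambda x: x)):
    """class doc"""

    def method(self, x: int = 0):
        """method doc"""
'''
docs = extract_docstrings(SAMPLE)
```

Because the source is only tokenized and never executed, undefined names such as `List` or `M` in the sample do not matter.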

Tobotimus · 2017-12-13T23:42:36Z

Of course, now I understand. I don't know why I thought creating three different counters was a good idea in the first place. Thank you for pointing this out.

Tobotimus · 2017-12-14T02:37:20Z

The requested change has been made; this PR's changes are actually much simpler now.

serhiy-storchaka

Thank you, all LGTM now.

One last thing: please add your name to Misc/ACKS and append "Patch by ." to the news entry.

serhiy-storchaka · 2017-12-15T07:29:49Z

Misc/NEWS.d/next/Tools-Demos/2017-12-07-20-51-20.bpo-32222.hPBcGT.rst

@@ -1,2 +1,3 @@
 Fix pygettext not extracting docstrings for functions with type annotated
 arguments.
+Patch by Tobotimus


Please use your real name. And don't forget the period at the end.

Tobotimus · 2018-01-23T23:25:56Z

This is ready to merge

miss-islington · 2018-02-26T22:48:16Z

Thanks @Tobotimus for the PR, and @serhiy-storchaka for merging it 🌮🎉.. I'm working now to backport this PR to: 3.6, 3.7.
🐍🍒⛏🤖

bedevere-bot · 2018-02-26T22:49:27Z

GH-5915 is a backport of this pull request to the 3.7 branch.

bedevere-bot · 2018-02-26T22:50:26Z

GH-5916 is a backport of this pull request to the 3.6 branch.

Fix pygettext skipping docstrings for funcs with arg typehints

00405d2

the-knights-who-say-ni added the CLA not signed label Dec 7, 2017

bedevere-bot added the awaiting review label Dec 7, 2017

Add news entry

09abc0e

serhiy-storchaka reviewed Dec 7, 2017

serhiy-storchaka added the type-bug An unexpected behavior, bug, or error label Dec 7, 2017

Revert unwanted accidental changes

bcbfd74

merwok reviewed Dec 7, 2017

Tobotimus added 2 commits December 8, 2017 08:29

Revert accidental encoding change

7f5d2c7

Skip over nested parens in param list

eb2267b

Properly reset paren count to zero

6633e80

serhiy-storchaka reviewed Dec 8, 2017

Tobotimus added 2 commits December 13, 2017 22:43

Add tests

7e974fa

Refactor logic

30cc30d

the-knights-who-say-ni added CLA signed and removed CLA not signed labels Dec 13, 2017

Tobotimus added 2 commits December 13, 2017 22:46

Removing trailing whitespace

36779af

Fix whitespace

cd6ea5c

Tobotimus added 2 commits December 14, 2017 09:00

Add tests for class

038d01c

Funcs and classes use same logic

cf12715

One integer count for all enclosures

2f3dbf5

serhiy-storchaka approved these changes Dec 14, 2017

bedevere-bot added awaiting merge and removed awaiting review labels Dec 14, 2017

Update news entry

fc0f1ea

serhiy-storchaka reviewed Dec 15, 2017

Update news entry v2

8c633ff

serhiy-storchaka added the needs backport to 3.6 label Dec 15, 2017

serhiy-storchaka closed this Feb 26, 2018

serhiy-storchaka reopened this Feb 26, 2018

serhiy-storchaka added the needs backport to 3.7 label Feb 26, 2018

serhiy-storchaka merged commit eee72d4 into python:master Feb 26, 2018

bedevere-bot removed the awaiting merge label Feb 26, 2018

bedevere-bot removed the needs backport to 3.7 label Feb 26, 2018

bedevere-bot removed the needs backport to 3.6 label Feb 26, 2018

miss-islington added a commit that referenced this pull request Feb 26, 2018

bpo-32222: Fix pygettext skipping docstrings for funcs with arg typeh…

51d95ff

…ints (GH-4745) (cherry picked from commit eee72d4) Co-authored-by: Tobotimus <Tobotimus@users.noreply.github.com>

Tobotimus deleted the 32222-fix-pygettext-annotations branch February 26, 2018 23:25

miss-islington added a commit that referenced this pull request Feb 27, 2018

bpo-32222: Fix pygettext skipping docstrings for funcs with arg typeh…

ec5569b

…ints (GH-4745) (cherry picked from commit eee72d4) Co-authored-by: Tobotimus <Tobotimus@users.noreply.github.com>

Tobotimus mentioned this pull request Feb 27, 2018

[V3 i18n] Internationalise help for commands and cogs Cog-Creators/Red-DiscordBot#1143

Merged
