Skip to content

Conversation

@pradyunsg
Copy link
Member

@pradyunsg pradyunsg commented Jun 14, 2017

mypy type-checking would be a useful addition.

Ref: #4021


Quick summary of the changes in the PR (as on 2nd Sept 2017):

  • Add a new CI job that runs mypy checks
  • Add .pyi stubs for the vendored modules, autogenerated when vendoring (not shipped thanks to MANIFEST.in)
  • Add pip._internal.utils.typing module with a nice docstring on why that module exists and how it's used.
  • Minor changes in imports/control-flow to use better mypy-annotated versions (like using six.zip_longest instead of a try-except)
  • Add some annotations to get mypy passing on the codebase (all #type: Any comments)
    • These should be removed and made more specific over time.
  • Annotate pip.configuration with type comments.
    • Fixed a bug that mypy spotted in pip.configuration

@pfmoore
Copy link
Member

pfmoore commented Jun 14, 2017

+1 in principle to this - I'd like to get a feel for how useful mypy is on a project like pip. I've never used mypy before, so I'm not sure to what extent we can get benefit without needing to do a mayor exercise annotating things. I see that the mypy test failed, so obviously it's picking up some issues already - but maybe it would be worth adding some basic annotations as well, as a starting point to establish the benefits? Or would that be more reasonable to add in a follow-up PR?

@pradyunsg
Copy link
Member Author

maybe it would be worth adding some basic annotations as well, as a starting point to establish the benefits?

I guess so...


If I flip the check_untyped_defs switch, there's a lot more errors reported. Maybe a follow up PR would flip that switch and fix those errors?

@pradyunsg
Copy link
Member Author

pradyunsg commented Jun 15, 2017

Turns out typing is an important module for type-checks.

I haven't come up with any way to be able to use it cleanly other than vendoring it.

@pradyunsg
Copy link
Member Author

pradyunsg commented Jun 15, 2017

I haven't come up with any way to be able to use it cleanly other than vendoring it.

And vendoring it won't work since they use different sources for Py2 vs Py3.

Adding the imports with a try-except to every module in pip doesn't seem to be a good solution. :/

Should I make an issue over at python/mypy?

@pradyunsg
Copy link
Member Author

Thanks @dstufft for pointing out that typing imports can be behind an if and still work...

@pradyunsg
Copy link
Member Author

@pfmoore I added some annotations to pip.configuration. It's a little messy up top in the if False:, since I tried to make it Py2/Py3 agnostic.

Question: Should we be running mypy on both Py2 and Py3?
Question: Assuming yes to the above, can we temporarily have them as separate jobs on the CI? I'll make a PR merging all the non-test and non-packaging tests into 2 - for py2 and py3.

@pradyunsg
Copy link
Member Author

FTR: If check_untyped_defs is enabled, there are 125 errors.

If this goes through, I'd like to see if flipping that switch is a practical benefit to pip.

@pradyunsg pradyunsg changed the title [WIP] Add mypy type-checking infrastructure Add mypy type-checking infrastructure Jun 15, 2017
@pradyunsg
Copy link
Member Author

Does this deserve a mention in the news file? Idk.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this in an if False block? At a minimum, there needs to be a comment here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so that we don't need to actually import typing at runtime (which is usually not a problem but pip, unsurprisingly, is a special case)

We can't use typing without vendoring it, which is not possible due to the fact that it uses different sources for Py2/Py3). The next best thing is using utilizing the fact that typing import is kinda optional - python/mypy#3216 - which is what this is.


May I add a comment saying "just mypy things" and a link to the GH issue? :3

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, "just mypy things" turns this into some sort of unexplained magic that people maintaining the file will cargo cult without understanding. If we're going to use typing, I think it's important that all the maintainers are able to understand it.

See my other comment - maybe we can put all of the "stuff that needs to go at the top of the file" like this into a separate piptyping module, which has an extensive comment at the top explaining why we are doing this, why pip is a special case, and providing links for anyone who needs to maintain that file, and then just import that one file at the top of each source file that uses annotations.

That way the typing black magic is encapsulated, and doesn't clutter every source file with stuff people aren't expected to understand, while still giving people who need to understand it a chance to follow what's going on.

@pfmoore
Copy link
Member

pfmoore commented Jun 15, 2017

@pradyunsg those if False blocks are ugly - can they be improved in any way? Is it possible to move all of that to a piptyping module and just import that? Do type annotations work that way? Is that something we should be using stubs files (which are something I've heard of, but I have no idea what they are for) for?

Question: Should we be running mypy on both Py2 and Py3?

I'd want to get "official" guidance on that question from the mypy docs/developers, but my instinct is that type inference should be the same in Python 2 and 3, so we shouldn't need to run mypy twice. It's a static analysis of the code, so why would it matter what runtime version of Python the tool is run with? mypy could just as easily be a C program...

Question: Assuming yes to the above, can we temporarily have them as separate jobs on the CI? I'll make a PR merging all the non-test and non-packaging tests into 2 - for py2 and py3.

See above - I don't think we need to do typing twice.

Did adding the type annotations expose any bugs? Can we get any feel for how much additional safety we get from maintaining the annotations? I'd hate to do a load of work adding and maintaining annotations and not get any benefit from them.

Does this deserve a mention in the news file?

I think so - it's an internal change, but it has a significant impact on how we maintain the sources, so it's worth noting explicitly.

@pradyunsg
Copy link
Member Author

pradyunsg commented Jun 15, 2017

It's a static analysis of the code, so why would it matter what runtime version of Python the tool is run with?

Because types on Py2 and Py3 are can behave differently (eg. str + unicode vs bytes + str, stdlib?)...

Did adding the type annotations expose any bugs?

Not as yet - it's only checking parts which have explicitly been annotated. I imagine if we go about adding good type annotations, there's definitely the potential of exposing bugs.

@pradyunsg
Copy link
Member Author

pradyunsg commented Jun 15, 2017

those if False blocks are ugly

Yeah...

Is it possible to move all of that to a piptyping module and just import that?

I guess we could do the following:

from pip.utils.typing import IS_MYPY_CHECKING

if IS_MYPY_CHECKING:
    from typing import Any
    # and everything currently in if False

and just set IS_MYPY_CHECKING to False, adding all the information in its docstring?

@pradyunsg
Copy link
Member Author

doesn't clutter every source file with stuff people aren't expected to understand

As I understand it, any module that would be type-checked would need to have a typing (or similar) import in it. In pip's case, it also needs to not execute at runtime which implies that we need an if branch to prevent it from running.

@pradyunsg
Copy link
Member Author

@pfmoore I think this looks nicer. I'd like to know what you think. :)

@pfmoore
Copy link
Member

pfmoore commented Jun 16, 2017

I don't think IS_MYPY_CHECKING helps much, what I want is not to have a multi-line block of clutter at the top of the file at all.

What specifically concerns me about the block I highlighted is why are we importing configparser in an if False block at all? Surely if we need configparser, we already import it properly? And what difference does it make for typing anyway? We don't import os in the block, what's special about configparser?

I'm not specifically interested in having these questions answered. It's more about the general point that seeing this block of code prompts those questions from someone trying to understand what's going on, and that increases the cognitive burden needed to follow the code. You can't say "it's only for type checking" - someone changing the code will presumably want to (or be required to by policy) annotate any new code, and therefore will need to know how to do so.

I'm not aware that any other projects have flagged a need for this type of construct. Why does pip in particular need something like this - you say pip is a special case, but why?

Because types on Py2 and Py3 are can behave differently (eg. str + unicode vs bytes + str, stdlib?)...

Well, we use six to unify, and we (in theory) take care to separate (conceptual) bytes from (conceptual) Unicode. The biggest benefit I'd expect to see from typing is in helping us keep that separation, and if we have to run type checks separately for the py2 and py3 string model, that pretty much makes me think we won't get that benefit.

Copy link
Member Author

@pradyunsg pradyunsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor things I missed.

pip/locations.py Outdated
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unnecessary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*static

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc-string could use a re-write.

@pfmoore
Copy link
Member

pfmoore commented Jun 16, 2017

I'm OK with

if MYPY_CHECK_RUNNING:
    import typing

as described in the referenced mypy issue. But when we have substantial blocks of code in there, I'm not so happy.

@pradyunsg
Copy link
Member Author

what I want is not to have a multi-line block of clutter at the top of the file at all.

Ah well... I don't see how we can avoid that.

we have to run type checks separately for the py2 and py3 string model

I was just speculating... My thinking is it won't hurt (much? at all?) to have an extra run on the CI.

pip is a special case, but why?

At runtime, pip can't import typing, so we need to guard the typing import in behind a condition that would never be true for mypy to work, without hampering pip's running.

I'm OK with

That would mean that to annotate with, say, Dict one would have to use typing.Dict which is a lot of text and, possibly might go too long for flake8 to be happy. The only way around that would be to create a shorter type-alias at which point you're adding more code in that if, which defeats the purpose.

@pfmoore
Copy link
Member

pfmoore commented Jun 16, 2017

pip is a special case, but why?

At runtime, pip can't import typing, so we need to guard the typing import in behind a condition that would never be true for mypy to work, without hampering pip's running.

No problem with this. But that justifies a 1-line import typing, not a big block of code.

I'm OK with

That would mean that to annotate with, say, Dict one would have to use typing.Dict which is a lot of text and, possibly might go too long for flake8 to be happy. The only way around that would be to create a shorter type-alias at which point you're adding more code in that if, which defeats the purpose.

I don't mind from typing import Dict, Any, or import typing as T. I bother if we have from typing import (10 lines of symbols) and I really start to worry when we need anything other than an import from the typing module (especially if there's no explanation as to why we need it).

Also, if flake8 objects to (typical) code with type annotations, then that implies to me that PEP 8 (and hence flake8) needs updating to advise on how to add type annotations, not that we should invent workarounds to satisfy style rules written before type annotations were invented 😉

BTW, I should say, I don't mean to put pressure on you here. I appreciate that you probably don't know much more about typing than I do (or maybe you do?!) and that this is a learning exercise for everyone. My guiding principle is that type annotations should be unobtrusive, but I have no idea how achievable that is. If it's not something we can manage, I'll view this PR as a useful experiment but ultimately not something we want to go ahead with. But at the moment, the jury is out on this question. (That's why my original +1 was "in principle").

@pradyunsg
Copy link
Member Author

I don't mean to put pressure on you here

Oh, No pressure. Don't worry. :P

I think someone needs poke at this from a is-this-useful-and-clean point of view and I appreciate that you're doing this. ^.^

I'll view this PR as a useful experiment

Same here.

I appreciate that you probably don't know much more about typing than I do (or maybe you do?!)

I probably know less.


we need anything other than an import from the typing module (especially if there's no explanation as to why we need it)

I guess adding a comment there explaining why that import is needed would be nice. I can do:

if MYPY_CHECK_RUNNING:
    from typing import Any, Dict, Iterable, List, NewType, Optional, Tuple
    # mypy fails to recognize the configparser import from the vendored six
    from configparser import ConfigParser

This won't work on Py2, but if we don't run mypy on that, it should be fine.

@pfmoore
Copy link
Member

pfmoore commented Jun 16, 2017

# mypy fails to recognize the configparser import from the vendored six

Hmm, is this something we should report to the six project? Or something we need to fix in our vendoring to make it more typing-friendly? If this import were a temporary workaround for something we knew we needed to fix properly but hadn't yet, then I'd be less bothered about it.

@pradyunsg
Copy link
Member Author

If this import were a temporary workaround for something we knew we needed to fix properly but hadn't yet, then I'd be less bothered about it.

That is exactly what it is. :)

Or something we need to fix in our vendoring to make it more typing-friendly?

This. I think we need to add stub files over from python/typeshed in a stub directory to make this happen...

@pradyunsg
Copy link
Member Author

pradyunsg commented Jun 16, 2017

@pfmoore Do you think I should add all the modules in python/typeshed's third-party folder to stubs/<py-version>/pip/_vendor/?

@pfmoore
Copy link
Member

pfmoore commented Jun 16, 2017

@pradyunsg I don't know. I'm reluctant to add yet more things that we need to vendor. And in particular, the fact that we only need these for the mypy check, and not at runtime, seems particularly wrong. Maybe the mypy team can advise? Or maybe we can put the stubs somewhere else in pip's source tree and set PYTHONPATH when we do the mypy run? That way, they are available at checking time, but we don't need to uselessly ship them at runtime.

@pradyunsg
Copy link
Member Author

Maybe the mypy team can advise?

I guess that's the right thing to do now...

The questions I have in my mind rn are:

  • Should we run mypy in both Python 3 and Python 2 modes?
  • How do we add support for the vendored libraries?

That way, they are available at checking time, but we don't need to uselessly ship them at runtime.

We won't ship the stub/ folder...

@pradyunsg
Copy link
Member Author

@pfmoore @dstufft @xavfernandez Could one of you ping someone from the mypy team?

@pfmoore
Copy link
Member

pfmoore commented Jun 16, 2017

I don't know who makes up the mypy team - maybe @JukkaL could help?

@pfmoore
Copy link
Member

pfmoore commented Jun 16, 2017

We won't ship the stub/ folder...

Apologies, I misread and thought you meant vendoring the stubs. My bad.

@pradyunsg
Copy link
Member Author

The questions I have in my mind rn are:

  • Should we run mypy in both Python 3 and Python 2 modes?
  • How do we add support for the vendored libraries?

@JukkaL the questions I want to ask you...

@pradyunsg
Copy link
Member Author

Awesome! So... now what? The other committers should also have a look at this too; right?

/ping @dstufft @xavfernandez

@pradyunsg
Copy link
Member Author

Pinging again. :)

@BrownTruck
Copy link
Contributor

Hello!

I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the master branch into this pull request or rebase this pull request against master then it will eligible for code review and hopefully merging!

@BrownTruck BrownTruck added the needs rebase or merge PR has conflicts with current master label Aug 31, 2017
mypy doesn't seem to recognize the imports from pip._internal.req for some reason. This is a quick hacky patch for that. If someone is reading this, you probably want to look into why that's happening.

Also, Hi to you, a person from the future!
@pypa-bot pypa-bot removed the needs rebase or merge PR has conflicts with current master label Sep 2, 2017
from pip._vendor.requests.packages.urllib3.exceptions import (
InsecureRequestWarning,
)
from pip._internal.utils.typing import MYPY_CHECK_RUNNING
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can remove.

@pradyunsg
Copy link
Member Author

Rebased. All the issues that propped up due to #4700 have been fixed.

If someone wants me to clean up the commit/change history on this PR, I'm willing to do so.

@pradyunsg pradyunsg requested review from dstufft and xavfernandez and removed request for a team September 2, 2017 12:28
@dstufft dstufft merged commit 28cca11 into pypa:master Sep 4, 2017
@pradyunsg pradyunsg deleted the mypy/infrastructure branch September 12, 2017 20:06

[flake8]
# Ignoring unused imports since mypy would warn of that.
ignore = F401
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I've been quite busy these last months so I'm a little late in my review ^^
Couldn't we tag the unused imports specifically and keep this check in flake8 ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flake8 doesn't actually understand the mypy declarations and shouts about them. mypy (obviously) does and doesn't shout about valid usage.

IIRC, I remember reading somewhere in the mypy documentation that it does detect if a certain import is unused, which is what made me feel comfortable about silencing these warnings. Now that both linters are run one after another, I don't think it's an issue if mypy catches something instead of flake8.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and I had looked at https://github.com/ambv/flake8-mypy -- The first line of the description of that project makes me think this approach is better suited.

Further, the "Two levels of type checking" section of that project's README motivated me to actually take this route.

@lock
Copy link

lock bot commented Jun 2, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Jun 2, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Jun 2, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

auto-locked Outdated issues that have been locked by automation C: automation Automated checks, CI etc type: maintenance Related to Development and Maintenance Processes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants