Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement PEP561 #4693

Merged
merged 51 commits into from
Apr 9, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
51 commits
Select commit Hold shift + click to select a range
96041b1
Add docs for python-executable
ethanhs Mar 7, 2018
4483b3e
Add --python-executable and --no-site-packages
ethanhs Mar 7, 2018
53ff42c
Split inference out of process_options
ethanhs Mar 7, 2018
f523efa
Change --no-site-packages to --no-infer-executable
ethanhs Mar 7, 2018
4d66827
Add --no-infer-executable to subprocessed tests
ethanhs Mar 7, 2018
1582692
Add test for python-executable and no-infer-executable
ethanhs Mar 7, 2018
65ebca9
Change no-infer-executable back to no-site-packages
ethanhs Mar 7, 2018
4467356
Add PEP 561 docs
ethanhs Mar 7, 2018
943e8e3
Add PEP 561 tests
ethanhs Mar 7, 2018
f28b901
Add PEP 561 test packages
ethanhs Mar 7, 2018
a63ccad
Merge branch 'master' of https://github.com/python/mypy into impl-pep561
ethanhs Mar 14, 2018
3a6f8fb
Add PEP 561 packages searching implementation
ethanhs Mar 14, 2018
d3106e3
Make Python inference inplace and clean up test
ethanhs Mar 14, 2018
3e94ac3
Fix testargs
ethanhs Mar 14, 2018
3b969de
Reorder FindModuleCache.find_module key
ethanhs Mar 14, 2018
89492d2
Trigger new CI run
ethanhs Mar 15, 2018
768efe0
Try reducing parrallelism
ethanhs Mar 16, 2018
54b4f04
Merge master into impl-pep561
ethanhs Mar 24, 2018
e43efee
Try reducing size of fscache
ethanhs Mar 25, 2018
090ac93
Revert travis back to using 12 concurrent processes
ethanhs Mar 26, 2018
a10541d
Try to remove all caching
ethanhs Mar 26, 2018
bfe388d
Update typeshed
ethanhs Mar 26, 2018
7d15781
Clear caches correctly
ethanhs Mar 26, 2018
3e35a57
Increase sizes of find_module caches
ethanhs Mar 26, 2018
4d53df5
This should fail
ethanhs Mar 27, 2018
8d25d5d
Remove isfile_case cache
ethanhs Mar 27, 2018
c4410e0
Move _get_site_packages_dirs
ethanhs Mar 27, 2018
73fefc4
Make lru_caches per-instance
ethanhs Mar 28, 2018
ea8541a
Add exception cache back
ethanhs Mar 28, 2018
dc7a362
Merge branch 'master' of github.com:python/mypy into impl-pep561
ethanhs Mar 29, 2018
5142add
Fold logic into subprocess call for simplicity
ethanhs Mar 29, 2018
432132f
Clarify docs and re-order
ethanhs Mar 29, 2018
3fd86ee
Make subprocess call identical to runtime code
ethanhs Mar 29, 2018
b0a302c
Add introductory text to installed_packages
ethanhs Mar 30, 2018
7cd5ebc
Move logic into new file, reduce duplication
ethanhs Mar 30, 2018
7a593b2
Just in case, don't actually import typing
ethanhs Mar 30, 2018
ddfa26a
Make docs clearer
ethanhs Apr 5, 2018
a5e636c
Code cleanup, add details of _get_site_packages_dirs
ethanhs Apr 5, 2018
262769c
Remove unneeded argument
ethanhs Apr 5, 2018
12920c9
Refactor python_executable_prefix
ethanhs Apr 5, 2018
4177257
Clarify python-executable flag
ethanhs Apr 5, 2018
ef44873
Merge branch 'master' into impl-pep561
ethanhs Apr 5, 2018
8948d6a
Remove unused argument
ethanhs Apr 5, 2018
6916b09
Change python_version -> python_executable
ethanhs Apr 5, 2018
87cbdb2
Add information about inference and add TODO
ethanhs Apr 6, 2018
e44b1f5
Refactor testpep561
ethanhs Apr 6, 2018
88279fa
Fix outdated docstring
ethanhs Apr 6, 2018
e590e66
Add docstring describing sitepkgs.py
ethanhs Apr 6, 2018
df13f2a
Add note about python_executable option
ethanhs Apr 6, 2018
6525f13
Use fscache in a few more places
ethanhs Apr 6, 2018
bc0141e
Add back some cache info in dispatch
ethanhs Apr 6, 2018
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 19 additions & 1 deletion docs/source/command_line.rst
Original file line number Diff line number Diff line change
Expand Up @@ -365,11 +365,27 @@ Here are some more useful flags:
updates the cache, but regular incremental mode ignores cache files
written by quick mode.

- ``--python-executable EXECUTABLE`` will have mypy collect type information
from `PEP 561`_ compliant packages installed for the Python executable
``EXECUTABLE``. If not provided, mypy will use PEP 561 compliant packages
installed for the Python executable running mypy. See
:ref:`installed-packages` for more on making PEP 561 compliant packages.
This flag will attempt to set ``--python-version`` if not already set.

- ``--python-version X.Y`` will make mypy typecheck your code as if it were
run under Python version X.Y. Without this option, mypy will default to using
whatever version of Python is running mypy. Note that the ``-2`` and
``--py2`` flags are aliases for ``--python-version 2.7``. See
:ref:`version_and_platform_checks` for more about this feature.
:ref:`version_and_platform_checks` for more about this feature. This flag
will attempt to find a Python executable of the corresponding version to
search for `PEP 561`_ compliant packages. If you'd like to disable this,
see ``--no-site-packages`` below.

- ``--no-site-packages`` will disable searching for `PEP 561`_ compliant
packages. This will also disable searching for a usable Python executable.
Use this flag if mypy cannot find a Python executable for the version of
Python being checked, and you don't need to use PEP 561 typed packages.
Otherwise, use ``--python-executable``.

- ``--platform PLATFORM`` will make mypy typecheck your code as if it were
run under the the given operating system. Without this option, mypy will
Expand Down Expand Up @@ -458,6 +474,8 @@ For the remaining flags you can read the full ``mypy -h`` output.

Command line flags are liable to change between releases.

.. _PEP 561: https://www.python.org/dev/peps/pep-0561/

.. _integrating-mypy:

Integrating mypy into another Python application
Expand Down
1 change: 1 addition & 0 deletions docs/source/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ Mypy is a static type checker for Python.
command_line
config_file
python36
installed_packages
faq
cheat_sheet
cheat_sheet_py3
Expand Down
123 changes: 123 additions & 0 deletions docs/source/installed_packages.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,123 @@
.. _installed-packages:

Using Installed Packages
========================

`PEP 561 <https://www.python.org/dev/peps/pep-0561/>`_ specifies how to mark
a package as supporting type checking. Below is a summary of how to create
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest adding some more background information here for those users that have minimal idea of how packaging in Python works. The goal would be that most users won't need to actually read PEP 561. Some ideas (maybe add a new section with background information or include these in the introduction):

  • Write something like "a package installed using pip, for example ..." to make it more explicit what sort of packages we are talking about (the term 'package' is an overloaded term).
  • Explain what "supporting type checking" means and why it's useful. This can mean that the package has inline type annotations that mypy can use to type check code that imports the package, or that the package ships stub files that mypy can use.
  • Also contrast this to typeshed stubs and discuss which stubs take precedence in case stubs exist both in typeshed and in an installed package.
  • Add note about the possibility of stubs being shipped in a separate stubs package (which might not be owned by the author of the original package).

PEP 561 compatible packages and have mypy use them in type checking.

Using PEP 561 compatible packages with mypy
*******************************************

Generally, you do not need to do anything to use installed packages that
support typing for the Python executable used to run mypy. Note that most
packages do not support typing. Packages that do support typing should be
automatically picked up by mypy and used for type checking.

By default, mypy searches for packages installed for the Python executable
running mypy. It is highly unlikely you want this situation if you have
installed typed packages in another Python's package directory.

Generally, you can use the ``--python-version`` flag and mypy will try to find
the correct package directory. If that fails, you can use the
``--python-executable`` flag to point to the exact executable, and mypy will
find packages installed for that Python executable.

Note that mypy does not support some more advanced import features, such as zip
imports, namespace packages, and custom import hooks.

If you do not want to use typed packages, use the ``--no-site-packages`` flag
to disable searching.

Making PEP 561 compatible packages
**********************************

PEP 561 notes three main ways to distribute type information. The first is a
package that has only inline type annotations in the code itself. The second is
a package that ships stub files with type information alongside the runtime
code. The third method, also known as a "stub only package" is a package that
ships type information for a package separately as stub files.

If you would like to publish a library package to a package repository (e.g.
PyPI) for either internal or external use in type checking, packages that
supply type information via type comments or annotations in the code should put
a ``py.typed`` in their package directory. For example, with a directory
structure as follows:

.. code-block:: text

setup.py
package_a/
__init__.py
lib.py
py.typed

the setup.py might look like:

.. code-block:: python

from distutils.core import setup

setup(
name="SuperPackageA",
author="Me",
version="0.1",
package_data={"package_a": ["py.typed"]},
packages=["package_a"]
)

Some packages have a mix of stub files and runtime files. These packages also
require a ``py.typed`` file. An example can be seen below:

.. code-block:: text

setup.py
package_b/
__init__.py
lib.py
lib.pyi
py.typed

the setup.py might look like:

.. code-block:: python

from distutils.core import setup

setup(
name="SuperPackageB",
author="Me",
version="0.1",
package_data={"package_b": ["py.typed", "lib.pyi"]},
packages=["package_b"]
)

In this example, both ``lib.py`` and ``lib.pyi`` exist. At runtime, the Python
interpeter will use ``lib.py``, but mypy will use ``lib.pyi`` instead.

If the package is stub-only (not imported at runtime), the package should have
a prefix of the runtime package name and a suffix of ``-stubs``.
A ``py.typed`` file is not needed for stub-only packages. For example, if we
had stubs for ``package_c``, we might do the following:

.. code-block:: text

setup.py
package_c-stubs/
__init__.pyi
lib.pyi

the setup.py might look like:

.. code-block:: python

from distutils.core import setup

setup(
name="SuperPackageC",
author="Me",
version="0.1",
package_data={"package_c-stubs": ["__init__.pyi", "lib.pyi"]},
packages=["package_c-stubs"]
)
110 changes: 75 additions & 35 deletions mypy/build.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,17 +10,20 @@
"""
# TODO: More consistent terminology, e.g. path/fnam, module/id, state/file

import ast
import binascii
import collections
import contextlib
from distutils.sysconfig import get_python_lib
import functools
import gc
import hashlib
import json
import os.path
import re
import site
import stat
import subprocess
import sys
import time
from os.path import dirname, basename
Expand All @@ -33,6 +36,7 @@
if MYPY:
from typing import Deque

from mypy import sitepkgs
from mypy.nodes import (MODULE_REF, MypyFile, Node, ImportBase, Import, ImportFrom, ImportAll)
from mypy.semanal_pass1 import SemanticAnalyzerPass1
from mypy.semanal import SemanticAnalyzerPass2, apply_semantic_analyzer_patches
Expand Down Expand Up @@ -698,7 +702,8 @@ def correct_rel_imp(imp: Union[ImportFrom, ImportAll]) -> str:

def is_module(self, id: str) -> bool:
"""Is there a file in the file system corresponding to module id?"""
return self.find_module_cache.find_module(id, self.lib_path) is not None
return self.find_module_cache.find_module(id, self.lib_path,
self.options.python_executable) is not None

def parse_file(self, id: str, path: str, source: str, ignore_errors: bool) -> MypyFile:
"""Parse the source of a file with the given name.
Expand Down Expand Up @@ -789,6 +794,24 @@ def remove_cwd_prefix_from_path(fscache: FileSystemCache, p: str) -> str:
return p


@functools.lru_cache(maxsize=None)
def _get_site_packages_dirs(python_executable: Optional[str]) -> List[str]:
"""Find package directories for given python.

This runs a subprocess call, which generates a list of the site package directories.
To avoid repeatedly calling a subprocess (which can be slow!) we lru_cache the results."""
if python_executable is None:
return []
if python_executable == sys.executable:
# Use running Python's package dirs
return sitepkgs.getsitepackages()
else:
# Use subprocess to get the package directory of given Python
# executable
return ast.literal_eval(subprocess.check_output([python_executable, sitepkgs.__file__],
stderr=subprocess.PIPE).decode())


class FindModuleCache:
"""Module finder with integrated cache.

Expand All @@ -802,20 +825,29 @@ class FindModuleCache:

def __init__(self, fscache: Optional[FileSystemMetaCache] = None) -> None:
self.fscache = fscache or FileSystemMetaCache()
# Cache find_module: (id, lib_path) -> result.
self.results = {} # type: Dict[Tuple[str, Tuple[str, ...]], Optional[str]]
self.find_lib_path_dirs = functools.lru_cache(maxsize=None)(self._find_lib_path_dirs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lru_cache uses a dictionary internally so I suspect that using lru_cache doesn't perform better. I'd prefer to not use it unless there is a clear benefit since it's not type safe (argument types can't be checked).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue that the cache-correctness of lru_cache is more valuable than the type-checking provided by mypy

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue that we've spent more effort arguing about what to do that would have been the relative benefits of either approach :-)

We have an informal style guideline that we avoid idioms that can't be type checked unless there is particularly good reason, and here the benefit we gain at the cost of losing some typing precision is minor.

But I won't argue about this any more.

self.find_module = functools.lru_cache(maxsize=None)(self._find_module)

def clear(self) -> None:
self.find_module.cache_clear()
self.find_lib_path_dirs.cache_clear()

def _find_lib_path_dirs(self, dir_chain: str, lib_path: Tuple[str, ...]) -> List[str]:
# Cache some repeated work within distinct find_module calls: finding which
# elements of lib_path have even the subdirectory they'd need for the module
# to exist. This is shared among different module ids when they differ only
# in the last component.
self.dirs = {} # type: Dict[Tuple[str, Tuple[str, ...]], List[str]]

def clear(self) -> None:
self.results.clear()
self.dirs.clear()

def _find_module(self, id: str, lib_path: Tuple[str, ...]) -> Optional[str]:
dirs = []
for pathitem in lib_path:
# e.g., '/usr/lib/python3.4/foo/bar'
dir = os.path.normpath(os.path.join(pathitem, dir_chain))
if self.fscache.isdir(dir):
dirs.append(dir)
return dirs

def _find_module(self, id: str, lib_path: Tuple[str, ...],
python_executable: Optional[str]) -> Optional[str]:
"""Return the path of the module source file, or None if not found."""
fscache = self.fscache

# If we're looking for a module like 'foo.bar.baz', it's likely that most of the
Expand All @@ -824,15 +856,23 @@ def _find_module(self, id: str, lib_path: Tuple[str, ...]) -> Optional[str]:
# that will require the same subdirectory.
components = id.split('.')
dir_chain = os.sep.join(components[:-1]) # e.g., 'foo/bar'
if (dir_chain, lib_path) not in self.dirs:
dirs = []
for pathitem in lib_path:
# e.g., '/usr/lib/python3.4/foo/bar'
dir = os.path.normpath(os.path.join(pathitem, dir_chain))
if fscache.isdir(dir):
dirs.append(dir)
self.dirs[dir_chain, lib_path] = dirs
candidate_base_dirs = self.dirs[dir_chain, lib_path]
# TODO (ethanhs): refactor each path search to its own method with lru_cache

third_party_dirs = []
# Third-party stub/typed packages
for pkg_dir in _get_site_packages_dirs(python_executable):
stub_name = components[0] + '-stubs'
typed_file = os.path.join(pkg_dir, components[0], 'py.typed')
stub_dir = os.path.join(pkg_dir, stub_name)
if fscache.isdir(stub_dir):
stub_components = [stub_name] + components[1:]
path = os.path.join(pkg_dir, *stub_components[:-1])
if fscache.isdir(path):
third_party_dirs.append(path)
elif fscache.isfile(typed_file):
path = os.path.join(pkg_dir, dir_chain)
third_party_dirs.append(path)
candidate_base_dirs = self.find_lib_path_dirs(dir_chain, lib_path) + third_party_dirs

# If we're looking for a module like 'foo.bar.baz', then candidate_base_dirs now
# contains just the subdirectories 'foo/bar' that actually exist under the
Expand All @@ -845,26 +885,21 @@ def _find_module(self, id: str, lib_path: Tuple[str, ...]) -> Optional[str]:
# Prefer package over module, i.e. baz/__init__.py* over baz.py*.
for extension in PYTHON_EXTENSIONS:
path = base_path + sepinit + extension
path_stubs = base_path + '-stubs' + sepinit + extension
if fscache.isfile_case(path) and verify_module(fscache, id, path):
return path
elif fscache.isfile_case(path_stubs) and verify_module(fscache, id, path_stubs):
return path_stubs
# No package, look for module.
for extension in PYTHON_EXTENSIONS:
path = base_path + extension
if fscache.isfile_case(path) and verify_module(fscache, id, path):
return path
return None

def find_module(self, id: str, lib_path_arg: Iterable[str]) -> Optional[str]:
"""Return the path of the module source file, or None if not found."""
lib_path = tuple(lib_path_arg)

key = (id, lib_path)
if key not in self.results:
self.results[key] = self._find_module(id, lib_path)
return self.results[key]

def find_modules_recursive(self, module: str, lib_path: List[str]) -> List[BuildSource]:
module_path = self.find_module(module, lib_path)
def find_modules_recursive(self, module: str, lib_path: Tuple[str, ...],
python_executable: Optional[str]) -> List[BuildSource]:
module_path = self.find_module(module, lib_path, python_executable)
if not module_path:
return []
result = [BuildSource(module_path, module, None)]
Expand All @@ -884,13 +919,15 @@ def find_modules_recursive(self, module: str, lib_path: List[str]) -> List[Build
(os.path.isfile(os.path.join(abs_path, '__init__.py')) or
os.path.isfile(os.path.join(abs_path, '__init__.pyi'))):
hits.add(item)
result += self.find_modules_recursive(module + '.' + item, lib_path)
result += self.find_modules_recursive(module + '.' + item, lib_path,
python_executable)
elif item != '__init__.py' and item != '__init__.pyi' and \
item.endswith(('.py', '.pyi')):
mod = item.split('.')[0]
if mod not in hits:
hits.add(mod)
result += self.find_modules_recursive(module + '.' + mod, lib_path)
result += self.find_modules_recursive(module + '.' + mod, lib_path,
python_executable)
return result


Expand Down Expand Up @@ -2001,7 +2038,8 @@ def find_module_and_diagnose(manager: BuildManager,
# difference and just assume 'builtins' everywhere,
# which simplifies code.
file_id = '__builtin__'
path = manager.find_module_cache.find_module(file_id, manager.lib_path)
path = manager.find_module_cache.find_module(file_id, manager.lib_path,
manager.options.python_executable)
if path:
# For non-stubs, look at options.follow_imports:
# - normal (default) -> fully analyze
Expand Down Expand Up @@ -2125,12 +2163,14 @@ def dispatch(sources: List[BuildSource], manager: BuildManager) -> Graph:
graph = load_graph(sources, manager)

t1 = time.time()
fm_cache_size = manager.find_module_cache.find_module.cache_info().currsize
fm_dir_cache_size = manager.find_module_cache.find_lib_path_dirs.cache_info().currsize
manager.add_stats(graph_size=len(graph),
stubs_found=sum(g.path is not None and g.path.endswith('.pyi')
for g in graph.values()),
graph_load_time=(t1 - t0),
fm_cache_size=len(manager.find_module_cache.results),
fm_dir_cache_size=len(manager.find_module_cache.dirs),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you refactor lru_cache uses away, these could be restored. Or maybe these stats could be retrieved from the lru caches?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I should add these back based on the lru_cache.cache_info() currsize.

fm_cache_size=fm_cache_size,
fm_dir_cache_size=fm_dir_cache_size,
)
if not graph:
print("Nothing to do?!")
Expand Down
Loading