Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First draft of DEP for static-ish typechecking for Django #65

Closed
wants to merge 5 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
326 changes: 326 additions & 0 deletions draft/0484-type-hints.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,326 @@
# DEP 0484: Static type checking for Django

| | |
| --- | --- |
| **DEP:** | 0484 |
| **Author:** | Maksim Kurnikov |
| **Implementation team:** | Maksim Kurnikov |
| **Shepherd:** | Carlton Gibson |
| **Type:** | Feature |
| **Status:** | Draft |
| **Created:** | 2019-10-08 |
| **Last modified:** | 2019-10-08 |

## Abstract

Add mypy (and other type checkers) support for Django.

## Motivation

### Internal use

1. Inline documentation
* Lower barrier to entry.
* Easier to think about what's going on and edgecases

2. Catching hard-to-find bugs
* not all codebase is covered with tests, so `None` related functionality is never tested, because nobody actually thought about it being used that way.

3. Another testsuite to prevent regressions
* accidental changes break other parts of the codebase
* easier review, easier to understand what contributor meant to do

### External use

1. Typechecking of user codebases.

* mypy is increasingly popular in proprietary projects.

* shorten time for new developer to get up to speed with the codebase.
* prevent bugs / regressions, basically another test suite.

* add scalability to the codebase

* third-party apps that use / modify Django internal tools would benefit from typechecking

### IDE support

* go to definition, find usages, refactorings.

Example:

```python
class User(models.Model):
name = models.CharField()

# other file
def get_username(user):
return user.name
```

Rename `name` -> `username`. Any IDE will have a hard time understanding that `user` param is a `User` instance. Trivial with type annotation.

```python
def get_username(user: User):
return user.name # will be renamed into user.username
```

* interactive typechecking of passed params and return values

* support in vscode/emacs/vim (microsoft's python-language-server), PyCharm (internal implementation of the typechecker), pyre, mypy

* inline annotations -> IDE is always up to date with code changes (no `typeshed` syncing)

## Implementation

### Migration path

1. Add `py.typed` file inside the Django top-level module, to mark that it has inline annotations.
See https://www.python.org/dev/peps/pep-0561/#packaging-type-information

2. Add `__class_getitem__` implementation to classes which need to support generic parameters. For example, for `QuerySet` class (`QuerySet[MyModel]` annotation)
https://docs.python.org/3/reference/datamodel.html#emulating-generic-types

It's just

```python
def __class_getitem__(cls, *args, **kwargs):
return cls
```

(some additional parameters checking could be added later)

3. Add `mypy` to CI, like it's done in this PR for DRF
https://github.com/encode/django-rest-framework/pull/6988

Make it pass. It would require some type annotations around the codebase, some cases should be silenced via `# type: ignore`
https://mypy.readthedocs.io/en/latest/common_issues.html#spurious-errors-and-locally-silencing-the-checker

4. Merge `django-stubs` annotations into the codebase. This one could be done incrementally, on file-per-file basis.

5. For complex cases use `.pyi` counterpart (described below). To be able to remain in sync with the codebase, `merge-pyi` tool invocation must be added to the CI.
https://github.com/google/pytype/tree/master/pytype/tools/merge_pyi


### Complementing with `.pyi` stub files for complex cases

There are some cases, for which stores type information inline creates more problems than benefits.

Django is very dynamic, some functions have more than on relationship between argument types and return value type. For those cases, there's an `@overload` clause available
https://www.python.org/dev/peps/pep-0484/#function-method-overloading

For these cases, type information should be stored in the separate stub file.
https://www.python.org/dev/peps/pep-0484/#stub-files

Examples:

* `@overload` clauses

Django is very dynamic, some functions have more than one relationship between argument types and return value type. For those cases, there's an `@overload` clause available
https://www.python.org/dev/peps/pep-0484/#function-method-overloading

Example is `__get__` method of the `Field`.
1. Returns underlying python type when called on instance.
2. Returns `Field` instance when called on class.

```python
# _ST - set type
# _GT - get type
# self: _T -> _T allows mypy to extract type of `self` and return it

class Field(Generic[_ST, _GT])
@overload
def __get__(self: _T, instance: None, owner) -> _T: ...
# Model instance access
@overload
def __get__(self, instance: Model, owner) -> _GT: ...
# non-Model instances
@overload
def __get__(self: _T, instance, owner) -> _T: ...
```

* `**kwargs` for `Field` classes

Mypy (and other typecheckers) doesn't understand `**kwargs`. There are ways to make it work via `TypedDict` and type aliases, but it's hard to read.

```python
class CharField(Field):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.validators.append(validators.MaxLengthValidator(self.max_length))
```

How it would look like, if it would be inline:

```python
class CharField(Field):
def __init__(
self,
verbose_name: Optional[Union[str, bytes]] = ...,
name: Optional[str] = ...,
primary_key: bool = ...,
max_length: Optional[int] = ...,
unique: bool = ...,
blank: bool = ...,
null: bool = ...,
db_index: bool = ...,
default: Any = ...,
editable: bool = ...,
auto_created: bool = ...,
serialize: bool = ...,
unique_for_date: Optional[str] = ...,
unique_for_month: Optional[str] = ...,
unique_for_year: Optional[str] = ...,
choices: Optional[_FieldChoices] = ...,
help_text: str = ...,
db_column: Optional[str] = ...,
db_tablespace: Optional[str] = ...,
validators: Iterable[_ValidatorCallable] = ...,
error_messages: Optional[_ErrorMessagesToOverride] = ...,
):
super().__init__(*args, **kwargs)
self.validators.append(validators.MaxLengthValidator(self.max_length))
```

### Mypy plugin support

Not all Django behavious expressable via type system - Django will have to provide those features via mypy plugin. Most of those are implemented in `django-stubs`, inside `mypy_django_plugin` sub-package.

Those are features implemented so far in the `mypy_django_plugin`:

1. Fields and managers inference.

```python
class User(models.Model):
name = models.CharField()
surname = models.CharField(null=True)

user = User()
user.name # inferred type: str
user.surname # inferred type: Optional[str]

# objects is added to every model
User.objects.get() # inferred type: User
User.objects.filter(unknown=True) # will fail with "no such field"
User.objects.filter(name=True) # will fail with "incompatible types 'bool' and 'str'"
User.objects.filter(name__iexact=True) # will fail with "incompatible types 'bool' and 'str'"
User.objects.filter(name='hello') # passes
User.objects.filter(name__iexact='hello') # passes
```

2. Typechecking for `__init__` and `create()`
```python
class User(models.Model):
name = models.CharField()
User(name=1) # fail
User(unknown=1) # fail
User(name='hello') # pass
```
same for `create()` with different `Optional`ity conditions.

3. RelatedField's support, support for different apps in the RelatedField's `to=` argument

```python
class User:
pass
class Profile:
user = models.OneToOneField(to=User, related_name='profile')

Profile().user # inferred type 'User'
User().profile # inferred type 'Profile'
```

```python
class CustomProfile:
user = models.ForeignKey(to='some_custom_app.User')
CustomProfile().user # will be correctly inferred as 'some_custom_app.models.User'
```

4. Support for unannotated third-party base models,
```python
class User(ThirdPartyModel):
pass
```
will be recognized as correct model.

6. settings support
```python
from django.conf import settings
settings.ALLOWED_HOSTS # inferred as Sequence[str]
```

7. `get_user_model()` correctly infers current user model class.


### Limitations of the plugin

0. `mypy_django_plugin` uses a mix of static and dynamic analysis to support Django features. It basically does `django.setup()` inside to gain access to all the _meta API and `Apps` information.

* users needs to be aware of that, and work on preventing side-effects of `django.setup()`, if there's any

* typechecking of Django app with invalid syntax or semantics will crash the plugin

Possible solution: reimplement all Django logic of traversing models in apps for the plugin.

1. Manager inheritance.

```python
class BaseUser(models.Model):
class Meta:
abstract = True

objects = BaseUserManager()

class User(BaseUser):
objects = UserManager()
```
Mypy will flag those `objects` managers as incompatible as they violate Liskov Substitution principle.

Possible solution: https://github.com/python/mypy/issues/7468

2. Generic parameters for `Field`

```python
class User:
name = models.CharField()
surname = models.CharField(null=True)
```

`name` and `surname` props are recognized by mypy as generic descriptors. Here's the stub for the `Field`

```python
class Field(Generic[_ST, _GT]):
def __set__(self, instance, value: _ST) -> None: ...
# class access
@overload
def __get__(self: _T, instance: None, owner) -> _T: ...
# Model instance access
@overload
def __get__(self, instance: Model, owner) -> _GT: ...
# non-Model instances
@overload
def __get__(self: _T, instance, owner) -> _T: ...

class CharField(Field[_ST, _GT]):
_pyi_private_set_type: Union[str, int, Combinable]
_pyi_private_get_type: str
```

In the plugin `django-stubs` dynamically marks `name` and `surname` as `CharField[Optional[Union[str, int, Combinable]], Optional[str]]`. We cannot use (as far as I know),

```python
class CharField(Field[Union[str, int, Combinable], str]):
Copy link

@syastrov syastrov Oct 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I have a solution to this particular problem, which is only possible since the latest mypy version, 0.730 released a few weeks ago (due to python/mypy#7188).
If we overload the __new__ method and use Literal type to match when null is True or False, then we can alter which Generic parameters are passed into Field, when it is constructed.

I will give a complete working example:

file a.pyi:

from typing import (
    Generic,
    Optional,
    TypeVar,
    Union,
    overload,
)

from typing_extensions import Literal

class Combinable: ...

class Model: ...


_T = TypeVar("_T", bound="Field")
_ST = TypeVar("_ST")
_GT = TypeVar("_GT")

class Field(Generic[_ST, _GT]):
    # Here I tried to do the following to avoid duplicating this in each subclass, 
    # but I can't find a way to refer to the correct subclass (rather than Field[_ST, _GT] 
    # it should return _T[_ST, _GT], where _T is the subclass, e.g. CharField) :
    # @overload
    # def __new__(cls, null: Literal[False] = False, *args, **kwargs) -> Field[_ST, _GT]: ...
    # @overload
    # def __new__(cls, null: Literal[True], *args, **kwargs) -> Field[Optional[_ST], Optional[_GT]]: ...

    @overload
    def __get__(self: _T, instance: None, owner) -> _T: ...
    # Model instance access
    @overload
    def __get__(self, instance: Model, owner) -> _GT: ...
    # non-Model instances
    @overload
    def __get__(self: _T, instance, owner) -> _T: ...

CharField_ST = Union[str, int, Combinable] 
CharField_GT = str
class CharField(Field[_ST, _GT]):
    @overload
    def __new__(cls, null: Literal[False] = False, *args, **kwargs) -> CharField[CharField_ST, CharField_GT]: ...
    @overload
    def __new__(cls, null: Literal[True], *args, **kwargs) -> CharField[Optional[CharField_ST], Optional[CharField_GT]]: ...

file b.py:

from a import CharField, Model

class MyModel(Model):
    f1 = CharField()
    f2 = CharField(null=True)
    f3 = CharField(null=False)

reveal_type(CharField())
reveal_type(CharField(null=True))
reveal_type(CharField(null=False))
reveal_type(MyModel().f1)
reveal_type(MyModel().f2)
reveal_type(MyModel().f3)

Then, you can see that mypy type-checks applies the correct types!

$ mypy b.py
b.py:8: note: Revealed type is 'a.CharField[Union[builtins.str, builtins.int, a.Combinable], builtins.str]'
b.py:9: note: Revealed type is 'a.CharField[Union[builtins.str, builtins.int, a.Combinable, None], Union[builtins.str, None]]'
b.py:10: note: Revealed type is 'a.CharField[Union[builtins.str, builtins.int, a.Combinable], builtins.str]'
b.py:11: note: Revealed type is 'builtins.str*'
b.py:12: note: Revealed type is 'Union[builtins.str, None]'
b.py:13: note: Revealed type is 'builtins.str*'

pass
```
because then we won't be able to change generic params for `CharField` dynamically.

And it also creates a UX issue, as `Field` has two generic params which makes zero sense semantically.

Possible solution: whole bunch of `@overload` statements over `__new__` method.

3. `BaseManager.from_queryset()`, `QuerySet.as_manager()`

Not implementable as of now, see
https://github.com/python/mypy/issues/2813
https://github.com/python/mypy/issues/7266