Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support type aliases in fine-grained incremental mode #4525

Merged
merged 46 commits into from
Feb 21, 2018

Conversation

ilevkivskyi
Copy link
Member

Fixes #4394

This was a bit more tricky than I expected. I think I cover all common cases and majority of corner cases:

  • Simple aliases
  • Generic aliases
  • Forward references
  • Nested aliases
  • Chained aliases

Note that among tests I added some also pass without this addition (probably because some dependencies are added by coincidence). Still there are 17 test scenarios that would fail without this change. Note that I mainly focus on false negatives, since in my experience while playing with fine-grained daemon, this is the most typical problem.

There are two weird failures that I skip, but those seem to be independent of type aliases (I checked manually that alias dependence maps are correct). The problem goes away if I just "touch" the aa.py module.

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly good, just a bunch of minor things. Also, should there be some AST diff test cases for type aliases as well? What about AST merge test cases?

mypy/nodes.py Outdated
# Is this node refers to other node via node aliasing?
# (This is currently used for simple aliases like `A = int` instead of .type_override)
is_aliasing = False # type: bool
alias_depends_on = None # type: Set[str]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add docstring for alias_depends_on.

There are starting to be quite a few alias-related attributes in symbol table nodes. Maybe it's time to add a new AST node for type aliases soon after this PR has been merged so that we can remove several of these attributes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's time to add a new AST node for type aliases soon after this PR has been merged so that we can remove several of these attributes?

Yes, I will take care of it as part of #4082

mypy/semanal.py Outdated
def add_alias_deps(self, aliases_used: Set[str]) -> None:
"""Add full names of type aliases on which the current node depends.

This is used by fine-grained incremental mode to re-chek the corresponding nodes.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: re-chek

mypy/semanal.py Outdated
@@ -1178,7 +1182,8 @@ def update_metaclass(self, defn: ClassDef) -> None:
return
defn.metaclass = metas.pop()

def expr_to_analyzed_type(self, expr: Expression) -> Type:
def expr_to_analyzed_type(self, expr: Expression, *,
from_bases: Optional[ClassDef] = None) -> Type:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of argument from_bases is unclear. Maybe rename it and also document in a docstring.

mypy/semanal.py Outdated
@@ -1674,12 +1679,32 @@ def anal_type(self, t: Type, *,
tvar_scope: Optional[TypeVarScope] = None,
allow_tuple_literal: bool = False,
aliasing: bool = False,
third_pass: bool = False) -> Type:
third_pass: bool = False,
from_bases: Optional[ClassDef] = None) -> Type:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another instance of the from_bases that could be renamed.

mypy/semanal.py Outdated
self.cur_mod_node.alias_deps[from_bases].update(a.aliases_used)
return tp
self.add_alias_deps(a.aliases_used)
return tp
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic is a bit convoluted. What about rewriting this like this:

        if a.aliases_used:
            if from_bases is not None:
                self.cur_mod_node.alias_deps[from_bases].update(a.aliases_used)
            else:
                self.add_alias_deps(a.aliases_used)
        return t.accept(a)


[case testAliasDepsGenericMod]
from typing import Dict
A = Dict[int, str]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to have a user-defined generic type such as C[D, E] to get more dependencies.

x: A[int]
[builtins fixtures/dict.pyi]
[out]
<m.A> -> m
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dependency from T to A is missing. For example, if T gets removed A should become stale (T could be defined in another target).


[case testAliasDepsForwardMod]
x: A
A = int
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, would be good to have an alias to a user-defined type for more generated dependencies.

[out]
<m.A> -> m, m.f
<m.B> -> m, m.f

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some ideas for additional test cases (not sure if all of these are worth adding):

  • Alias used as a cast target
  • Alias used in x = Foo[Alias]()
  • Alias used in NamedTuple definition
  • Alias used in TypeVar (bound or values)
  • Alias used in TypedDict definition

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TypeVars are currently supported only partially (independently of aliases), and TypedDict has only basic support. I will therefore add only few basic tests with them. For others yes, I will add more tests.

[out]
==
b.py:2: error: Dict entry 0 has incompatible type "str": "int"; expected "str": "str"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideas for additional test cases:

  • Alias changed into a generic alias (and vice versa)
  • Changing the number of type args of a generic alias (e.g. C[T] -> C[T, S])
  • Deleting the target of a type alias or a component of the type alias target

@ilevkivskyi
Copy link
Member Author

Also, should there be some AST diff test cases for type aliases as well? What about AST merge test cases?

I thought that these are less important, since aliases should be completely expanded after semantic analysis, I think I will add few tests for these.

@ilevkivskyi
Copy link
Member Author

@JukkaL Thanks for a detailed review! I will try to finish this today.

@ilevkivskyi
Copy link
Member Author

@JukkaL I added the tests you requested and fixed corner cases. Please review again.

I am still not fluent in how dependencies are (re-)triggered, so I am not sure I choose the right balance between completeness and performance (i.e. many vs few dependencies). Please advise on this.

@ilevkivskyi
Copy link
Member Author

Hm, strange, all tests passed locally, but here in both Travis and AppVeyor errors are sorted in different order in one test. Should we somehow fix the order they are shown? (To make tests more stable.)

@ilevkivskyi
Copy link
Member Author

Sorry, I give up, this is beyond my understanding, all tests fail consistently. Independently of what is the order, the messages are shown in the opposite order.

@ilevkivskyi ilevkivskyi closed this Feb 8, 2018
@ilevkivskyi ilevkivskyi reopened this Feb 8, 2018
Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates! This is now much cleaner.

Can you add AST diff test cases as well (test-data/unit/diff.test)?

mypy/scope.py Outdated
@@ -0,0 +1,80 @@
"""Track current scope to easily calculate the corresponding fine-grained target.

This is currently only used by mypy.semanal and mypy.server.deps.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment can easily become stale. What about rewording as "TODO: Use everywhere where we track targets, including in mypy.errors".

mypy/semanal.py Outdated
"""Check if 'rvalue' represents a valid type allowed for aliasing
(e.g. not a type variable). If yes, return the corresponding type and a list of
qualified type variable names for generic aliases.
(e.g. not a type variable). If yes, return the corresponding type, a list of
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: Use an empty line after the first line in a multi-line docstring.

[case testAliasDepsNormalMod]
from mod import I
A = I
x: A
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To get a fuller picture of aliases, you could move the alias definition to a separate module (so that there would be 3 modules: class definition, alias definition, and alias use in annotation).

<mod.I> -> <m.A>, <m.x>, m

[case testAliasDepsNormalFunc]
from mod import I
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above, it would be helpful to have 3 modules here. This is probably worth it for some other test cases below as well (up to you).

mypy/nodes.py Outdated
# (This is currently used for simple aliases like `A = int` instead of .type_override)
is_aliasing = False # type: bool
# This includes full names of aliases used in this alias.
alias_depends_on = None # type: Set[str]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this actually is necessary. Even without this, the target that contains the alias definition should depend on these aliases, right? And change in the alias should be detected by mypy.server.astdiff. If you remove this, what will break? Maybe some symbol table differences won't then be propagated correctly -- but that could just be a bug elsewhere, which this papers over.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally this shouldn't be necessary. But IIRC correctly some tests failed without this. I added a TODO about this in typeanal.py (that you also commented). I will try to remove this and re-run the tests, and if some will still fail, I will try to figure out where exactly is the problem in diff calculation and expand the TODO.

@@ -506,6 +537,13 @@ def visit_yield_from_expr(self, e: YieldFromExpr) -> None:

# Helpers

def add_type_alias_deps(self, target: str, add_trigger: bool = False) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The add_trigger argument name is unclear. Can you come up with another name? Maybe something about triggering recursively?

self.add_dependency(trigger)
self.add_dependency(trigger, target=make_trigger(target))
if o.info:
for base in non_trivial_bases(o.info):
self.add_dependency(make_trigger(base.fullname() + '.' + o.name()))
self.add_type_alias_deps(self.scope.current_full_target(), add_trigger=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment about why we have add_trigger=True. Again, I believe that this should be current_target() instead of current_full_target().

# functional enum
# type variable with value restriction

def visit_mypy_file(self, o: MypyFile) -> None:
self.scope.enter_file(o.fullname())
self.is_package_init_file = o.is_package_init_file()
self.add_type_alias_deps(self.scope.current_full_target())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be self.scope.current_target()? current_full_target() can refer to an entire class (all method bodies) which we want to avoid if possible, since it's expensive.

@@ -202,6 +221,7 @@ def visit_class_def(self, o: ClassDef) -> None:
self.add_type_dependencies(o.info.typeddict_type, target=make_trigger(target))
# TODO: Add dependencies based on remaining TypeInfo attributes.
super().visit_class_def(o)
self.add_type_alias_deps(self.scope.current_full_target())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above, this should probably be current_target().

mypy/typeanal.py Outdated
# Names of type aliases encountered while analysing a type will be collected here.
# (This also recursively includes the names of aliases they depend on.)
# TODO: adding the sub-dependencies shouldn't be necessary since the logic
# of fine-grained should propagate triggered dependencies.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary? It would be useful to understand the precise root cause for why this is included and mention it in the TODO comment.

@ilevkivskyi
Copy link
Member Author

@JukkaL OK, I got rid of the "long-range" (i.e. indirect, recursive) dependencies for type aliases. There are few minor remaining questions we can discuss tomorrow (also I will add tests you asked). Here is a summary of how everything works:

  • Definition of a type alias creates a dependency for the enclosing scope, so that the alias will be reprocessed if it is changed. So A = int generates <A> -> __main__.
  • Use of a type alias creates a dependency for the using target on the alias. So def func(x: A): ... generates <A> -> func (for functions this also creates <A> -> <func>, but this is not something special for aliases, the same happens for other types because functions are not correctly treated in get_triggered_namespace_items)
  • Finally, type aliases depend on their expansion, aliases directly used, and type variables (for generic aliases). So, B = Dict[A, Cls] generates <A> -> <B>, <Cls> -> <B>

This is all, no other dependencies are generated, for "chained" aliases etc. indirect dependencies are re-triggered and re-processed in several passes.

There is one thing that concerns me: currently subscripted and un-subscripted aliases are processed differently so this also required some duplication of logic (this also happens in several other places). As we discussed, there will be refactoring of symbol table nodes, so there are two options:

  1. First refactor symbol table nodes, then rebase this on top which will avoid duplication of logic.
  2. First merge this, then remove duplication of logic here as part of the refactoring.

I prefer the second option, since all this picture of dependencies is now in my head and it will take time to rebuild it if I will switch to another thing.

@JukkaL
Copy link
Collaborator

JukkaL commented Feb 20, 2018

Finally, type aliases depend on their expansion, aliases directly used, and type variables (for generic aliases). So, B = Dict[A, Cls] generates <A> -> <B>, <Cls> -> <B>

What about generating dependencies like <A> -> m, <Cls> -> m instead, if the alias is defined in m? This way we'd get rid of one processing step:

  1. A is changed and it triggers <A>.
  2. We reprocess m, the module containing the type alias B.
  3. AST diff of m notices that the alias m.B has changed, so we trigger <m.B>.
  4. We reprocess each target that uses B in an annotation.

Also, do you test that changing A.__init__ causes calls to B to be reprocessed?

I prefer the second option, since all this picture of dependencies is now in my head and it will take time to rebuild it if I will switch to another thing.

Yeah, let's go with the second option (at least unless the symbol table refactoring turns out to be a simple thing).

@ilevkivskyi
Copy link
Member Author

What about generating dependencies like <A> -> m, <Cls> -> m instead, if the alias is defined in m?

I think the performance win will be pretty minor, since currently we already have <A> -> <B> and <B> -> m. So I would prefer not to do this now (I think correctness/simplicity is more important now), unless you think it is something important.

Also, do you test that changing A.__init__ causes calls to B to be reprocessed?

Yes, there are some tests still missing, I have a TODO item for this one too.

@JukkaL
Copy link
Collaborator

JukkaL commented Feb 20, 2018

I think the performance win will be pretty minor, since currently we already have <A> -> <B> and <B> -> m. So I would prefer not to do this now (I think correctness/simplicity is more important now), unless you think it is something important.

The performance win isn't really the only motivation for preferring minimal dependencies. I think that this is important due to a few reasons:

  • If we have non-minimal/optimal dependencies, it's harder to reason about bugs and correct behavior -- logs will have redundant steps, etc. This can result in developers tending to ignore suspicious-looking log data, since it's easy to just assume that it's because of known issue X, even if the logs are actually showing a previously unknown issue.
  • If we have non-minimal/optimal dependencies, bugs elsewhere in fine-grained mode may get unnoticed because of the redundant work.
  • It looks like we can reprocess everything that uses an alias B in an annotation twice: first after A changes, and second time after we've reprocessed m. The performance impact could be pretty bad in some cases.
  • Even if the performance impact of this is small, it might interact badly with other performance problems (say, if you have three issues that cause 2x extra work, you could suddenly have 8x extra work if they all happen at the same time).

@ilevkivskyi
Copy link
Member Author

This is almost ready. I still need to figure out what best to do with builtins and typing and add tests for .__init__ deps.

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there! Just a bunch of minor things.

@@ -618,3 +618,53 @@ p = Point(dict(x=42, y=1337))
[out]
__main__.Point
__main__.p

[case testTypeAliasSimple]
A = int
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all of these test cases, also include a case where there is no difference to make sure that we don't just always consider type aliases different. Example:

[case testTypeAliasSimple]
A = int
B = int
[file next.py]
A = str
B = int
[out]
__main__.A

b.py:2: error: Incompatible types in assignment (expression has type "int", variable has type "str")

[case testAliasFineChainedFunc]
# This and next don't work without <A> -> <B> intermediate deps
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment out of date?

x: a.A = int()
[out]
==
b.py:4: error: Incompatible types in assignment (expression has type "int", variable has type "str")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential new tests (if they don't already exist):

  • Create a type alias (initially there is a stale reference to a type alias).
  • Delete type alias (after update there is a stale reference to a type alias).
  • Switch a class to a type alias (or type alias to class).

<m.x> -> m
<a.A> -> m
<a> -> m
<mod.I> -> <m.x>, m
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also dump dependencies in a? I'd like to see the dependency <mod.I> -> a at least.

A = I
[file mod.py]
class I: pass
[out]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again would also be good to see dependencies in a.

[case testAliasDepsNestedMod]
from mod import I, S, D
A = D[S, I]
B = D[S, A]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test case would be more interesting if A and B would be defined in different modules, and if we dumped the dependencies within both modules.

[case testAliasDepsNestedFunc]
from mod import I, S, D
A = D[S, I]
B = D[S, A]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is similar to above -- would be better to split across two modules.

[case testAliasDepsRuntime]
from mod import I, S, D
A = I
x = D[S, A]()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be more interesting if A would be defined in another module.

<m.P> -> m.P
<mod.I> -> <m.P.x>, <m.P>, m, m.P

[case testAliasDepsTypedDict]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also test functional definition of typed dict.

<mod.I> -> <m.x>, m
<mod.S> -> <m.x>, m

[case testAliasDepsNamedTuple]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also test functional definition of named tuple.

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updates -- I only have two remaining questions, otherwise looks good.

class I: pass
[builtins fixtures/dict.pyi]
[out]
<m.A> -> m
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like above, does this actually test that the dependency is generated, or would the dependency generate for just A = I, without the typed dict definition?

[file mod.py]
class I: pass
[out]
<m.A> -> m
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this actually test that the dependency is generated, or would the dependency generate for just A = I, without the named tuple definition?

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM -- feel free to merge once the build is green.

@ilevkivskyi ilevkivskyi merged commit 8f253c2 into python:master Feb 21, 2018
@ilevkivskyi ilevkivskyi deleted the support-fine-aliases branch February 21, 2018 17:57
carljm added a commit to carljm/mypy that referenced this pull request Feb 28, 2018
* master: (27 commits)
  Don't call --strict-optional and --incremental experimental (python#4642)
  Sync typeshed (python#4641)
  Fix callable types with inconsistent argument counts (python#4611)
  Fix example (add 'class A:')
  Make psutil an optional dependency (python#4634)
  mypy and mypy_extensions aren't posix only (python#3765)
  Documentation for attr support (python#4632)
  Use read_with_python_encoding in stubgen to handle file encoding (python#3790)
  Sync typeshed (python#4631)
  Add remaining core team emails to CREDITS (python#4629)
  Fix issues with attr code. (python#4628)
  Better support for converter in attrs plugin. (python#4607)
  Clean up credits (python#4626)
  Support type aliases in fine-grained incremental mode (python#4525)
  Fine-grained: Fix crash caused by unreachable class (python#4613)
  Treat divmod like a binary operator (python#4585)
  Sync typeshed (python#4605)
  Fine-grained: Don't infer partial types from multiple targets (python#4553)
  Fine-grained: Compare symbol table snapshots when following dependencies (python#4598)
  Fix type of forward reference to a decorated class method (python#4486)
  ...
yedpodtrzitko pushed a commit to kiwicom/mypy that referenced this pull request Mar 15, 2018
Fixes python#4394

This covers all common cases and majority of corner cases:
*Simple aliases
*Generic aliases
*Forward references
*Nested aliases
*Chained aliases
* Special forms

Note that among tests added some also pass without this addition
(probably because some dependencies are added by coincidence).
Note that I mainly focus on false negatives, since in my experience
while playing with fine-grained daemon, this is the most typical
problem.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants