Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Predict enum value type for unknown member names. #9443

Merged
merged 13 commits into from
Sep 17, 2020

Conversation

mgilson
Copy link
Contributor

@mgilson mgilson commented Sep 15, 2020

Description

It is very common for enums to have homogenous member-value types.
In the case where we do not know what enum member we are dealing
with, we should sniff for that case and still collapse to a known
type if that assumption holds.

As a canonical example:

from enum import Enum

class Color(Enum):
    RED = 1
    BLUE = 2

print(Color.RED.value)
reveal_type(Color.RED.value)

def print_color_value(color: Color) -> None:
    reveal_type(color.value)  # mypy reveals this as "Any", but it should be "builtins.int"

print_color_value(Color.BLUE)

Test Plan

New unit tests have been added and all existing tests still pass.

It is very common for enums to have homogenous member-value types.
In the case where we do not know what enum member we are dealing
with, we should sniff for that case and still collapse to a known
type if that assuption holds.
@mgilson
Copy link
Contributor Author

mgilson commented Sep 15, 2020

Here I thought I was opening this PR against my fork -- I guess I need to work against public github more often. Let me know if you want me to create an issue to track this.

@@ -78,6 +78,22 @@ class SomeEnum:
"""
enum_field_name = _extract_underlying_field_name(ctx.type)
if enum_field_name is None:
# We do not know the ennum field name (perhaps it was passed to a function and we only
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# We do not know the ennum field name (perhaps it was passed to a function and we only
# We do not know the enum field name (perhaps it was passed to a function and we only

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good work! Thanks for working on this. I have a few comments.

false = 0

def cannot_infer_truth(truth: Truth) -> None:
reveal_type(truth.value) # N: Revealed type is 'Any'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this shouldn't infer object instead of Any?

In favor of Any: That's what we did before in this case; it prevents false positives.

In favor of object: That's what you'd see for e.g. reveal_type("" if a else 0), assuming a has an unknown value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to just follow the existing art. I imagine that changing to object could cause some code to fail with false positives (If a user is actually using the .value, it seems unlikely that they will be restricting themselves to the interface provided by object so this would probably force them to do additional cast or other methods of type-narrowing)

With that said ... once this is implemented, it shouldn't matter to me either way. Hopefully mypy will know the types of all of my enum values since I see no reason for inhomogenous values and it should never bother me in either case :). I'll do whatever you like.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, let's stick with tradition and keep Any.

mypy/plugins/enums.py Show resolved Hide resolved
if all(node is not None and node.type == first_node_type for node in stnodes):
underlying_type = get_proper_type(first_node_type)
if underlying_type is not None:
return underlying_type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read the docstring of get_proper_type() correctly, it suggest that here you should actually be returning first_node_type. And in fact, perhaps you should probably adjust the all() check to go through get_proper_type() instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a stab at this, but I'm not completely sure what you're saying here :). It's unclear to me whether the result of enum_value_callback should be a ProperType or not. The existing code seems to try to return a ProperType so I just followed suit there. I did change the "all-equal" to compare the entities after figuring out their ProperType though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I think I misunderstood this. I thought that using get_proper_type() would distinguish A from int if we have

A = int
def f(a: A): ...

but it doesn't seem to work that way. So you're forgiven for not understanding me. :-)

I think what you have now is fine.

mypy/plugins/enums.py Show resolved Hide resolved
@@ -92,12 +159,7 @@ class SomeEnum:
# TODO: Consider using the return type of `Enum._generate_next_value_` here?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really sure how we get to here now ... Or previously really ...

@@ -21,6 +21,10 @@ class Enum(metaclass=EnumMeta):
_name_: str
_value_: Any

# In reality, _generate_next_value_ is python3.6 only and has a different signature.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shenanigans are happening here -- I'm not sure the best way to do this. If I use the signature (as written in typeshed), we end up with:

        @staticmethod
        def _generate_next_value_(name: str, start: int, count: int, last_values: List[Any]) -> Any: pass

That pulls in staticmethod which doesn't have a stub in lib-stub. From that point, there are a bunch of different ways we can lie (e.g. just add a self and drop the staticmethod), but it seems to me like a little lie is just as harmful as a big one (maybe more because it's less likely to be noticed).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment is all we need if in the future mypy and/or the test fixtures somehow get clever enough to notice the signature mismatch.

value: Any

def __init__(self) -> None: pass
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why I need to add this. It isn't present in typeshed, but my actual runs still seem to work. Without it, I have two tests that fail with complaints that auto didn't get the right number of arguments:

[case testEnumValueSomeAuto]
from enum import Enum, auto
class Truth(Enum):
    true = 8675309
    false = auto()

def infer_truth(truth: Truth) -> None:
    reveal_type(truth.value) # N: Revealed type is 'builtins.int'

And

[case testEnumValueCustomAuto]
from enum import Enum, auto
from typing import List, Any
class AutoName(Enum):

    # In `typeshed`, this is a staticmethod and has more arguments,
    # but I have lied a bit to keep the test stubs lean.
    def _generate_next_value_(self) -> str:
        return "name"

class Truth(AutoName):
    true = auto()
    false = auto()

def infer_truth(truth: Truth) -> None:
    reveal_type(truth.value) # N: Revealed type is 'builtins.str'

If anyone can help me reason through why that would be, I'm happy to dig deeper.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's because auto inherits Enum.__new__ and that has a mandatory argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the same thought, but as far as I can tell, the same thing happens in the official stubs in typeshed -- but we don't see this complaint when I run it for real. It also only complains for these two tests (which admittedly are doing something a little special with auto) -- even though there are a lot of other tests that use auto. I've never really seen mypy come up with a different required signature for something based on surrounding context. Maybe there is something interesting happening in the plugin that I haven't quite grokked yet -- particularly when mixed with the slightly less-than-usual stub setup for these tests?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding [builtins fixtures/primitives.pyi] to those two tests resolves this. I think I have an idea: without that, you get int and object from the very basic lib-stub/builtins.pyi, and there int has no __init__. But in fixtures/primitives.pyi there is an int.__init__.

The secret is in the MRO of auto, which is (in either case) auto, IntFlag, int, Flag, Enum, object. With primitives.pyi, we pick up __init__ from int. But without it, we pick it up from Enum... And Enum.__init__ requires an argument.

So it's purely a test issue, and IMO the fix is to add [builtins fixtures/primitives.pyi] and get rid of auto.__init__. (And of the blank line before value: Any.)

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has mushroomed a bit, but I like how it's working now. I have a few style nits (you can ignore those if you want to) and then when you're ready I'll approve and merge. Thanks again for this PR!

value: Any

def __init__(self) -> None: pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's because auto inherits Enum.__new__ and that has a mandatory argument.

mypy/plugins/enums.py Outdated Show resolved Hide resolved
mypy/plugins/enums.py Outdated Show resolved Hide resolved
mypy/plugins/enums.py Outdated Show resolved Hide resolved
mypy/plugins/enums.py Outdated Show resolved Hide resolved
if all(node is not None and node.type == first_node_type for node in stnodes):
underlying_type = get_proper_type(first_node_type)
if underlying_type is not None:
return underlying_type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I think I misunderstood this. I thought that using get_proper_type() would distinguish A from int if we have

A = int
def f(a: A): ...

but it doesn't seem to work that way. So you're forgiven for not understanding me. :-)

I think what you have now is fine.

false = 0

def cannot_infer_truth(truth: Truth) -> None:
reveal_type(truth.value) # N: Revealed type is 'Any'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, let's stick with tradition and keep Any.

@@ -21,6 +21,10 @@ class Enum(metaclass=EnumMeta):
_name_: str
_value_: Any

# In reality, _generate_next_value_ is python3.6 only and has a different signature.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment is all we need if in the future mypy and/or the test fixtures somehow get clever enough to notice the signature mismatch.

int_type = ctx.api.named_generic_type('builtins.int', [])
return int_type
return get_proper_type(node_type.ret_type)
if not (isinstance(proper_type, Instance) or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and!

@gvanrossum
Copy link
Member

gvanrossum commented Sep 17, 2020

I don't understand the Travis failures. These seem to be due to some bad code generated by mypyc. We may have to consult a mypy mypyc expert.

@gvanrossum
Copy link
Member

@TH3CHARLie Could you have a look at the mypyc test failures on Travis here? I see this error:

build/__native_dbbd1b679b3c31e8ff5a.c:972286:15: error: incompatible integer to
      pointer conversion assigning to 'PyObject *' (aka 'struct _object *') from
      'int' [-Werror,-Wint-conversion]
    cpy_r_r26 = 1;
              ^ ~
1 error generated.

How can we debug this further?

@TH3CHARLie
Copy link
Collaborator

TH3CHARLie commented Sep 17, 2020

@TH3CHARLie Could you have a look at the mypyc test failures on Travis here? I see this error:

build/__native_dbbd1b679b3c31e8ff5a.c:972286:15: error: incompatible integer to
      pointer conversion assigning to 'PyObject *' (aka 'struct _object *') from
      'int' [-Werror,-Wint-conversion]
    cpy_r_r26 = 1;
              ^ ~
1 error generated.

How can we debug this further?

I am guessing this is an incorrect true op assignment though I am not sure why the types are incompatible, I am now compiling this branch to find out.

Updates:
The cause is the next operation whose result type doesn't get proper coerced.

@TH3CHARLie
Copy link
Collaborator

TH3CHARLie commented Sep 17, 2020

@gvanrossum I think this is a mypyc bug that mypyc has incorrectly implement the next op since I can reproduce the error with commit b8bdc26 (before all the GSOC work happens).

A minimal example:

from typing import List, Optional

def foo(x: List[int]) -> Optional[int]:
    res = next(
        (i for i in x if i > 4), None)
    if res is None:
        return None
    else:
      return res

@mgilson While we are fixing this mypyc bug, you may try to replace next with explicit list operations to pass the build and refactor this when the fix is merged.

@gvanrossum
Copy link
Member

I think this is a mypyc bug that mypyc has incorrectly implement the next op since I can reproduce the error with commit b8bdc26 (before all the GSOC work happens).

Thanks @TH3CHARLie for the quick diagnosis! Sorry if it sounded like I was pointing a finger to your GSoC work -- I didn't mean that, I just thought you'd be quicker at figuring this out than me, and Jukka's asleep. :-) Do you want to file an issue or do you have a good idea on how to fix it already?

@mgilson If you can rewrite the code without using next(<genexp>, None) we can land this right away. Maybe define a function like this?

def first(it: Iterable[T]) -> Optional[T]:
    for x in it:
        return x
    return None

@TH3CHARLie
Copy link
Collaborator

TH3CHARLie commented Sep 17, 2020

Sorry if it sounded like I was pointing a finger to your GSoC work

I wasn't considering this at all, it's just to give a clear scope of where this would happen. Since the GSOC project involves plenty of refactoring, making sure it is not originating from that helps us to differentiate a migration miss and an existing bug.

Do you want to file an issue or do you have a good idea on how to fix it already?

I am still trying to produce a fix but most likely this would require a specialized next op. I'll file an issue at the mypyc repository to track the progress.

@mgilson
Copy link
Contributor Author

mgilson commented Sep 17, 2020

I removed the next from in there. Hopefully it passes on the next run 🤞

@gvanrossum gvanrossum merged commit 37777b3 into python:master Sep 17, 2020
taljeth added a commit to taljeth/mypy that referenced this pull request Feb 10, 2021
In python#9443, some code was added to predict the type of enum values where
it is not explicitly when all enum members have the same type.

However, it didn't consider that subclasses of Enum that have a custom
__new__ implementation may use any type for the enum value (typically it
would use only one of their parameters instead of a whole tuple that is
specified in the definition of the member). Fix this by avoiding to
guess the enum value type in classes that implement __new__.

In addition, the added code was buggy in that it didn't only consider
class attributes as enum members, but also instance attributes assigned
to self.* in __init__. Fix this by ignoring implicit nodes when checking
the enum members.

Fixes python#10000.
hauntsaninja pushed a commit that referenced this pull request Mar 2, 2021
In #9443, some code was added to predict the type of enum values where
it is not explicitly when all enum members have the same type.

However, it didn't consider that subclasses of Enum that have a custom
__new__ implementation may use any type for the enum value (typically it
would use only one of their parameters instead of a whole tuple that is
specified in the definition of the member). Fix this by avoiding to
guess the enum value type in classes that implement __new__.

In addition, the added code was buggy in that it didn't only consider
class attributes as enum members, but also instance attributes assigned
to self.* in __init__. Fix this by ignoring implicit nodes when checking
the enum members.

Fixes #10000.

Co-authored-by: Kevin Wolf <mail@kevin-wolf.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants