type refinement in function call #5206

Closed
dfroger opened this issue Jun 12, 2018 · 21 comments

@dfroger
Contributor

dfroger commented Jun 12, 2018

This is ok for mypy:

from typing import Union

def foo(x: Union[int, str]):
    if isinstance(x, int):
        return x + 10

but how can I make it work if the isinstance test is in a function call?

from typing import Union

def check(x: Union[int, str]) -> bool:
    return isinstance(x, int)

def foo(x: Union[int, str]):
    if check(x):
        return x + 10

I get:

foo.py:8: error: Unsupported operand types for + ("Union[int, str]" and "int")
@JelleZijlstra
Member

This could perhaps be done with literal types in the future, something like this:

@overload
def check(x: int) -> Literal[True]: ...
@overload
def check(x: str) -> Literal[False]: ...

@gvanrossum
Member

gvanrossum commented Jun 12, 2018 via email

@gwerbin
Contributor

gwerbin commented Jun 12, 2018

I've had this issue recently too. I definitely think Mypy would benefit from some way to refine types by following conditional paths. Unfortunately implementing that kind of logic is well beyond my expertise.

AFAIK you would still have the problem if you wrote if True instead of if check(x), so having literal return types probably won't fix this specific problem.

@elazarg
Contributor

elazarg commented Jun 13, 2018

Related: #1203, #2357, #4063, #3062

@ilevkivskyi
Member

Here I copy the relevant link from one of the above issues I just closed: microsoft/TypeScript#1007

@dbrgn

dbrgn commented Dec 9, 2019

I've found the TypeScript type guards to be a very useful concept, especially when dealing with structured data coming in over the network (e.g. a JSON API).

This is the TS syntax:

function isNumber(x: any): x is number {
    return typeof x === "number";
}
function isFish(pet: Fish | Bird): pet is Fish {
    return (pet as Fish).swim !== undefined;
}

That syntax is probably not feasible in Python (I don't think pet is Fish can be used as return type annotation, but can't find the definition for TYPE_COMMENT in the grammar).

Maybe a placeholder type (def is_fish(pet: Union[Fish, Bird]): TypeAssertion) could be used, which would tell the type checker to "inline" the type hint function when determining the type refinement in an if statement (or another similar control structure)?

Maybe TypeAssertion[pet=Fish] could work as well? Depends on the grammar of course.

@JelleZijlstra
Member

Grammatically type annotations are just expressions, so something like this:

def is_number(x: object) -> x is int:
    pass

is grammatically valid. However, it has some possible problems:

  • The name x is not necessarily defined at this point, so if you try this right now you'll likely get NameError: name 'x' is not defined. This is not a problem if from __future__ import annotations is on, because then Python will not attempt to evaluate the annotation.
  • The is operator cannot be overloaded at runtime, so it's going to be hard to provide runtime introspection of this sort of annotation, although again this is mitigated by from __future__ import annotations.

In contrast, your TypeAssertion[x=int] syntax is invalid at runtime, so it would be much harder to add than -> x is int.

In any case, the bigger difficulty for getting something like this into mypy is going to be first convincing people that it's a good idea and second getting an implementation written and merged into mypy. Syntax is easy. :)
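
To illustrate the point that annotations are grammatically just expressions, here is a minimal sketch using only the standard ast module. It parses the definition without executing it, so the undefined name x never gets evaluated. The variable names (source, func) are only for illustration.

```python
import ast

# Source text of a function whose return annotation is the expression `x is int`.
source = "def is_number(x: object) -> x is int:\n    pass\n"

# Parsing succeeds: the grammar accepts any expression as an annotation.
tree = ast.parse(source)
func = tree.body[0]

# The return annotation is an ordinary comparison node using the `is` operator.
annotation = func.returns
print(type(annotation).__name__)         # Compare
print(type(annotation.ops[0]).__name__)  # Is
print(ast.unparse(annotation))           # x is int
```

Parsing only shows that the syntax is accepted; whether evaluating the annotation at definition time raises depends on how the running Python version handles annotations.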

@dbrgn

dbrgn commented Dec 9, 2019

first convincing people that it's a good idea

What would be a reason against this (besides implementation complexity)? It considerably helps when dealing with external data, because it combines runtime validation and static type checks.

@DustinWehr

DustinWehr commented Dec 26, 2019

@JelleZijlstra would implementation be easy too? Could you not typecheck a call is_number(e) to

def is_number(x) -> x is int:  # or whatever syntax makes the cut
    pass

by just virtually (or literally) replacing it with (is_number(e) and isinstance(e,int))?

@DustinWehr

DustinWehr commented Dec 26, 2019

How about a bundled do-nothing decorator for the syntax?

@mypypred(t=int)
def is_number(x) -> bool:
    pass

@lubieowoce

by just virtually (or literally) replacing it with (is_number(e) and isinstance(e,int))

@DustinWehr a literal isinstance() wouldn't work bc it can't be used with e.g. generic types (as they're erased at runtime). but i guess the machinery used for cast() could be reused here

@DustinWehr

@lubieowoce I would've guessed the static analyzer's use of isinstance isn't encumbered by that runtime constraint. I haven't looked at the code to see the approach taken though. It seems to understand at least simple boolean combinations of isinstance expressions (e.g. isinstance(x,T) and not isinstance(x,T2)). Maybe they're doing a brute force enumeration of the possible type narrowings along each path (approach I would implement first), or something more complex like symbolic execution.

@DustinWehr

DustinWehr commented Dec 30, 2019

I see that this was already suggested in #7870 by @gantsevdenis. Good idea imo.

How about a bundled do-nothing decorator for the syntax?

@mypypred(t=int)
def is_number(x) -> bool:
    pass

@lubieowoce

lubieowoce commented Dec 31, 2019

@DustinWehr i guess i was being a bit pedantic :) btw if you want to take a look, the logic for isinstance checks seems to start in mypy.checker.find_isinstance_check:

mypy/mypy/checker.py

Lines 3798 to 3814 in 9101707

def find_isinstance_check(self, node: Expression
                          ) -> Tuple[TypeMap, TypeMap]:
    """Find any isinstance checks (within a chain of ands). Includes
    implicit and explicit checks for None and calls to callable.
    Return value is a map of variables to their types if the condition
    is true and a map of variables to their types if the condition is false.
    If either of the values in the tuple is None, then that particular
    branch can never occur.
    Guaranteed to not return None, None. (But may return {}, {})
    """
    if_map, else_map = self.find_isinstance_check_helper(node)
    new_if_map = self.propagate_up_typemap_info(self.type_map, if_map)
    new_else_map = self.propagate_up_typemap_info(self.type_map, else_map)
    return new_if_map, new_else_map

Maybe they're doing a brute force enumeration of the possible type narrowings along each path

i think that's indeed the case if i'm understanding you correctly. in general

if isinstance(x, A):
  # yes
  ...
else:
  # no
  ...

seems to be handled with two different typings of variables, one for the yes branch and one for the no branch. if the condition is isinstance(x, A) or isinstance(x, B), the two "type narrowings" given by the isinstance checks are combined, giving (more or less) x: Union[A, B] etc. see:

mypy/mypy/checker.py

Lines 3953 to 3971 in 9101707

elif isinstance(node, OpExpr) and node.op == 'and':
    left_if_vars, left_else_vars = self.find_isinstance_check_helper(node.left)
    right_if_vars, right_else_vars = self.find_isinstance_check_helper(node.right)
    # (e1 and e2) is true if both e1 and e2 are true,
    # and false if at least one of e1 and e2 is false.
    return (and_conditional_maps(left_if_vars, right_if_vars),
            or_conditional_maps(left_else_vars, right_else_vars))
elif isinstance(node, OpExpr) and node.op == 'or':
    left_if_vars, left_else_vars = self.find_isinstance_check_helper(node.left)
    right_if_vars, right_else_vars = self.find_isinstance_check_helper(node.right)
    # (e1 or e2) is true if at least one of e1 or e2 is true,
    # and false if both e1 and e2 are false.
    return (or_conditional_maps(left_if_vars, right_if_vars),
            and_conditional_maps(left_else_vars, right_else_vars))
elif isinstance(node, UnaryExpr) and node.op == 'not':
    left, right = self.find_isinstance_check_helper(node.expr)
    return right, left

i imagine that a hacky proof of concept impl of type guards could simply add an elif is_typeguard_expr(node): .... i might try adding it later
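
As a rough illustration of the if-map/else-map idea described above, here is a toy model. It is not mypy's implementation: types are modelled as plain sets of class names, conditions as nested tuples, and the helpers and_maps, or_maps and narrow are hypothetical names that only loosely mirror and_conditional_maps and or_conditional_maps.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# Toy model: a "type" is a set of class names; None means the branch is unreachable.
TypeMap = Optional[Dict[str, FrozenSet[str]]]


def and_maps(m1: TypeMap, m2: TypeMap) -> TypeMap:
    """Both conditions hold: keep every narrowing, intersecting shared names."""
    if m1 is None or m2 is None:
        return None
    result = dict(m1)
    for name, types in m2.items():
        result[name] = result[name] & types if name in result else types
    return result


def or_maps(m1: TypeMap, m2: TypeMap) -> TypeMap:
    """At least one condition holds: only names narrowed on both sides survive."""
    if m1 is None:
        return m2
    if m2 is None:
        return m1
    return {name: m1[name] | m2[name] for name in m1 if name in m2}


def narrow(cond, declared) -> Tuple[TypeMap, TypeMap]:
    """Return (if_map, else_map) for a condition, given the declared types."""
    kind = cond[0]
    if kind == "isinstance":
        _, name, cls = cond
        yes = declared[name] & {cls}
        no = declared[name] - {cls}
        return ({name: yes} if yes else None, {name: no} if no else None)
    if kind == "not":
        if_map, else_map = narrow(cond[1], declared)
        return else_map, if_map
    left_if, left_else = narrow(cond[1], declared)
    right_if, right_else = narrow(cond[2], declared)
    if kind == "and":
        return and_maps(left_if, right_if), or_maps(left_else, right_else)
    if kind == "or":
        return or_maps(left_if, right_if), and_maps(left_else, right_else)
    raise ValueError(f"unknown condition kind: {kind}")


# x is declared as Union[A, B, C]; the condition is isinstance(x, A) or isinstance(x, B).
declared = {"x": frozenset({"A", "B", "C"})}
cond = ("or", ("isinstance", "x", "A"), ("isinstance", "x", "B"))
if_map, else_map = narrow(cond, declared)
print(sorted(if_map["x"]))    # ['A', 'B']
print(sorted(else_map["x"]))  # ['C']
```

In this model a type guard call would just be one more condition kind that yields an if-map for its argument, which is roughly what the extra elif branch suggested above would do.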

@vbraun

vbraun commented Jul 30, 2020

Repeating the name x of the argument of the type guard isn't really necessary; the syntax could just be

def is_integer(x: Any) -> IsInstance[int]:   # or TypeAssertion[int] or TypeGuard[int]
    pass

On a related note, a still-open TypeScript feature request is to extend type guards to multiple arguments. It would be nice to cover that use case as well, e.g.

def is_integer_and_string(x: Any, y: Any) -> IsInstance[int, str]:
    pass

@wyfo
Contributor

wyfo commented Feb 14, 2021

Why do we need a boolean return?

A direct solution to this issue would be to implement type-guard functions to return the checked argument, like a checked cast in fact:

def as_integer(x) -> int:
    if not isinstance(x, int):
        raise ValueError("not an int")
    return x

If-else logic can simply be achieved using try/except/else blocks.

try:
    checked_x = as_integer(x)
except ValueError:
    ...
else:
    ...  # use checked_x as an integer in this block

These checked-cast functions can also be used as expressions, which is convenient for cases like expect_an_int(as_integer(x)) (but with a risk of mixing up exceptions raised by expect_an_int and as_integer).

Checked-cast functions could also return an Optional to avoid exceptions (though that may not suit every case):

def as_integer(x) -> Optional[int]:
    return x if isinstance(x, int) else None

if (checked_x := as_integer(x)) is not None:
    ...
else:
    ...

Moreover, if several parameters need to be type-checked, a tuple return can do the job:

def check_int_and_str(x, y) -> tuple[int, str]:
    if not isinstance(x, int) or not isinstance(y, str):
        raise ValueError("bad types")
    return x, y

checked_x, checked_y = check_int_and_str(x, y)

But yes, this solution requires assigning an additional variable (with an additional name), so it's heavier. It already works out of the box and does the job, though, with no PEP required.

@wyfo
Contributor

wyfo commented Feb 14, 2021

That being said, if boolean type-guard functions have to be implemented in the language (and I would be happy to use them to replace my heavier checked casts), why not use PEP 593 Annotated?
By adding a standard type annotation (for Annotated), one could write something like

from typing import Annotated, TypeGuard

def is_integer(x) -> Annotated[bool, TypeGuard(int, "x")]:  # map the type-guard to the corresponding parameter
    return isinstance(x, int)

That could allow type-guarding of several parameters:

def check_int_and_str(x, y) -> Annotated[bool, TypeGuard(int, "x"), TypeGuard(str, "y")]:
    return isinstance(x, int) and isinstance(y, str)

Using a PEP 593 type annotation instead of a whole new type has the advantage of not impacting every tool that uses type annotations (as metadata can just be ignored).
This is also exactly the purpose of PEP 593 as it states:

a type T can be annotated with metadata x via the typehint Annotated[T, x]. This metadata can be used for either static analysis or at runtime.

The type guard is indeed metadata about the function result, but the function still returns a bool.
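
As a minimal sketch of how a tool could read such metadata, the standard typing.get_type_hints function accepts include_extras=True to preserve Annotated. The GuardInfo class below is a hypothetical stand-in for the proposed metadata object, named differently to avoid confusion with the TypeGuard form from PEP 647.

```python
from dataclasses import dataclass
from typing import Annotated, Optional, get_args, get_type_hints


@dataclass(frozen=True)
class GuardInfo:
    """Hypothetical metadata: the guarded type and the parameter it applies to."""
    tp: type
    param: Optional[str] = None


def is_integer(x) -> Annotated[bool, GuardInfo(int, "x")]:
    return isinstance(x, int)


# Tools that don't care about the metadata simply see `bool`.
plain = get_type_hints(is_integer)["return"]

# Tools that do care ask for the extras and unpack them.
annotated = get_type_hints(is_integer, include_extras=True)["return"]
base, *metadata = get_args(annotated)

print(plain)     # <class 'bool'>
print(base)      # <class 'bool'>
print(metadata)  # [GuardInfo(tp=<class 'int'>, param='x')]
```

This only shows the runtime side; whether a static checker would act on such metadata is exactly what the proposal leaves to be specified.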

But yes, this solution seems a little heavier than @vbraun's proposal or PEP 647, but nothing prevents the specification and implementation of my proposal from providing the following shortcuts:

  • when no parameter name is passed to TypeGuard, i.e. TypeGuard(int), it applies to the first parameter (or the second in case of a method)
  • TypeGuard supports subscripting through a __class_getitem__ method, which gives the following result: TypeGuard[T] == Annotated[bool, TypeGuard(T)]

A simple implementation would be:

from typing import Annotated

class TypeGuard:
    def __init__(self, tp, param=None):
        self.tp = tp
        if param is not None and not isinstance(param, str):
            raise TypeError("Type guard parameter mapping must be a string")
        self.param = param

    # __class_getitem__ (rather than __getitem__) is what lets
    # TypeGuard[int] be written on the class itself.
    def __class_getitem__(cls, item):
        return Annotated[bool, cls(item)]

It would then be possible to write

def is_integer(x) -> TypeGuard[int]: ...
# which would give in fact `def is_integer(x) -> Annotated[bool, TypeGuard(int)]`
# which would thus be equivalent to `def is_integer(x) -> Annotated[bool, TypeGuard(int, "x")]`

This is just as easy, but more powerful (it supports arbitrary parameters), and again it means less complexity (no additional SpecialForm) and less impact on existing tools.

@gvanrossum I discovered PEP 647 while writing this comment; is it too late to add this PEP 593-based proposal to it? (Or to debate whether the checked casts of my previous comment are enough?)

@gvanrossum
Member

Let’s keep the discussion in one place, typing-sig.

@JelleZijlstra
Member

Mypy now supports TypeGuard from PEP 647. Closing.

@espetro

espetro commented Oct 4, 2021

Hi everyone! I've created a pretty simple library for refinement types, by making use of PEP 647 (Type guards) and PEP 593 (Annotated types).

Currently, it only works for functions with a @refined decorator, but I'd love to get it to support mypy and other type checkers. Contributions are more than welcome! 😊 🍻

@antonagestam
Contributor

@espetro You might be interested in phantom-types, which works with pre-TypeGuard type guards, e.g. isinstance(). It's also heavily inspired by fthomas/refined. Feel free to reach out in the project's Discussions if you're interested in collaborating! You can see a quick example of how it works in the Getting Started section.

* end of shameless self-plug, sorry for off topic *
