Question regarding complex type aliases #2029

Closed
ebolyen opened this issue Aug 16, 2016 · 5 comments
ebolyen commented Aug 16, 2016

Hey MyPy team! (Sorry in advance for what is a long post.)

I'm one of the developers on the QIIME 2 project, which uses type annotations to perform runtime coercion, and I wanted to touch base and see if our use of annotations is consistent with the vision of the MyPy team. Ideally we would want our plugin developers to be able to use MyPy without needing to change the annotations they have written for QIIME.

Our goal is to be able to write something like this:

def foo(example: FilePath[ExampleFileFormat], other: some.ObjectView) \
        -> some.OtherView:
    ...

somewhere.register(foo, ...)

Our framework then interprets the __annotations__ of foo at runtime and coerces what we call "Artifacts" (essentially glorified zip files which have semantic types, data provenance, etc.) into the view/model requested by foo, in this case a filepath as a str which points to a file formatted as ExampleFileFormat.

This has a lot of really powerful implications for our end-users and our plugin-developers (which I can expand on if requested), however as we are using annotations to express the required view, we want to make sure we are using them in a way that is consistent with MyPy.
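As a rough illustration of the runtime coercion described above, here is a minimal sketch. The CONVERTERS table and call_with_coercion helper are hypothetical names made up for this example, not QIIME 2 APIs, and raw bytes stand in for Artifacts:

```python
# Hypothetical registry: maps the view type requested in an annotation
# to a converter that builds that view from raw stored data.
CONVERTERS = {
    str: lambda raw: raw.decode("utf-8"),
    int: lambda raw: int(raw.decode("utf-8")),
}


def call_with_coercion(func, **raw_inputs):
    """Convert each raw input to the type named in func's annotations."""
    annotations = func.__annotations__
    coerced = {
        name: CONVERTERS[annotations[name]](raw)
        for name, raw in raw_inputs.items()
    }
    return func(**coerced)


def repeat(text: str, times: int) -> str:
    return text * times


# The caller hands over raw bytes; the function receives str and int.
result = call_with_coercion(repeat, text=b"ab", times=b"3")
print(result)  # ababab
```

The annotated function never sees the raw data, only the view it asked for.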

We are already able to do this with "simplistic" views that are just Python objects, but we are looking to extend this capacity to support on-disk representations (bioinformatics has an absurd number of them). This is especially necessary as we intend to make it simple to wrap existing tools which can only interface with any number of arbitrary file formats.

As I was working on this, I noticed the third paragraph of the "What about existing uses of annotations?" section of PEP 484, which gives me pause. While we think that our use of annotations is consistent with the spirit of the PEP, we are worried that our type alias is closer to an expression under that language. If that is the case, we have no problems dropping our use of __annotations__ and just using some decorator syntax, but we would save a lot of headache if that was known now.

I read through the discussion on adding NewType in issue #1284 (congrats on getting that in, btw!) and was able to cannibalize some of the ideas presented to create a class which MyPy seems to type-check and which holds our "extended" information about the on-disk representation (in this snippet, the __pattern__):

from typing import Generic, TypeVar, cast, GenericMeta

T = TypeVar('T')

class ResourceMeta(GenericMeta):
    _registry = {}

    def __new__(cls, name, bases, dct, pattern=None):
        # I assume a singleton factory is necessary as MyPy and Python are 
        # nominally typed, so the same expression should yield the same
        # instance
        if pattern not in cls._registry:
            # Just return the value outright at runtime (should be str)
            dct['__new__'] = lambda cls, value: value
            cls._registry[pattern] = super().__new__(cls, name, bases, dct)

        return cls._registry[pattern]

    def __init__(self, name, bases, dct, pattern=None):
        # The particular resource pattern that we want to keep track of
        # for runtime coercion
        self.__pattern__ = pattern

    def __getitem__(self, pattern):
        return self.__class__(self.__name__, self.__bases__,
                              dict(self.__dict__), pattern=pattern)


class FilePath(str, Generic[T], metaclass=ResourceMeta):
    pass


class ExampleFileFormat:
    pass

Unfortunately I don't know nearly enough about how MyPy works internally to really appreciate how wrong this likely is. But it does seem to "work" in that at runtime the variables are of type str while the __annotations__ retain the information we need to convert the source data into what is expected by the function, and MyPy seems satisfied so long as the values are constructed via FilePath.

I guess ultimately my question boils down to, where do you see the future of type annotations going, and how extensible do you expect it to be? Right now it seems relatively closely coupled to the expectations of MyPy. Will complex type-aliases (or other type-hints which hold more information than a type-checker would know to care about) be a "supported" feature, or is this really just a hack which could break at any time (assuming it isn't already broken in subtle ways)?

Thanks for reading!

gvanrossum (Member) commented

Thanks for writing! I think the tl;dr is "please don't do that". I'll try to elaborate a bit, but feel free to ask more questions -- I'm not sure I've completely fathomed what you're proposing.

The key concept here is that mypy has fairly fixed notions of what it accepts when it is checking the code, and that the typing module provides runtime support that makes it so that whatever syntax mypy accepts (e.g. List[int]) is actually valid at runtime.

Leaving annotations in __annotations__ for runtime inspection is mostly a pre-existing feature (dating back to PEP 3107), and I think if we had been designing the syntax and runtime behavior of annotations at the same time as PEP 484, we probably would not have had __annotations__ at all.

Now that we have it, we make do as well as we can, but there are many PEP 484 features that leave odd things in __annotations__, e.g. forward references, type variables and unions.
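To make the forward-reference case concrete, here is a small sketch (the Node class is made up for illustration). The raw __annotations__ entry is just a string, and typing.get_type_hints is what resolves it into a real type:

```python
from typing import List, get_type_hints


class Node:
    # "List[Node]" must be quoted because Node is not yet defined
    # while the class body is being executed.
    def children(self) -> "List[Node]":
        return []


# What is actually stored at runtime: an unevaluated string.
raw = Node.children.__annotations__
print(raw)

# get_type_hints evaluates the string in the function's namespace.
resolved = get_type_hints(Node.children)
print(resolved)
```

A framework that reads __annotations__ directly has to cope with such strings itself, or go through get_type_hints.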

One thing that I am firmly committed to is that mypy is optional, and if you don't run mypy over your code, you can do whatever you want in your annotations. But if you have things that mypy doesn't natively understand, you should probably use the PEP 484 @no_type_check decorator on your class or function (or use @no_type_check_decorator, although mypy doesn't support it yet). Or you can put # type: ignore at the top of the file, a bigger hammer (also not yet supported but on the books).
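A minimal sketch of the @no_type_check route, assuming a framework that reads arbitrary annotation values at runtime (the summarize function and its dict annotations are invented for this example):

```python
from typing import get_type_hints, no_type_check


@no_type_check
def summarize(table: {"format": "tsv"}) -> {"format": "json"}:
    # The annotations are arbitrary objects, not types. The decorator
    # tells type checkers to leave this function alone.
    return table


# A framework can still read the raw annotation values at runtime.
runtime_view = summarize.__annotations__
print(runtime_view["table"])

# get_type_hints honors the decorator and reports no hints at all.
checker_view = get_type_hints(summarize)
print(checker_view)
```

The trade-off is the one described above: the function is opted out of type checking entirely, so it gets the runtime behavior but none of the static analysis.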

What you're doing in the snippet is relying on the fact that mypy (currently) doesn't look at the metaclass used for generic classes, but that could change at any time. (We don't have an issue that plans to do so, but there's nothing in PEP 484 that says a type checker couldn't try to thoroughly understand how metaclasses are used, and it might well flag this as a problem.)

I think that final bit is your answer. What you should do at this point depends on whether you expect to get some good use out of mypy or whether you'd rather just continue using annotations to control your framework at runtime. Trying to do both seems problematic.

ebolyen commented Aug 17, 2016

> The key concept here is that mypy has fairly fixed notions of what it accepts when it is checking the code, and that the typing module provides runtime support that makes it so that whatever syntax mypy accepts (e.g. List[int]) is actually valid at runtime.

This makes my explorations of source code for the typing module make so much more sense! I spent a lot of time trying to understand how MyPy interacted with that module so that I might leverage that machinery as well, but it doesn't (for obvious reasons in retrospect).

> One thing that I am firmly committed to is that mypy is optional, and if you don't run mypy over your code, you can do whatever you want in your annotations. But if you have things that mypy doesn't natively understand, you should probably use the PEP 484 @no_type_check decorator on your class or function (or use @no_type_check_decorator, although mypy doesn't support it yet). Or you can put # type: ignore at the top of the file, a bigger hammer (also not yet supported but on the books).

This makes a lot of sense, and assuming we didn't go with a decorator based syntax, this is probably the correct thing to do.

> What you're doing in the snippet is relying on the fact that mypy (currently) doesn't look at the metaclass used for generic classes, but that could change at any time. (We don't have an issue that plans to do so, but there's nothing in PEP 484 that says a type checker couldn't try to thoroughly understand how metaclasses are used, and it might well flag this as a problem.)

Thank you for explaining this, it makes sense and it would be a reasonable thing to expect a future type-checker to consider (I can't even fathom how it would reason about the constructed class, but if it could, it would be very powerful).

Before I drop this line of reasoning for good I would like to make a final argument (or comparison rather):

Currently there is NewType which allows us to alias semantic concerns to a data-type, the canonical example being NewType('UserId', int). The meaning of UserId is irrelevant to MyPy, only the "sub-typed-ness" of the alias is considered (if I understood the PEP correctly).
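For reference, a short sketch of that canonical example (the lookup function is added here only for illustration):

```python
from typing import NewType

# To a type checker, UserId is a distinct subtype of int.
UserId = NewType("UserId", int)


def lookup(user_id: UserId) -> str:
    return f"user-{user_id}"


# At runtime, calling UserId just returns its argument unchanged.
uid = UserId(42)
print(type(uid))    # <class 'int'>
print(lookup(uid))  # user-42
```

A checker would reject lookup(42) because a bare int is not a UserId, but no wrapper object exists at runtime.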

What I need is essentially the same thing, but with a slightly more complex grammar for the alias. As a terrible example off the top of my head: NewType('FilePath', str, generic=TypeVar('T')) such that FilePath[<variant of T>] becomes a valid alias for str and follows the same identity rules as a normal alias. Granted, this is only a useful concept if something other than MyPy were also consuming the annotation. In our case this would hint to our framework to convert a file of a different known format into the format defined by the annotation, transparently to the annotated function. In this way the annotation becomes a really powerful thing for the function to use, as it gets more than just static analysis in that trade.

In the spirit of "if you give a mouse a cookie": you have simple type aliases, but now I need/want type aliases of arbitrary composite types (or even just generic types would be fine).

I of course completely understand if this is out of scope for MyPy.

ebolyen commented Aug 17, 2016

In a related idea, one might also imagine parametrized type aliases, where a TypeVar is shared between the alias and the supertype, but dealing with the variance of the type variable sounds kind of painful in that context.

gvanrossum (Member) commented

It looks like your need/proposal is just about the opposite of NewType: With UID = NewType('UID', int) we define a new type that's a subclass of int from mypy's POV but a plain int at runtime. But what you want is something that is (say) a plain str to mypy but has an extra adornment at runtime.

That's not an entirely unreasonable thing to ask for, but it requires a discussion in the tracker for PEP 484: https://github.com/python/typing/ I will close this issue, but feel free to open a new issue there, referencing the discussion here. (Though it would be good to describe what you're asking without the context of the discussion here, since your first post was based on an incomplete understanding of the relationship between mypy and typing.py.)
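For readers on later Python versions: the "plain str to mypy, extra adornment at runtime" idea can be sketched with typing.Annotated, which was added in Python 3.9 (PEP 593) and so postdates this thread. This is only an illustration under that assumption, not something proposed in the discussion; ExamplePath and the marker class are made-up names:

```python
from typing import Annotated, get_type_hints


class ExampleFileFormat:
    """Hypothetical marker for an on-disk file format."""


# A type checker sees plain str. The second argument is metadata
# that only runtime consumers look at.
ExamplePath = Annotated[str, ExampleFileFormat]


def foo(example: ExamplePath) -> int:
    return len(example)


# Default view: the metadata is stripped, leaving just str.
plain = get_type_hints(foo)["example"]
print(plain)

# Opting in to extras exposes the runtime adornment.
hints = get_type_hints(foo, include_extras=True)
metadata = hints["example"].__metadata__
print(metadata)
```

A framework could inspect __metadata__ to decide how to produce the str, while a type checker keeps treating the parameter as an ordinary str.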

> parametrized type aliases, where a TypeVar is shared between the alias and the supertype

Honestly I cannot even parse that phrase. :-( Also something to bring up on the PEP 484 tracker, perhaps in a separate issue? Or forget about it. :-)

ebolyen commented Aug 18, 2016

Thanks @gvanrossum! This has been a really helpful conversation for me and I will work on creating a more specific and detailed request on the PEP 484 issue tracker!

Sorry about the hard-to-parse comment. It was a random thought that occurred to me afterwards, but it isn't actually relevant to anything I need anyway.
