Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EmailMessage missing from email.parser #2417

Open
jrideout opened this issue Aug 25, 2018 · 8 comments
Open

EmailMessage missing from email.parser #2417

jrideout opened this issue Aug 25, 2018 · 8 comments
Labels
stubs: false positive Type checkers report false errors

Comments

@jrideout
Copy link

If policy.default is set, then methods like email.message_from_file will return EmailMessage rather than just Message

@srittau srittau added stubs: false positive Type checkers report false errors size-small labels Nov 19, 2018
@srittau
Copy link
Collaborator

srittau commented Nov 19, 2018

We have basically three options here:

  • Return Union[Message, EmailMessage]. This does not really improve the situation, since an isinstance() is still necessary when EmailMessage is returned.
  • Return just EmailMessage. This can give false positives, but this is probably preferable to false negatives.
  • Return Any. In this case we lose all advantages of type checking.

Personally, I prefer the second option.

@jrideout
Copy link
Author

jrideout commented Nov 21, 2018

What about overloads?

I think this handles the more common cases:

T = TypeVar('T', Message, EmailMessage)

@overload
def message_from_string(s: str, _factory: Callable[..., T]) -> T: ...

@overload
def message_from_string(s: str) -> Message: ...

@overload
def message_from_string(s: str, policy: EmailPolicy) -> EmailMessage: ...

@jrideout
Copy link
Author

FWIW, I did go with your second option in my most recent project (which forked email for our own reasons). Our email/__init__.pyi looks like:

def message_from_string(s: str, policy: EmailPolicy) -> EmailMessage: ...
def message_from_bytes(s: bytes, policy: EmailPolicy) -> EmailMessage: ...
def message_from_file(fp: IO[str], policy: EmailPolicy) -> EmailMessage: ...
def message_from_binary_file(fp: IO[bytes], policy: EmailPolicy) -> EmailMessage: ...

@srittau
Copy link
Collaborator

srittau commented Nov 21, 2018

Overloads work fine for when a _class is provided, but will not work for policy, until literal types (python/mypy#3062) are implemented. (Your third overload needs an asterisk before the last option, though.)

@jrideout
Copy link
Author

jrideout commented Nov 21, 2018

The overloads proposed don't actually capture the fact that policies themselves can have arbitrary factories which would require literal types. But we are choosing between imperfect options here. Just returning EmailMessage across the board will also be wrong for many use cases. How can we minimize that for likely real usage, while literal types are still in progress?

I took a quick pass at seeing if the override approach was feasible at all with a semi-contrived simplification. My intuition is that, in practice, users of these APIs will likely specify the _class or policy but not both.

from typing import Callable, TypeVar, Union, Type, overload, Any, Optional

class Message:
    def __init__(self, s: str) -> None :
        self.s = 'm' + s

class EmailMessage(Message):
    def __init__(self, s: str) -> None :
        self.s = 'em' + s

class Policy:
    factory: Type = Message

class EmailPolicy(Policy):
    factory: Type = EmailMessage

compat = Policy()
T = TypeVar('T')

@overload
def message_from_string(string: str, factory: Optional[Callable[..., T]], *args: Any) -> T: ...

@overload
def message_from_string(string: str, *args: Any) -> Message: ...

@overload
def message_from_string(string: str, *args: Any, policy: EmailPolicy) -> EmailMessage: ...

def message_from_string(string: Any, factory: Optional[Callable[..., T]] = None, *args: Any, policy: Policy = compat) -> Union[EmailMessage, Message, T]:
    f = factory or policy.factory
    return f(string)


def build_msg(string: str) -> Message:
    return Message(string)

def build_msg2(string: str) -> EmailMessage:
    return EmailMessage(string)

def build_msg3(string: str) -> bytes:
    return string.encode()


eml = message_from_string('abc')                         # Revealed type is 'test_types.Message'
eml2 = message_from_string('abc', build_msg)             # Revealed type is 'test_types.Message*'
eml3 = message_from_string('abc', build_msg2)            # Revealed type is 'test_types.EmailMessage*'
eml4 = message_from_string('abc', Message)               # Revealed type is 'test_types.Message*'
eml5 = message_from_string('abc', policy=EmailPolicy())  # Revealed type is 'test_types.EmailMessage'
eml6 = message_from_string('abc', build_msg3)            # Revealed type is 'builtins.bytes*'
eml7 = message_from_string('abc', build_msg3, 'other')   # Revealed type is 'builtins.bytes*'

eml8 = message_from_string('abc', policy=Policy())
# error: No overload variant of "message_from_string" matches argument types "str", "Policy"
# Revealed type is 'Any'

These results seem sane to me, though I admit I don't understand what the * suffix means. If we hold to the assumption that an explicitly set policy will be an EmailPolicy that produces EmailMessage, then all the other cases work. This fails when that assumptions fails, but seems to be to have small false-positive surface area. And to that question: what is the general policy around this? Err on the side of underspecification over incorrectness? Have a good balance of actual use? I'm happy to help drive this toward whatever standard is in use.

@JelleZijlstra
Copy link
Member

The * suffix in mypy means it's an inferred type, which doesn't really matter unless you're debugging mypy internals.

@andersk
Copy link
Contributor

andersk commented Jun 5, 2020

This problem could be solved by making Policy a generic type.

MessageT = TypeVar("MessageT")
class Policy(Generic[MessageT]): ...

class Message: ...
class Compat32(Policy[Message]): ...
compat32 = Compat32()

class EmailMessage: ...
class Default(Policy[EmailMessage]): ...
default = Default()

def message_from_string(
    string: str,
    factory: Optional[Callable[[], MessageT]] = None,
    *,
    policy: Policy[MessageT] = compat32,
) -> MessageT:
    ...
...

Unfortunately, this is another case where mypy’s insufficient support for generic defaults (python/mypy#3737) gets in the way. But we can work around that bug using overloads.

@overload
def message_from_string(
    string: str,
    factory: Optional[Callable[[], Message]] = None,
) -> Message:
    ...
@overload
def message_from_string(
    string: str,
    factory: Optional[Callable[[], MessageT]] = None,
    *,
    policy: Policy[MessageT],
) -> MessageT:
    ...
def message_from_string(
    string: str,
    factory: Optional[Callable[[], Any]] = None,
    *,
    policy: Any = compat32,
) -> Any:
    ...

@GideonBear
Copy link

Is there anything blocking this issue? Could I try to fix this? If yes, what approach should I take? Is making an overload that returns EmailMessage when policy: email.policy.EmailPolicy (the class of default) enough?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stubs: false positive Type checkers report false errors
Projects
None yet
Development

No branches or pull requests

5 participants