Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support of typing.Union[XYZ, None] #6333

Merged
merged 23 commits into from
Sep 12, 2024
Merged

Conversation

matusvalo
Copy link
Contributor

@matusvalo matusvalo commented Aug 8, 2024

This PR adds support for alternative definition of optional [1] type.

Partially implements #6254

[1] https://docs.python.org/3/library/typing.html#typing.Optional

Comment on lines 1280 to 1282
def analyse_as_type(self, env):
return py_none_type

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this allow code like cdef None x or x: None = 25 ? It seems to me that Optional[] or Union[] might just be special enough to specifically look for None in their type arguments rather than treating None as a type generally. None really isn't a type, it's a value that is allowed by the type system for practical syntax purposes (because writing NoneType would be awkward).

Copy link
Contributor Author

@matusvalo matusvalo Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this allow code like cdef None x or x: None = 25 ?

cdef None is not allowed even in this PR - I can add test for that if needed:

(cython311) root@c113a6388b51:~/dev/cython/test# cat aa.pyx
cdef None x
(cython311) root@c113a6388b51:~/dev/cython/test# cython -3 aa.pyx

Error compiling Cython file:
------------------------------------------------------------
...
cdef None x
     ^
------------------------------------------------------------

aa.pyx:1:5: 'None' is not a type identifier

And x: None is already allowed in master (I suspect it fallbacks to object)

It seems to me that Optional[] or Union[] might just be special enough to specifically look for None in their type arguments rather than treating None as a type generally.

We need to have None type here to distinguish two cases:

  1. None value in type definition (e.g. Union[None, ExtentionType]) and
  2. unknown type (e.g. Union[ExtentionType, NonExistingType]) - in this case we are having None value in specialize_here() function:

https://github.com/cython/cython/pull/6333/files#diff-7f63ffd84e67d3d113d7ad10fa110111c779a8ca5edd7e12dc51097b36405eabR4773-R4785

None really isn't a type

I don't fully agree. There is None type but is not part of public CAPI [1] of python (and this PR it uses just internal representation of this type - not using python CAPI):

>>> type(None)
<class 'NoneType'>

In general, I am using None type only to distinguish None in type from unknown type. (Moreover it makes sense to me to use type for None) If there is better/other solution I am fine to change the approach.

[1] https://docs.python.org/3/c-api/none.html

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we find None in code, we don't replace it with None as value, but with a NoneNode. That makes it easy to distinguish from the value None (as in "no node").

As you correctly wrote, the Python type is called NoneType, not None. The fact that Python uses None in the type system instead is simply that for users, typing NoneType is very unusual. Many programmers probably have never seen it but know None only as a constant value. Thus, None is ubiquitous and easy to read and understand as part of a type annotation – it allows None as a value, thus making the typed value optional.

However, that doesn't mean that None is a type, or that we have to represent it as a type in Cython. Just check the type declaration node for node.is_none to see if we found None in the source code.

Copy link
Contributor Author

@matusvalo matusvalo Aug 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, that doesn't mean that None is a type, or that we have to represent it as a type in Cython. Just check the type declaration node for node.is_none to see if we found None in the source code.

The core of this PR is following part:

    def specialize_here(self, pos, env, template_values=None):
        if py_none_type in template_values:
            self.contains_none = True

https://github.com/cython/cython/pull/6333/files#diff-7f63ffd84e67d3d113d7ad10fa110111c779a8ca5edd7e12dc51097b36405eabR4782-R4784

Here, we are detecting whether annotation contains None (E.g. Union[XYZ, None]). I double checked it and without introducing NoneType, this code gets literal None value for both cases:

  • Union[int, None]
  • Union[int, DummyType]

Hence, without further changes I cannot distinguish following cases:

  1. non-recognized type - in this case we should evaluate annotation as object
  2. None value - in this case the type is candidate for evaluation of optional type - e.g. Union[]

Note

To replicate it, just use master and put debug print/breakpoint in following line:

def specialize_here(self, pos, env, template_values=None):

Both cases

import typing

def main(a: typing.Optional[Bar]):
    pass

and

import typing

def main(a: typing.Optional[None]):
    pass

contains value [None] in variable template_values.

Hence, when I want to avoid introducing NoneType we need to introduce major structural changes how annotation is evaluated at least in following function (which feeds specilize_here():

def _analyse_template_types(self, env, base_type):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to introduce major structural changes how annotation is evaluated at least in following function

Or implement the PR in totally different way, but in that case I have no idea how...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code gets literal None value for both cases:

* `Union[int, None]`

* `Union[int, DummyType]`

Then we need to make both distinguishable. A way to do that is to introduce NoneType as type and replace a literal None in type annotations with it.

I'm unsure whether that also applies to the func() -> None return type case, where the actual meaning is "has no return value" rather than "returns the constant value None". But it definitely applies to type declarations like the above.

Is there a reason, BTW, why we cannot represent Union[xyz, None] as Optional[xyz], which we already support?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we need to make both distinguishable. A way to do that is to introduce NoneType as type and replace a literal None in type annotations with it.

Yes, I think this PR is doing pretty that.

I'm unsure whether that also applies to the func() -> None return type case, where the actual meaning is "has no return value" rather than "returns the constant value None". But it definitely applies to type declarations like the above.

I am not sure if it is important in context of Cython. In this PR Cython generates function which returns PyObject * value which is correct (I suppose)

Is there a reason, BTW, why we cannot represent Union[xyz, None] as Optional[xyz], which we already support?

Honestly, the only reason is to more complete support of PEP 484.

Cython/Compiler/PyrexTypes.py Outdated Show resolved Hide resolved
@da-woods
Copy link
Contributor

One observation from investigating the none-type bool issue. The following is currently an error on master:

import cython

@cython.cfunc
def f() -> None:
    pass

I suspect that changes on with this PR. I'm not yet quite sure exactly how it should change but it's probably worth considering.

@da-woods
Copy link
Contributor

So the specific failure that __bool__ fixes is:

def test_fused_memslice_dtype(cython.floating[:] array):
    # ...
    if array is None:
        print(cython.typeof(array))
        return

That's obviously intended to be a none test, but is being interpreted as "if array is the type given by None".

I think the easiest fix is to change NoneNode.analyse_as_type to:

def analyse_as_type(self, env):
        if not env.in_c_type_context:
            return py_none_type

so it only gets evaluated like this in annotation typing sort of environments.


I'd declare this with all the other builtin types in Cython/Compiler/Builtins.py.

I retract this comment based on what @scoder says, because it doesn't really match NoneType (the Python object). I think that's just a case of naming it something to make the intent clearer though.

@matusvalo
Copy link
Contributor Author

matusvalo commented Aug 29, 2024

I suspect that changes on with this PR. I'm not yet quite sure exactly how it should change but it's probably worth considering.

Yes indeed this PR allows None as return type in cdef functions. I suppose this is correct since we already allows int return type:

@cython.cfunc
def f() -> int:
    return

I checked the generated C code and the generated function returns PyObject which is correct so I consider it as a bonus improvement. There is one catch though, this PR does not allows this:

@cython.cfunc
def f() -> None:
    return

But this is easy fix we need to change following line:

if not return_type:

to

 if return_type is None: 

(or maybe the fix from comment #6333 (comment) will fix it - not sure right now)

@matusvalo
Copy link
Contributor Author

matusvalo commented Sep 1, 2024

I think the easiest fix is to change NoneNode.analyse_as_type to:

It seems to work. The only side-effect is that we are returning different error for following case:

def union_pytypes(Union[int, None] i):
    pass

Will error with 'object' takes exactly one template argument. (Since None is analysed under C context).

Where by

def union_pytypes(Optional[int] i):
    pass

Will error with 'Optional' is not a type identifier.

Not huge problem since Optional/Union is mainly for type annotation.

@scoder
Copy link
Contributor

scoder commented Sep 6, 2024

Not huge problem since Optional/Union is mainly for type annotation.

I'd say "only" rather than "mainly". It feels wrong to allow them as C style type declarations at all.

Copy link
Contributor

@scoder scoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not convinced that we need to add NoneType as a type. The allows_none() method could just return true for Optional and otherwise run through the template arguments to see if any of the nodes has node.is_none. Isn't that enough?

Cython/Compiler/ExprNodes.py Outdated Show resolved Hide resolved
Cython/Compiler/ExprNodes.py Outdated Show resolved Hide resolved
tests/run/annotation_typing.pyx Show resolved Hide resolved
tests/run/ext_type_none_arg.pyx Show resolved Hide resolved
Cython/Compiler/PyrexTypes.py Outdated Show resolved Hide resolved
Cython/Compiler/PyrexTypes.py Outdated Show resolved Hide resolved
tests/errors/e_typing_errors.pyx Outdated Show resolved Hide resolved
Cython/Compiler/Nodes.py Outdated Show resolved Hide resolved
@scoder
Copy link
Contributor

scoder commented Sep 6, 2024

Regarding the general approach, Union[…, None] is really redundant with Optional[…]. If we wanted to avoid implementing both, we could apply the following two rules:

  • transform Union[…, None] into Optional[Union[…]]
  • transform Union[X] into plain X

Or the other way round, transforming Optional[…] into Union[…, None]. Whichever is simpler to evaluate afterwards.

I'm not saying that we should do this since I can't say what overall effects this has. But I wanted to spell it out as a possible alternative since we already kind-of mentioned it above.

@TeamSpen210
Copy link
Contributor

At runtime, Optional[X] just does Union[X, NoneType], then Union has a special case in the __repr__. Union[X] becomes X, duplicates are stripped, and any nesting is flattened out. The only thing that's kept is the order, since it could change the order runtime checkers do checks in. Otherwise they're indistinguishable. Aside from evaluation order, exceptions, monkeypatching it shouldn't matter if Cython did some normalisation.

@matusvalo
Copy link
Contributor Author

transform Union[…, None] into Optional[Union[…]]

this PR is already doing it - see:

https://github.com/cython/cython/pull/6333/files#diff-3890ebbab0f58440a0c1bf6e456d44c913fc52b535176394c521446193087a56R3997

The core idea of this PR is to

  1. detect annotation Union[XYZ, None] / Union[None, XYZ]
  2. extract nested type XYZ from annotation
  3. continue with analysis with type XYZ and typing.Optionalmodifier.

Copy link
Contributor

@scoder scoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing me to the Union -> Optional transformation. I had missed that.

Comment on lines 3995 to +3999
while modifier_node.is_subscript:
modifier_type = modifier_node.base.analyse_as_type(env)
if (modifier_type and modifier_type.python_type_constructor_name
and modifier_type.modifier_name):
modifiers.append(modifier_type.modifier_name)
modifiers.append('typing.Optional' if modifier_type.allows_none() else modifier_type.modifier_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if a user writes Optional[Union[int, None]]? We'd probably get a duplicate Optional in that case, right? Although I'd guess that that won't even hurt us…

@matusvalo
Copy link
Contributor Author

I'm still not convinced that we need to add NoneType as a type. The allows_none() method could just return true for Optional and otherwise run through the template arguments to see if any of the nodes has node.is_none. Isn't that enough?

This still does not solve the problem that cython internally uses literal None value to represent both cases:

  • None value in template argument - e.g. Union[MyExtType, None]
  • unkown type in template argument - e.g. Union[MyExtType, SomeDummyType]

Without NoneType the two mentioned cases cannot be distinguished. If we don't like new NoneType we can introduce sentinel for unkown types - e.g.

  • new class UnknownType or
  • a singleton object, or
  • other special value (different to None of course)

which will be used in case cython cannot find a type inside the annotation. Not sure what is better solution.

@da-woods
Copy link
Contributor

This still does not solve the problem that cython internally uses literal None value to represent both cases:

I don't think this is right. I think it only uses the None literal to represent an unknown type. Currently NoneNode.analyse_as_type returns None but that's only supposed to represent "unknown".

otherwise run through the template arguments to see if any of the nodes has node.is_none. Isn't that enough?

I think this suggestion is saying:

In TemplatedTypeNode._analyse_template_types, in:

for template_node in self.positional_args:

You could just say if template_node.is_none to identify if it's a NoneNode. In that case you don't have to do template_node.analyse_as_type and so you don't need a sentinel value.

@matusvalo
Copy link
Contributor Author

OK my apologies! Now I got the point of @scoder after reading hint by @da-woods ! It's a huge facepalm for me that I was not able to see it before! It simplified the code by a factor too!

I checked locally the tests and the changes seemed to be OK. I will wait what CI will tell. Big thank you to @da-woods in the meanwhile...

@scoder
Copy link
Contributor

scoder commented Sep 12, 2024

Thanks, yes, that makes the implementation much shorter.

@scoder scoder merged commit c525955 into cython:master Sep 12, 2024
68 checks passed
@matusvalo matusvalo deleted the union_none branch September 12, 2024 10:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants