-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Array namespace #685
base: main
Are you sure you want to change the base?
Array namespace #685
Conversation
Signed-off-by: nstarman <nstarman@users.noreply.github.com>
Signed-off-by: nstarman <nstarman@users.noreply.github.com>
Signed-off-by: nstarman <nstarman@users.noreply.github.com>
029b759
to
10b9e36
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @nstarman. The protocol for the __array_namespace__
return value sounds good.
Can you explain the arange
change? Is it only because the return type is otherwise difficult for static type checkers to understand because there's no input array? Documentation wise it doesn't look good to me to change a function to a class - we should avoid changing how it's rendered in the html docs.
from .creation_functions import arange as ArangeCallable | ||
|
||
|
||
class ArrayAPINamespace(Protocol): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The camelcase works out slightly unfortunately here, but I think it's fine. ArrayApiNamespace
or ArrayAPI_Namespace
are more readable but not consistent to the standard rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed! Alternatively, we could drop the API
bit in the middle?
Then this looks good: __array_namespace__() -> ArrayNamespace: ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we could drop the API bit in the middle?
That sounds like the best idea here to me.
The change in class ArrayAPINamespace(Protocol):
arange: ArrangeCallable Vs class ArrayAPINamespace(Protocol):
@staticmethod
def arange(start: Union[int, float], /, stop: Optional[Union[int, float]] = None, step: Union[int, float] = 1, *, dtype: Optional[dtype] = None, device: Optional[device] = None) → array):
... Per the docs rendering, class types can be manipulated. At https://github.com/cosmology-api/cosmology.api/tree/3f7ae746a166201298ebd8a786249b58aefdfe3d/docs/_ext we have some deep foo doing signature manipulations (@ntessore wrote this). So it's doable to have Also, callback protocols means the following is possible >>> from numpy import arange
>>> from data_apis.array import arange as ArangeCallable
>>> isinstance(arange, ArangeCallable)
True
>>> def mycustomarange(...<correct signature>): ...
>>> isinstance(mycustomarange, ArangeCallable)
>>> True |
Hmm, I am not knowledgeable enough about the latest in static typing to understand the callback rationale here. I do see that in NumPy @BvB93 do you have an opinion about typing |
It's (unfortunately) more contrived than ideal, but the typing itself is perfectly solid. Just to give a bit more background: |
Thanks! And to confirm if I understood it right: we're going to need this method for all functions that don't take an array as an input (mostly array creation functions), and not for anything else, right? Or would we have to change all functions to classes with |
The latter I'm afraid, or at least for all functions that you'd like to include in the array namespace protocol. At best you might be able to reuse a single protocol multiple times for representing for different functions with identical signatures, but I imagine that this could cause issues with docstrings and such. |
Hmm. I wonder if it would be feasible to codegen type stubs here? If for every function |
I would suggest to do the reverse, to code gen the function from the class and use the code-gen'ed function in the docs. The call-back protocols are useful for type-checking: both statically and at runtime (see #685 (comment) for examples). My hope is to have this library be installable and useful for type-checking purposes and as ABCs (protocols are ABCs when used as a base class). Avoiding magic and meta-coding in the actual library will help in achieving that goal. For communication purposes in the docs I agree that a function is the best representation, hence code-gen the function for the docs, and the |
Once we have some parser for either automatically going from |
I think it ends up mattering in a few cases:
Composing that list, I guess it's the same for anyone using a compiled wheel, but for developers close to the code having to code-gen the type-correct classes from functions will be a pain and make the code effectively equivalent to being written in a compiled language a la https://xkcd.com/303. |
I'm not too worried about the users of the installed package, I'm thinking more from the point of view of working on the standard itself. In the end that is the primary purpose here: authoring a well-documented API standard. These things are functions in the standard after all, so having it all converted to classes with Also from a code reusability point of view: the current functions are idiomatic. You can copy them and fill in the body of each function in order to get a standard-compliant implementation.
This one is easy to fix, as is an in-place build when using an IDE (those are the same effectively) - you can run the codegen as part of the install.
I think this one may be the only issue, because Mypy is bad at running against an installed package. It'd be an extra one line to install the |
I second @rgommers opinion that we should continue authoring as functions, rather than as classes. Authoring as classes increases authoring complexity and just raises contribution barriers. I'd prefer to hide this complexity behind automation from functions to class conversion. |
I did some trial and error yesterday and managed to cobble a script together for automatically carrying out this
The biggest obstacle would probably be a setuptools >=64 regression that breaks the type checking of editable installations. There are workarounds for this though: python/mypy#13392 (comment). |
@BvB93 thanks for pointing out that editable install issue. I'll note that that will also affect |
Right, I did a little bit more reading and some trial & error; I think we might be in the clear here: the relevant issue seems to only apply when (a package A is installed in editable mode and (b package B imports from package A, with B now unable to see A's annotations. So this could be a potential downstream annoyance for packages that want access to the array api's annotation, but it shouldn't affect the array api repo itself. |
Proof of concept.
Benefits:
__array_namespace__
.arange
is now a statically understood signature, e.g.func: arange
is meaningful.Fixes #267