Skip to content

Add stubs for major protocols #20

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 0 commits into from
Closed

Conversation

NeilGirdhar
Copy link
Contributor

No description provided.

Copy link
Collaborator

@nstarman nstarman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a great start!
Thanks @NeilGirdhar.
Maybe we can get @34j to comment on how automatic generation might integrate with this project. I imagine that filling out ArrayNamespace would be the prime target.

@NeilGirdhar
Copy link
Contributor Author

Maybe we can get @34j to comment on how automatic generation

100% agree. I thought about it, but I wasn't sure how it would work.

I imagine that filling out ArrayNamespace would be the prime target.

That makes sense!

@jorenham jorenham self-requested a review June 11, 2025 19:53
@nstarman
Copy link
Collaborator

@NeilGirdhar can we wait to merge this in until #18 is in? Hopefully we can get some CI up and running to test these additions :).

@NeilGirdhar
Copy link
Contributor Author

Sounds great, thanks for the thorough review! I know that @jorenham wants to review this as well 😄

@jorenham
Copy link
Member

jorenham commented Jun 13, 2025

Sounds great, thanks for the thorough review! I know that @jorenham wants to review this as well 😄

Yea that's right, thanks for the ping. I'll have a look at it soon.

Copy link
Member

@jorenham jorenham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR!

This PR looks pretty similar to what I tried to build before in optype: https://github.com/jorenham/optype/tree/array-api/optype/array_api. There, I also started with the same enthusiasm as I see here. After spending a good amount of time on it, I realized that none of the protocols would be able to solve any problem in practice. So I'm a bit worried that this will happen again.

One of the mistakes I made, was that I started without being able to "see" anything: I should have instead gone with a test-driven approach. Because without integration tests you can't tell whether the relevant libraries will be compatible with the protocols. And in this optype case, it later realized none of the relevant libraries were compatible.

The other big mistake I made, is that I started without knowing the exact problem that I was trying to solve. This is something I still tend to underestimate, so I know how difficult it is.

And as hypocritical as it might sound, that's why I think we should first figure out what exactly the problems are that we're trying to solve with array-api-typing.

Comment on lines 6 to 8
@final
class Device(Protocol):
pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it would be useful to be able to structurally type a device, but I don't think that that's possible given the current spec. Right now, this protocol is equivalent to Device = object. And since I don't see any way how we could expand it later on, I think it might be better to leave the the Device out. Because I'm worried that this object alias will give users a false sense of type-safety.

Copy link
Contributor Author

@NeilGirdhar NeilGirdhar Jun 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point about type-safety illusions, but I think we should at least have Device = object so that other libraries have something to point to. Even without type safety, just writing xpt.Device is a nicer annotation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I do in my code when I need to annotate a Jax random number generator key. I find that key: KeyArrayis a lot easier to read that key: Array. And, in theory, if we ever get dtype support for Jax array annotations, then we make KeyArray = jax.Array[KeyDType].


def asarray(
self,
obj: Array | complex | NestedSequence[complex] | SupportsBufferProtocol,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pretty heavy restriction. Because this will now reject your asarray method unless Array | complex | NestedSequence[complex] | Buffer is assignable to the type of your obj. That's because function domain types are contravariant, which is a consequence of the Liskov Substitution Principle (LSP).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, that's the promise that's shown on the Array API documentation. Did you want a narrower interface?

Comment on lines 33 to 46
def astype(
self,
x: A,
dtype: DType,
/,
*,
copy: bool = True,
device: Device | None = None,
) -> A: ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if dtype is passed, the array type A will be unaffected. That, when combined with the invariance of A, means that we will not (and cannot) support array types with generic dtype parameter. I'm not sure if that's what we want.

Copy link
Contributor Author

@NeilGirdhar NeilGirdhar Jun 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a really good point. What I hoped for was that:

x: jax.Array
xp = array_api_compat.array_namespace(x)
y = xp.asarray(0.0)
reveal_type(y)  # jax.Array

Is there a way to have that without also promising that the dtype of x and y are the same?

I guess you could regenerate the entire set of typing stubs for each backend. Is that what we need to do?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you choose to make this a package instead of a module?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure if we would end up with more modules. But you're right, YAGNI, so should I switch it to a module?

@NeilGirdhar
Copy link
Contributor Author

NeilGirdhar commented Jun 14, 2025

This PR looks pretty similar to what I tried to build before in optype: https://github.com/jorenham/optype/tree/array-api/optype/array_api. There, I also started with the same enthusiasm as I see here. After spending a good amount of time on it, I realized that none of the protocols would be able to solve any problem in practice. So I'm a bit worried that this will happen again.

One of the mistakes I made, was that I started without being able to "see" anything: I should have instead gone with a test-driven approach. Because without integration tests you can't tell whether the relevant libraries will be compatible with the protocols. And in this optype case, it later realized none of the relevant libraries were compatible.

Yes, this is one of the main issues with the design decision to design the libraries first and the annotations second. If we had done it the other way around with the interfaces first and then have the libraries explicitly inherit from the interfaces, there wouldn't be a problem.

However, I think we already have access to an easy solution. We don't need the libraries to conform to the protocols. We just need the interface returned by array-api-compat to do so. If there are any inconsistencies (between protocol and library), these can be shimmed in array-api-compat.

From there, we can apply soft pressure on the libraries to conform more so that the shims slowly disappear.

What do you think?

The other big mistake I made, is that I started without knowing the exact problem that I was trying to solve. This is something I still tend to underestimate, so I know how difficult it is. And as hypocritical as it might sound, that's why I think we should first figure out what exactly the problems are that we're trying to solve with array-api-typing.

Yes, that makes sense. We're not writing code for its own sake.

For me, I would like my efax library to become typed. Currently, I've typed the arrays as jax.Array, but it's almost entirely xpt.Array. I would like to switch it. That means annotating all of my xp variables as xpt.ArrayNamesapce, and then hopefully xp.asarray, xp.sin, etc. will all be typed.

Also, I think it means that array_api_compat.array_namespace(array) where array: A needs to return ArrayNamespace[A].

This is why I simply lifted the annotations from array-api-compat here. Once we have annotations for ArrayNamespace, etc. here, hopefully we can just point the annotations from array-api-compat into array-api-typing as we discussed here and elswewhere.

What do you think?

@nstarman
Copy link
Collaborator

Sorry @NeilGirdhar. Rebase required.

@NeilGirdhar
Copy link
Contributor Author

Okay, I'm sorry, but I have to make a new PR. I'll link it here

@jorenham
Copy link
Member

We don't need the libraries to conform to the protocols. We just need the interface returned by array-api-compat to do so. If there are any inconsistencies (between protocol and library), these can be shimmed in array-api-compat.

If array-api-typing would fully rely on array-api-compat, then there would be no need for it, because then we could just put it all in there. What I had in mind (and I had the idea that I wasn't the only one) was for array-api-typing to be useful for any library that is compatible with the array API.
But even if we would make this an extension of arraya-api-compat, then we would still have to verify that it actually works, otherwise we risk wasting our time.

This is why I simply lifted the annotations from array-api-compat here.

Ah that explains, I thought I recognized them haha. But those protocols there are mostly meant as rough placeholders for the stuff that array-api-typing will properly address. So copying them over here has no additional value in that sense.

And as I've explained before, I don't think that these protocols are a good starting point for us, because I don't see they are able to help in solving the problems that array-api-typing is meant to solve.

Given the recent rekindled interest in this, I'll soon start working on writing down the plan that I had in mind for array-api-typing, and will write a proof-of-concept implementation. At the moment things are a bit hectic for me, since all of my family's birthdays fall in june, and because of the recent numpy 2.3.0 release and the upcoming scipy 1.16.0 release and the scipy-stubs 1.16.0.0 release (which I published last night: https://github.com/scipy/scipy-stubs/releases/tag/v1.16.0.0). So that will probably have to wait to july or something.

@jorenham jorenham mentioned this pull request Jun 18, 2025
@lucascolley
Copy link
Member

We don't need the libraries to conform to the protocols. We just need the interface returned by array-api-compat to do so.If there are any inconsistencies (between protocol and library), these can be shimmed in array-api-compat.

If array-api-typing would fully rely on array-api-compat, then there would be no need for it, because then we could just put it all in there. What I had in mind (and I had the idea that I wasn't the only one) was for array-api-typing to be useful for any library that is compatible with the array API.

I think the point is more that 'library compatible with the array API' is currently only true (if it is true) for those namespaces exposed by array-api-compat. i.e. we shouldn't be typing past the spec to accommodate existing incompatibilities—rather, the usual process applies, where issues are opened in the upstream libraries (either the array library or the spec, if the incompatible behaviour seems reasonable).

There is still value IMO in developing and offering typing in a separate repo and package, as the features of a) providing shims for existing libraries, and b) typing the spec, should be completely orthogonal. Of course, array-api-typing should have integration tests against existing libraries, and these tests may inform design decisions.

However much we 'rely' on array-api-compat is I think completely orthogonal from typing-specific concerns. The end-goal is to not have to rely on array-api-compat at all, and have all libraries be fully compliant, but this is true for the whole ecosystem, whether typing is used or not.

@NeilGirdhar
Copy link
Contributor Author

And as I've explained before, I don't think that these protocols are a good starting point for us, because I don't see they are able to help in solving the problems that array-api-typing is meant to solve.

One of the problems that we discussed in #22 is to be able to annotate code. I want to be able to do:

import array_api_typing as xpt
def f(xp: xpt.ArrayNamespace) -> xpt.Array: ...

I realize that it will be a giant engineering feat to accomplish the actual definitions. I just want something I can import and use while I wait for the definitions.

I'll soon start working on writing down the plan that I had in mind for array-api-typing, and will write a proof-of-concept implementation.

Great!

At the moment things are a bit hectic for me

No rush!

we shouldn't be typing past the spec to accommodate existing incompatibilities—rather, the usual process applies, where issues are opened in the upstream libraries (either the array library or the spec, if the incompatible behaviour seems reasonable).

100% agree. You can also work around those incompatibilities in array-api-compat while you wait for them to fix the incompatibilities.

However much we 'rely' on array-api-compat is I think completely orthogonal from typing-specific concerns. The end-goal is to not have to rely on array-api-compat at all, and have all libraries be fully compliant,

100% agree. But I do think that there's a bit too much optimism about libraries typing things properly. Unless you both plan on devoting a huge amount of time to Jax, Pytorch, etc., then from what I've noticed, those teams are not prioritizing type annotations over features and bugs. I expect them to regularly break things, so I don't really expect to get around using array-api-compat.

Still, I respect the optimism that array-api-compat should not be necessary, and the protocols should work in the case that someone has typed their library properly.


Regarding this PR: If we know the rough classes that xpt will expose, I'd like to provide some stubs for people (including me) to import. Right now, I'm using my own annotations. I would rather be using something closer to what we will ultimately use to save work.

PS I recently switched from numpy.typing to optype recently and found that many of my type-ignores were removable! I imagine that your approach will stick close to optype? If so, that's very exciting.

@lucascolley
Copy link
Member

lucascolley commented Jun 20, 2025

I do think that there's a bit too much optimism about libraries typing things properly ... so I don't really expect to get around using array-api-compat.

For sure, I don't see array-api-compat going away. But it is still useful to keep in mind that that is the utopian goal, as it helps correctly frame array-api-compat's scope as a shim library, rather than a cornerstone to build up extra features and so on. However much we make use of array-api-compat, it should always be due to things that feasibly could be addressed upstream, were the teams so inclined.

@jorenham
Copy link
Member

jorenham commented Jun 20, 2025

But I do think that there's a bit too much optimism about libraries typing things properly.

Perhaps we could provide type-test suite that validates if their static type annotations are compatible with array-api-typing 🤔


PS I recently switched from numpy.typing to optype recently and found that many of my type-ignores were removable!

That's good to hear!

I imagine that your approach will stick close to optype? If so, that's very exciting.

Yea pretty much. There are some things in optype that I wish I would've done differently. But for the most part optype has proven itself to be very useful in practice, e.g. in scipy-stubs.

For example, I think that the optypean "one protocol; one method" approach will be a good fit for array-api-typing. Applying this to your example, that would allow us to write it as

from typing import Protocol

class HasArrayType[ArrayT](Protocol):  # ArrayT is covariant
    @property
    def array(self, /) -> type[ArrayT]: ...

def f[ArrayT](xp: HasArrayType[ArrayT]) -> ArrayT: ...

Now we know the exact array type that the namespace provides. Even if the library's type annotations of their array type is a mess.
By nesting such single-method protocols, we'll be able to figure out very specific things, such as the boolean dtype and the default device type, since those are statically known via __array_namespace_info__().

Note that none of that requires a full-fledged xpt.Array protocol. So I suppose that's why I'm not that concerned about bad annotations in libraries that aren't array-api-compat.

@NeilGirdhar
Copy link
Contributor Author

Perhaps we could provide type-test suite that validates if their static type annotations are compatible with array-api-typing

Funny, we had the exact same idea! #30

@jorenham
Copy link
Member

Perhaps we could provide type-test suite that validates if their static type annotations are compatible with array-api-typing

Funny, we had the exact same idea! #30

Oh haha I missed that one. I guess this proves that great minds indeed think alike ;)

@nstarman
Copy link
Collaborator

"one protocol; one method"

I strongly agree (e.g. see https://cosmology.readthedocs.io/projects/api/latest/api/reference.html)

  • select intersection types: ArrayNamespace, Array

@jorenham
Copy link
Member

(e.g. see cosmology.readthedocs.io/projects/api/latest/api/reference.html)

ohh that's pretty

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants