-
Notifications
You must be signed in to change notification settings - Fork 5
Add stubs for major protocols #20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a great start!
Thanks @NeilGirdhar.
Maybe we can get @34j to comment on how automatic generation might integrate with this project. I imagine that filling out ArrayNamespace
would be the prime target.
100% agree. I thought about it, but I wasn't sure how it would work.
That makes sense! |
@NeilGirdhar can we wait to merge this in until #18 is in? Hopefully we can get some CI up and running to test these additions :). |
Sounds great, thanks for the thorough review! I know that @jorenham wants to review this as well 😄 |
Yea that's right, thanks for the ping. I'll have a look at it soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR!
This PR looks pretty similar to what I tried to build before in optype: https://github.com/jorenham/optype/tree/array-api/optype/array_api. There, I also started with the same enthusiasm as I see here. After spending a good amount of time on it, I realized that none of the protocols would be able to solve any problem in practice. So I'm a bit worried that this will happen again.
One of the mistakes I made, was that I started without being able to "see" anything: I should have instead gone with a test-driven approach. Because without integration tests you can't tell whether the relevant libraries will be compatible with the protocols. And in this optype case, it later realized none of the relevant libraries were compatible.
The other big mistake I made, is that I started without knowing the exact problem that I was trying to solve. This is something I still tend to underestimate, so I know how difficult it is.
And as hypocritical as it might sound, that's why I think we should first figure out what exactly the problems are that we're trying to solve with array-api-typing.
src/array_api_typing/_device.py
Outdated
@final | ||
class Device(Protocol): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it would be useful to be able to structurally type a device, but I don't think that that's possible given the current spec. Right now, this protocol is equivalent to Device = object
. And since I don't see any way how we could expand it later on, I think it might be better to leave the the Device
out. Because I'm worried that this object
alias will give users a false sense of type-safety.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point about type-safety illusions, but I think we should at least have Device = object
so that other libraries have something to point to. Even without type safety, just writing xpt.Device
is a nicer annotation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what I do in my code when I need to annotate a Jax random number generator key. I find that key: KeyArray
is a lot easier to read that key: Array
. And, in theory, if we ever get dtype support for Jax array annotations, then we make KeyArray = jax.Array[KeyDType]
.
src/array_api_typing/_namespace.py
Outdated
|
||
def asarray( | ||
self, | ||
obj: Array | complex | NestedSequence[complex] | SupportsBufferProtocol, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pretty heavy restriction. Because this will now reject your asarray
method unless Array | complex | NestedSequence[complex] | Buffer
is assignable to the type of your obj
. That's because function domain types are contravariant, which is a consequence of the Liskov Substitution Principle (LSP).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that's the promise that's shown on the Array API documentation. Did you want a narrower interface?
src/array_api_typing/_namespace.py
Outdated
def astype( | ||
self, | ||
x: A, | ||
dtype: DType, | ||
/, | ||
*, | ||
copy: bool = True, | ||
device: Device | None = None, | ||
) -> A: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if dtype
is passed, the array type A
will be unaffected. That, when combined with the invariance of A
, means that we will not (and cannot) support array types with generic dtype parameter. I'm not sure if that's what we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a really good point. What I hoped for was that:
x: jax.Array
xp = array_api_compat.array_namespace(x)
y = xp.asarray(0.0)
reveal_type(y) # jax.Array
Is there a way to have that without also promising that the dtype of x
and y
are the same?
I guess you could regenerate the entire set of typing stubs for each backend. Is that what we need to do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you choose to make this a package instead of a module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure if we would end up with more modules. But you're right, YAGNI, so should I switch it to a module?
Yes, this is one of the main issues with the design decision to design the libraries first and the annotations second. If we had done it the other way around with the interfaces first and then have the libraries explicitly inherit from the interfaces, there wouldn't be a problem. However, I think we already have access to an easy solution. We don't need the libraries to conform to the protocols. We just need the interface returned by From there, we can apply soft pressure on the libraries to conform more so that the shims slowly disappear. What do you think?
Yes, that makes sense. We're not writing code for its own sake. For me, I would like my efax library to become typed. Currently, I've typed the arrays as Also, I think it means that This is why I simply lifted the annotations from What do you think? |
Sorry @NeilGirdhar. Rebase required. |
Okay, I'm sorry, but I have to make a new PR. I'll link it here |
If
Ah that explains, I thought I recognized them haha. But those protocols there are mostly meant as rough placeholders for the stuff that array-api-typing will properly address. So copying them over here has no additional value in that sense. And as I've explained before, I don't think that these protocols are a good starting point for us, because I don't see they are able to help in solving the problems that array-api-typing is meant to solve. Given the recent rekindled interest in this, I'll soon start working on writing down the plan that I had in mind for array-api-typing, and will write a proof-of-concept implementation. At the moment things are a bit hectic for me, since all of my family's birthdays fall in june, and because of the recent numpy 2.3.0 release and the upcoming scipy 1.16.0 release and the scipy-stubs 1.16.0.0 release (which I published last night: https://github.com/scipy/scipy-stubs/releases/tag/v1.16.0.0). So that will probably have to wait to july or something. |
I think the point is more that 'library compatible with the array API' is currently only true (if it is true) for those namespaces exposed by array-api-compat. i.e. we shouldn't be typing past the spec to accommodate existing incompatibilities—rather, the usual process applies, where issues are opened in the upstream libraries (either the array library or the spec, if the incompatible behaviour seems reasonable). There is still value IMO in developing and offering typing in a separate repo and package, as the features of a) providing shims for existing libraries, and b) typing the spec, should be completely orthogonal. Of course, array-api-typing should have integration tests against existing libraries, and these tests may inform design decisions. However much we 'rely' on array-api-compat is I think completely orthogonal from typing-specific concerns. The end-goal is to not have to rely on array-api-compat at all, and have all libraries be fully compliant, but this is true for the whole ecosystem, whether typing is used or not. |
One of the problems that we discussed in #22 is to be able to annotate code. I want to be able to do:
I realize that it will be a giant engineering feat to accomplish the actual definitions. I just want something I can import and use while I wait for the definitions.
Great!
No rush!
100% agree. You can also work around those incompatibilities in array-api-compat while you wait for them to fix the incompatibilities.
100% agree. But I do think that there's a bit too much optimism about libraries typing things properly. Unless you both plan on devoting a huge amount of time to Jax, Pytorch, etc., then from what I've noticed, those teams are not prioritizing type annotations over features and bugs. I expect them to regularly break things, so I don't really expect to get around using array-api-compat. Still, I respect the optimism that array-api-compat should not be necessary, and the protocols should work in the case that someone has typed their library properly. Regarding this PR: If we know the rough classes that PS I recently switched from |
For sure, I don't see array-api-compat going away. But it is still useful to keep in mind that that is the utopian goal, as it helps correctly frame array-api-compat's scope as a shim library, rather than a cornerstone to build up extra features and so on. However much we make use of array-api-compat, it should always be due to things that feasibly could be addressed upstream, were the teams so inclined. |
Perhaps we could provide type-test suite that validates if their static type annotations are compatible with array-api-typing 🤔
That's good to hear!
Yea pretty much. There are some things in optype that I wish I would've done differently. But for the most part optype has proven itself to be very useful in practice, e.g. in scipy-stubs. For example, I think that the optypean "one protocol; one method" approach will be a good fit for array-api-typing. Applying this to your example, that would allow us to write it as from typing import Protocol
class HasArrayType[ArrayT](Protocol): # ArrayT is covariant
@property
def array(self, /) -> type[ArrayT]: ...
def f[ArrayT](xp: HasArrayType[ArrayT]) -> ArrayT: ... Now we know the exact array type that the namespace provides. Even if the library's type annotations of their array type is a mess. Note that none of that requires a full-fledged |
Funny, we had the exact same idea! #30 |
Oh haha I missed that one. I guess this proves that great minds indeed think alike ;) |
I strongly agree (e.g. see https://cosmology.readthedocs.io/projects/api/latest/api/reference.html)
|
ohh that's pretty |
No description provided.