Add advanced indexing support #114
Comments
Hi! Advanced indexing support is definitely on the to-do list (see #1), however it would be extremely inefficient to assign with It isn't a high priority right now, but a PR is always welcome. Also, since you have shown interest, the priority went up. If you want fast writes, If we want to make advanced indexing super-fast for both If you want to interleave reads and writes, Hope this helps! Making |
However, if assigning to |
I'm blocked on this by numba/numba#2560 or #126. |
cc @woodmd would it be good enough if we converted on-the-fly between the most optimal formats for each operation? So writing would automatically convert it to DOK, element-wise ops or reads to COO, for example? |
I don't think having element-wise update is critical for us. The use-case would generally be reading/updating a very large number of elements at once. Probably for updating the easiest approach would be to instantiate a new COO on the fly. |
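The "instantiate a new COO on the fly" approach to bulk updates can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `bulk_update` and the tuple-based coordinate layout are hypothetical and are not part of the sparse library's API.

```python
# Hypothetical COO-like layout: a tuple of coordinate tuples plus matching values.
# Illustration only; this is not how the sparse library stores or updates data.

def bulk_update(coords, data, new_coords, new_data, fill_value=0):
    """Return new (coords, data) with the given elements overwritten.

    The inputs are left untouched, mirroring an immutable COO.
    """
    # Merge through a dict so that the new values win over existing entries.
    entries = dict(zip(coords, data))
    entries.update(zip(new_coords, new_data))
    # Drop entries equal to the fill value so the result stays sparse,
    # and sort so coordinates end up in lexicographic order.
    kept = sorted((c, v) for c, v in entries.items() if v != fill_value)
    out_coords = tuple(c for c, _ in kept)
    out_data = tuple(v for _, v in kept)
    return out_coords, out_data


coords = ((0, 1), (2, 0))
data = (5.0, 7.0)
out_coords, out_data = bulk_update(
    coords, data,
    new_coords=[(0, 1), (1, 1), (2, 0)],  # overwrite, insert, and zero out
    new_data=[9.0, 3.0, 0.0],
)
```

The whole structure is rebuilt once per call, so the cost is spread over every element in the batch, which is why this suits many updates at once but not one element at a time.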
The following indexing pattern no longer works on master. I believe the problem originates at commit 222fb8c:
import sparse
import numpy as np
s = sparse.random((100, 100, 100), density=.1)
idx = np.random.permutation(100)
s[idx]
|
This is correct, it's a known issue, and it was discussed in the PR related to that commit. We sacrificed this feature for performance. If you're interested, I can walk you through how to add full-blown advanced indexing. I have no immediate plans to do this myself, however. |
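For readers unfamiliar with the pattern, indexing the first axis with an integer array can be sketched directly on COO-style coordinates. This is a minimal plain-Python sketch, not the library's implementation: the function name `take_first_axis` is hypothetical, and negative or out-of-range indices are not handled.

```python
from collections import defaultdict


def take_first_axis(coords, data, idx):
    """Index the first axis of a COO-like array with a list of integers.

    coords is a sequence of coordinate tuples, data the matching values.
    Output position i along axis 0 holds the input slice idx[i].
    """
    # Map each requested source row to every output position that asks for it
    # (idx may contain repeats, so one source row can land in several places).
    targets = defaultdict(list)
    for out_pos, src in enumerate(idx):
        targets[src].append(out_pos)

    out = []
    for coord, value in zip(coords, data):
        for out_pos in targets.get(coord[0], ()):
            out.append(((out_pos,) + tuple(coord[1:]), value))
    out.sort()  # keep coordinates in lexicographic order
    return [c for c, _ in out], [v for _, v in out]


coords = [(0, 2), (1, 0), (2, 1)]
data = [10, 20, 30]
new_coords, new_data = take_first_axis(coords, data, idx=[2, 0, 0])
```

Only the stored nonzeros are visited, so the work scales with the number of nonzeros (times how often each row is requested) rather than with the dense shape.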
I'd definitely be happy to help with this! (As long as you don't think it would take too much effort on your part to get me started on it.) |
Great! It is a bit of work to understand, but not too much work to actually implement. I'll say this right now: This isn't a cup of tea for a beginning contributor to this library, and it's completely okay if you're not okay doing this. That said, I'm more than willing to help guide you through it, in case you have questions or need any kind of support (algorithmic or technical). Don't hesitate to ask anything (either on this issue or on any related PR you create), and I'll help where I can. Most of what you need is in https://github.com/pydata/sparse/blob/8f2a9aebe595762eace6bc48531119462f979e21/sparse/coo/indexing.py
A couple of quick side notes:
What you will need to do (no pressure):
A couple of things you might need to know during the implementation phase:
|
@ahwillia Seems like an older thread, but I was wondering if there's been any progress made on this front? |
#343 adds this for the one-dimensional case. |
I would like to have support for advanced indexing for both retrieval and assignment. Ideally I was hoping to find something that could serve as a drop-in replacement for numpy.ndarray for these types of operations. Is this functionality something that would be in the scope of this library? Are there any thoughts on how it should be implemented?
On a related note, I noticed that COO is currently immutable and thus doesn't allow item assignment. However, I wonder if one could support assignment by having COO make an in-place copy of itself. Of course this will be extremely inefficient for updating a single element, but when addressing a large number of elements in parallel the overhead from the copy should be more manageable. Of course, in the documentation you could stress that setting elements of COO individually is not recommended.