-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New array views based on stagedfunctions #8501
Conversation
Oh, also, I really don't think it's practical to switch to returning views until we have #8227. |
I should add that many features of the design are similar to #8235, which you can read for an overview. But basing it on stagedfunctions allows one to avoid the difficult choices between providing enough types and increasing the load time. |
@timholy I think it is time to synchronize our efforts towards array views in base. As far as I can see, your and @jakebolewski's and my efforts have been roughly complementary, but we are approaching overlapping work. Our work has been based on @lindahua's package, but much of the work doesn't depend on the choice of view type. It has proven to be two different problems to introduce one-dimensional and multidimensional range indexing in base. Much of the string related code and the code in We have also started work on multidimensional views and here we are probably overlapping with your efforts. An important step for getting this to work for @lindahua's implementation is to disable the colon rewrite in We should try to figure out if it is easier to merge this pull request and add @jakebolewski's and mine on top on that or the other way around. I think that it is better to merge your work and then add @jakebolewski's and mine on top, i.e. return views from
but there is probably a better solution. |
If we have both types, are you planning on writing an implementation of Getting the colon translation out of the parser might still be a good idea, but with this approach it isn't actually necessary unless you want to extend it to handle Contiguous views (which you could do by adding a 5th boolean type parameter). My hope, though, is that this may soon not be needed anymore (see #8432). However, the work you folks did to make |
It appears to me that the contiguous property is pretty useful to be able to dispatch on, although I'm not sure that I'll use it that much. I'm mainly interested in avoiding copying when doing BLAS operations on a matrix. I must admit that I don't fully understand First of all, is what you call the general case a situation where you want to compute for something like?
If so, it appears to me that this is a special kind of indexing where you want to go through all elements one by one. Next, could you explain (or point to an explanation of) why your approach is "Cartesian"? I don't understand the reference. Regarding the |
Sorry, I should have explained better; I didn't realize that was the issue. By cartesian indexing, I don't mean "uses Base.Cartesian," I mean accessing the elements of a 2d array like this: for j = 1:size(A,2), i = 1:size(A,1)
s += A[i,j]
end rather than as for i = 1:length(A)
s += A[i]
end The reason is simple: in the general case (meaning, where for i = 1:length(A)
i_2, i_1 = divrem(i-1, size(A,1))
s += A[i_1+1, i_2]
end and EDIT: This is also why #8432 is important: it gives users "easy" cartesian indexing without needing to know a darned thing about |
Regarding I use this all the time in my image processing code, because sometimes you want to preserve dimensions ( |
But if we have only one type of view, that means we give up performance for linear indexing of contiguous views of arrays, right? I think we should either do everything we can to make linear indexing as fast as possible or get rid of it entirely. |
We should definitely make linear indexing as fast as possible. But independently, you can integrate ArrayViews's ContiguousView into this SubArray variant: just add a 5th boolean parameter. The Doing this automatically for |
But my recommendation would be to fix #8504, then merge this, then implement contiguous views, then work on the parser. |
That sounds like a good plan to me. |
6c7c7e3
to
1a4c02f
Compare
066f346
to
f6210cd
Compare
We finally have green. So, I've removed the WIP tag. Now is a good time to complain about anything you don't like. Along those lines, I'll point out that this does not support the Meanwhile, I'll look into what would be needed to add support for efficient linear indexing à la |
Hooray! |
@timholy what about having a benchmark to compare these with ArrayViews and ArrayViewsAPL? I think it is good to go if the benchmark shows that it is fast enough. |
I've benchmarked similar things before, but yeah, I should do that, because now this is serious 😄. #8432 has some semi-relevant benchmarks, but it's with the old SubArrays. Until we include efficient linear indexing for those types that can support it, this will not be as efficient as linear indexing on a |
f6210cd
to
5702fc6
Compare
OK, time for an update here. With the merger of #8432, things are looking pretty good. Here's a benchmark script. If one just tallies up the differences, I find 4 cases in which this is at least 4x faster than ArrayViews, and 1 case in which ArrayViews is 4x faster than this. There are also several factor-of-two differences that mostly favor this PR. Otherwise the two are pretty similar.
By comparison with ArrayViews, the only serious weakness is linear indexing with the |
OK, I incorporated efficient linear indexing. Seems to work, and it fixes the last cases in which performance is notably poor compared to
Note in particular the excellent linear indexing performance for the Performance aside, I suspect I should write a few more tests to make sure this is really working properly, but it does pass the test suite on my machine (and hopefully on Travis, too). Because I don't want to become the sole maintainer of Julia's Array frameworks for perpetuity 😄, I'm also writing up a devdoc on how all this works (it's pretty darn straightforward, albeit somewhat tedious). I'll try to push those over the next couple of days, but meanwhile feel free to kick the tires and comment. Can anyone else feel the nirvana soaking in? |
This incorporates important innovations of ArrayViews
0d8210e
to
b5705d3
Compare
New array views based on stagedfunctions
🍰 |
We have new SubArrays. While the timings above emphasized the comparison to ArrayViews, it's worth mentioning that compared to the SubArrays we have in julia 0.3, these are ~2-10x faster to use for indexing, and 20-50x faster to construct. Those factors should be large enough to make a difference in many applications. @andreasnoack and @jakebolewski, as discussed I turn phase II (making Meanwhile, I'm off to begin the process of integrating the new iteration framework more deeply into base, as it should enable improvements in several functions. I'll probably be experimenting with moving some things forward in the bootstrap sequence (mostly, cartesian.jl and the multidimensional iterators). Hopefully that won't conflict with your own work. |
Sorry I haven't been paying attention here, but this looks awesome. As part of this, I'd like to request that |
👏 |
That sounds reasonable. |
There seem to be lots of crashes during jl_instantiate_staged after this, e.g. https://travis-ci.org/JuliaLang/julia/jobs/41586398 . Don't know if this caused it, but it's certainly triggering it. |
@timholy I wholeheartedly think that this is a major step forward for Julia. Thanks for all the efforts! |
Thanks, everyone!
That's weird; I pushed and forced-pushed this branch something like 20 times, and (once I got the bugs worked out) got a green check basically every time. So I felt pretty confident about it, despite the fact that bugs in stagedfunctions (#8504, #8505, #8944, plus my grant 😄) were the main thing that held this PR up. But yes, the SubArray work exercises stagedfunctions with a vengence, so it's not unlikely that it's uncovering problems. I also have another PR coming that will amplify their usage dramatically, and much more widely in typical usage (not that much code yet uses SubArrays, since they have been so problematic). Meanwhile, if anyone can come up with a reproducible way to make the crash happen, it would be great. |
…, due to JuliaLang/julia#8501. Should now be working on 0.4.
Here's my updated proposal for an Array view type. With many different implementations and names floating around (ArrayViews, View, ArrayViewsAPL), to distinguish this one I decided on a bold and innovative name:
SubArray
.However, these aren't the SubArrays your grandma played with out behind the old tire swing:
type
s.Vector{Int}
along any axis)A = reshape(1:120,5,4,6); B = sub(A, 1:5, 3:18)
).I realized only recently that these last two properties are required if we're going to consistently return views from indexing. Getting this working properly was the trickiest part.
This container also shares a number of features with our current SubArrays:
slice
s (dropping scalar-indexed dimensions) as well assub/getindex
style indexing.This doesn't make the transition to returning views from indexing. I think we should merge the container and #8432 first, give them a little shaking out, and see where we stand. I already know of a couple likely performance bottlenecks, which I've annotated with comments.
This gets much of the way through all of our tests, but seems to uncover a bug with stagedfunctions in the
broadcast
tests. So this shouldn't be merged until that gets fixed. (I'll document that later after I've had a little time to investigate.)@simonster,
merge_indexes_div
would be fertile territory for your contemplateddiv
speedups.Also, over at https://github.com/timholy/ArrayViewsAPL.jl (where all the development actually happened), I have a much-expanded set of tests. If desired, I can merge these into base.