Calling similar(...) on AbstractCuSparseArray with dims #1184
base: master
Conversation
Calling similar() on a variable var of type <:AbstractCuSparseArray along with a dims argument will return a CuArray with element type eltype(var) and dimensions given by dims.
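A minimal sketch of the behavior described above (illustrative only; requires a CUDA-capable GPU, and uses CUDA.jl's CUSPARSE submodule as in the rest of this PR):

```julia
using CUDA, CUDA.CUSPARSE, SparseArrays

A = CuSparseMatrixCSC(sprand(Float64, 10, 10, 0.2))
B = similar(A, (5, 4))        # per the description: a 5x4 CuArray{Float64}
C = similar(A, Int32, (3,))   # per the description: a length-3 CuArray{Int32}
```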
calling similar on AbstractCuSparseArray with dims
Thanks for the contribution! I'm not sure I fully understand the fix though, shouldn't
Furthermore, a wrapped array isn't a subtype of the original array:
I've also marked your PR as draft so that it doesn't kick off unnecessary CI. It would need some tests for that to be useful anyway.
When
You're right. I thought I was mimicking Julia's regular sparse behavior, but I just realized how and why
Added more robust handling of applying similar on a CuSparseMatrixCSC/CSR, following the logic used in SparseArrays.jl
robust handling of similar CSC/CSR
The two CSR methods were originally added as extensions to SparseArrays.jl, but SparseArrays.jl does not support CSR.
fixed getrowptr, colvals
forgot Base. prefix to several instances of similar
used fill instead of CUDA.fill in matrix initialization
Codecov Report. Attention: Patch coverage is
Additional details and impacted files:

@@ Coverage Diff @@
##           master    #1184      +/-   ##
==========================================
+ Coverage   80.48%   80.50%   +0.02%
==========================================
  Files         119      119
  Lines        8393     8454      +61
==========================================
+ Hits         6755     6806      +51
- Misses       1638     1648      +10

☔ View full report in Codecov by Sentry.
unclear what is supported in CUDA.jl for BSR, COO, so they remain untested
CuSparse Base.similar tests completed for CSC and CSR. No tests for BSR or COO, as it is unclear to what extent these are supported in CUDA.jl.
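A sketch of the kind of test being described (hypothetical, not the PR's actual test file; assumes a GPU and that similar on a CSR matrix preserves structure, as in the diff below):

```julia
using CUDA, CUDA.CUSPARSE, SparseArrays, Test

A = CuSparseMatrixCSR(sprand(Float64, 6, 6, 0.3))
B = similar(A)
# The structure-preserving path copies rowPtr/colVal and allocates new nzVal,
# so type, size, and nnz should all match the input.
@test B isa CuSparseMatrixCSR
@test size(B) == size(A)
@test nnz(B) == nnz(A)
```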
splatted 1D tuple dims when creating vector
@maleadt So it looks like the tests failed, but not because of me. Some error in the comparison
CI on master is green, the failure is in
Added a method for colon indexing a CuSparseMatrix, and a conversion from CuSparseMatrix to a vector. Fixed the show method of vectors to match SparseArrays. CuSparseVector.dims is a 2-tuple instead of an integer, probably as a holdover from the Base.show method that was written to fit both vectors and matrices, and needed to be able to run size(A)[2]. That could probably be changed now, but I don't want to break anyone's code.
@maleadt It looks like there were a bunch of tests that did indexing of a CuSparseMatrix and compared them to a CPU sparse matrix, a la
Copying an entire sparse array to the CPU when you only want to access a single element seems excessive... Scalar iteration is slow because it performs a
It's not for a single element. Specifically, what exists now (and returns the right type):
What I'm talking about is indexing where one or both of the indices is a vector or range. In master, if one is a vector/range, you will get a CPU array. If both are, you will get an error (which is why the test failed). The simplest fix is by converting to CPU and back. Admittedly, this is probably not the most efficient way to go about adding those features, but I'm not sure there is a good method to do this kind of slicing without doing scalar iteration anyway, which is why I am proposing this method.
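The CPU round-trip being proposed here could be sketched roughly like this (an illustrative helper, not the PR's code; it assumes the existing SparseMatrixCSC/CuSparseMatrixCSC conversion constructors in CUDA.jl handle the device/host transfers):

```julia
using CUDA, CUDA.CUSPARSE, SparseArrays

# Index on the CPU, then move the result back to the GPU.
# Correct but pays two full host/device transfers per call.
function getindex_via_cpu(A::CuSparseMatrixCSC, I::AbstractVector, J::AbstractVector)
    CuSparseMatrixCSC(SparseMatrixCSC(A)[I, J])
end
```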
Cu sparse similar matrix indexing
updating to match current master branch
Similar, indexing CuSparseArrays, Vector reshaping
All tests passed except those that fail
@@ -200,6 +200,59 @@ Base.similar(Mat::CuSparseMatrixCSC) = CuSparseMatrixCSC(copy(Mat.colPtr), copy(
Base.similar(Mat::CuSparseMatrixCSR) = CuSparseMatrixCSR(copy(Mat.rowPtr), copy(Mat.colVal), similar(nonzeros(Mat)), Mat.dims)
Base.similar(Mat::CuSparseMatrixBSR) = CuSparseMatrixBSR(copy(Mat.rowPtr), copy(Mat.colVal), similar(nonzeros(Mat)), Mat.blockDim, Mat.dir, nnz(Mat), Mat.dims)

## similar for CSC,CSR with dims, eltype arguments, adapted from SparseArrays.jl. NOTE: calling similar() on a wrapped CuSparseMatrixBSR/COO will result in a CuArray.
Maybe wrap the NOTE on a new line.
But also: why does similar still result in a dense vector? #1184 (comment)
It shouldn't - I spent a lot of time on that. I'll check.
Thanks! I don't have the time for an in-depth review right now, so some quick comments.
# Handles cases where CSC/CSR are reshaped to more than two dimensions, and all cases above for COO/BSR.
Base.similar(a::AbstractCuSparseArray, ::Type{T}, dims::Base.Dims{N}) where {T,N} =
CuArray{T,N}(undef, dims)
indent?
Comment from above also applies.
r1 = convert(Int, x.colPtr[j])
r2 = convert(Int, x.colPtr[j+1]) - 1
CuSparseVector(rowvals(x)[r1:r2], nonzeros(x)[r1:r2], size(x, 1))
Base.getindex(A::CuSparseMatrixCSC,I::AbstractVector,J::AbstractVector) = CuSparseMatrixCSC(getindex(CuSparseMatrixCSR(getindex(A,:,J)),I,:))
Maybe wrap these two methods for legibility.
How do you mean? Write a method for Base.getindex(A::CuSparseMatrixCSCR, I::AbstractVector, J::AbstractVector) that calls either of these two methods, depending on type?
No, just line wrapping :-) i.e. adding a newline after the =
newcolptr = cumsum(diffVec)

indices = map((x,y)->range(x,stop=y),colptr[I_sorted],colptr[I_sorted.+1].-1)
@CUDA.allowscalar indices = vcat(indices...) #TODO: figure out how to avoid this step
The CUDA module prefix shouldn't be needed.
But the scalar indexing is not good. Where does it come from?
What I'm trying to do here is this: get an array indices of vectors, where indices[i] contains a vector of the indices of all elements in nzvals that are in the I_sorted[i] row (or column, depending on CSC or CSR) of the input matrix. After that, indexing becomes simple.
However, the map here returns a CuArray of ranges, and any indexing I do with that will cause a bug, so I need to turn the ranges into vectors first. If I had a method to get a CPU array of CuVectors from the map, instead of a CuArray of ranges, that would be unnecessary. Applying collect.(indices) gives me an "inline variables only" type of error because it tries to instantiate a CuArray of arrays of (sometimes) different sizes. Applying vcat(indices...) gives a scalar indexing error.
The other, more elegant method of solving this is actually doing what I did in conversions.jl that you commented on: transforming colptr to a CPU array, in which case I can just get an array of arrays using a comprehension or a map.
the map here returns a CuArray of ranges, and any indexing I do with that will cause a bug, so I need to turn the ranges into vectors first.
Could you elaborate? Also, conversions to CPU data are extremely expensive and to be avoided if possible, so I'd rather look into fixing the bug you're encountering.
@@ -446,6 +535,7 @@ Base.copy(Mat::CuSparseMatrixCOO) = copyto!(similar(Mat), Mat)

# input/output
Some needless whitespace changes?
@@ -483,7 +573,6 @@ for (gpu, cpu) in [CuSparseMatrixCSC => SparseMatrixCSC,
end
end
Some needless whitespace changes?
@@ -0,0 +1,143 @@
using CUDA.CUSPARSE
using SparseArrays
using Test
You shouldn't need to import Test -- run select tests by doing julia -L test/setup.jl test/cusparse/array.jl
I = 1:m
nzval = nonzeros(Mat)
nzind = rowvals(Mat)
colptr = Array(getcolptr(Mat)) #TODO: figure out how to avoid this step
This seems bad; is there no API call for such conversions? Or maybe try doing the conversion on the GPU, to avoid synchronizing during the copy to/from the CPU?
See my comment re: scalar indexing above: there I try doing it on the GPU and end up doing scalar indexing. Do you mean API calls to the NVIDIA CUDA library? I had a look now and couldn't find one, but I will give it a bit more effort later.
Calling similar(var,dims) or similar(var,type,dims) on a variable var with typeof(var) <: AbstractCuSparseArray will return a CuArray with eltype eltype(var) and dimensions given by dims. Previously, calling similar(reshape(var,dims)) would result in Base's abstractarray.jl and reshapedarray.jl expanding the call to similar(reshape(var,dims),eltype,dims), which was not handled in this code and resulted in a call to Array(...) rather than CuArray(...), forcing use of the CPU and scalar indexing.