Description
Currently SparseMatrixCSC
is a mutable type
(in contrast with SparseVector
which is an immutable
). This causes a few problems in the code base for sparse matrices.
The main problem is that LLVM is not able to hoist field accesses in loops for type
s. This is presumably because it is hard to prove that the address to the field stays the same during the loop. Therefore, in order to get good performance, almost all sparse matrix function starts with something similar to:
nzvalA = A.nzval
rowvalA = A.rowval
colptrA = A.colptr
nzvalB = B.nzval
rowvalB = B.rowval
colptrB = B.colptr
This is of course quite ugly, and different authors have different names for the hoisted fields leading to the code base being less friendly to read. It is also easy to forget the manual hoisting, which leads to less than ideal performance. A third problem is that you cannot use convenience functions like nzrange
without a performance loss since this will cause extra dereferences of fields. The effect of this is that the sparse code base is pretty much fully inlined with almost no convenience functions being used.
The places where the mutability of the fields is used is in spset!
, spdelete!
, and three setindex!
functions so not that many places. The way it is used in these places are that the "sparsity pattern" is being remade from scratch with new arrays and in the end the original fields of the matrix is simply swapped out for the new arrays. I am quite confident that rewriting these to create the new sparsity pattern in the already existing field is not too difficult.
I currently have a branch were everything builds and tests pass with SparseMatrixCSC
being an immutable but the way I did it causes an extra copy in the functions mentioned previously. If I were to rewrite it to have the same performance as now, would a PR making SparseMatrixCSC
immutable be desired? A follow up PR or a commit in the same one could look at removing all the manually hoisted fields and benchmarking to see that nothing gets slower.