Skip to content

SparseMatrixCSC should probably be an immutable #15668

Closed
@KristofferC

Description

@KristofferC

Currently SparseMatrixCSC is a mutable type (in contrast with SparseVector which is an immutable). This causes a few problems in the code base for sparse matrices.

The main problem is that LLVM is not able to hoist field accesses in loops for types. This is presumably because it is hard to prove that the address to the field stays the same during the loop. Therefore, in order to get good performance, almost all sparse matrix function starts with something similar to:

nzvalA = A.nzval
rowvalA = A.rowval
colptrA = A.colptr
nzvalB = B.nzval
rowvalB = B.rowval
colptrB = B.colptr

This is of course quite ugly, and different authors have different names for the hoisted fields leading to the code base being less friendly to read. It is also easy to forget the manual hoisting, which leads to less than ideal performance. A third problem is that you cannot use convenience functions like nzrange without a performance loss since this will cause extra dereferences of fields. The effect of this is that the sparse code base is pretty much fully inlined with almost no convenience functions being used.

The places where the mutability of the fields is used is in spset!, spdelete!, and three setindex! functions so not that many places. The way it is used in these places are that the "sparsity pattern" is being remade from scratch with new arrays and in the end the original fields of the matrix is simply swapped out for the new arrays. I am quite confident that rewriting these to create the new sparsity pattern in the already existing field is not too difficult.

I currently have a branch were everything builds and tests pass with SparseMatrixCSC being an immutable but the way I did it causes an extra copy in the functions mentioned previously. If I were to rewrite it to have the same performance as now, would a PR making SparseMatrixCSC immutable be desired? A follow up PR or a commit in the same one could look at removing all the manually hoisted fields and benchmarking to see that nothing gets slower.

Metadata

Metadata

Assignees

No one assigned

    Labels

    sparseSparse arrays

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions