Skip to content

Iterator performance tips #86

@bluss

Description

@bluss

I'm briefly looking at rulinalg’s SliceIter. You can use the same kind of optimization used in ndarray's element iterator.

The brief idea of the "problem" is that this is efficent:

for i in rows
   for j in row(i)
       // access element at i, j

SliceIter::next() (just like ndarray's Iter::next) emulates this segmented traversal with a bit of a state machine. Unfortunately the compiler doesn't know how to turn this state machine back into two nested for loops(!)

Two main ideas I have used:

  1. Use the regular [T] iterator when the matrix is contiguous. A branch on this even inside the .next() method seems to work well, the conditional is lifted out of the loop.
  2. When you can't improve .next() for the non-contiguous case, you can at least special case Iterator::fold(). It is essentially the same idea as the .chain() change in Implement Iterator::fold for .chain(), .cloned(), .map() and the VecDeque iterators. rust-lang/rust#37315 ; you recurse and fold a bunch of component iterators (since each row is contiguous, you just call the slice iterator's fold on each row in turn).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions