When computing the (multi-input / multi-output) kernel matrix for a convolution operator, it is advantageous to permute the rows of the matrix to reduce the number of non-zero diagonals. This in turn reduces the number of diagonals that have to be extracted to compute the MatVec.
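To illustrate the effect, here is a toy sketch, not the actual kernel-matrix code. It assumes the wrap-around ("generalized") diagonal convention, where entry `(row, col)` lies on diagonal `(col - row) mod n_cols`. The 4x4 matrix, the permutation, and the helper names `nonzero_diagonals` and `permute_rows` are all made up for the example.

```python
def nonzero_diagonals(matrix):
    """Return the set of wrap-around diagonal indices holding a non-zero entry.

    Assumed convention: entry (row, col) lies on diagonal (col - row) mod n_cols.
    """
    n_cols = len(matrix[0])
    return {
        (col - row) % n_cols
        for row, values in enumerate(matrix)
        for col, value in enumerate(values)
        if value != 0
    }


def permute_rows(matrix, perm):
    """Build a new matrix whose i-th row is matrix[perm[i]]."""
    return [matrix[i] for i in perm]


# Hypothetical toy matrix: every row holds the same two-tap pattern,
# but the rows are ordered so that the taps land on all four diagonals.
scrambled = [
    [1, 2, 0, 0],
    [0, 0, 1, 2],
    [0, 1, 2, 0],
    [2, 0, 0, 1],
]

# Swapping rows 1 and 2 lines the taps up on two diagonals only.
perm = [0, 2, 1, 3]
banded = permute_rows(scrambled, perm)

print(sorted(nonzero_diagonals(scrambled)))  # [0, 1, 2, 3]
print(sorted(nonzero_diagonals(banded)))     # [0, 1]
```

In this toy case the permutation halves the number of diagonals a diagonal-based MatVec would need to extract. Note that permuting rows also permutes the entries of the output vector in the same way, so whatever consumes the result has to account for the new order.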
The `h` and `w` were interchanged. This is a problem if the filter is non-square.
I'm also noticing an error in flattening the final output in the ISL string.
Originally posted by @asraa in #2941