optimize matrix multiplication for C and F matrices #58

@dastrobu

Currently, both matrices are always transformed to the same memory layout before multiplying. C- and F-contiguous matrices could be handled as is:

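    if A.isFContiguous {
        // Column-major (F) path: B is brought into the same layout as A.
        order = CblasColMajor
        a = Matrix(A, order: .F)
        b = Matrix(B, order: .F)
        c = Matrix<Float>(empty: [Int(m), Int(n)], order: .F)
        // In column-major storage the leading dimension is the stride between columns.
        lda = Int32(a.strides[1])
        ldb = Int32(b.strides[1])
        ldc = Int32(c.strides[1])
    } else {
        // Row-major (C) path: B is brought into the same layout as A.
        order = CblasRowMajor
        a = Matrix(A, order: .C)
        b = Matrix(B, order: .C)
        c = Matrix<Float>(empty: [Int(m), Int(n)], order: .C)
        // In row-major storage the leading dimension is the stride between rows.
        lda = Int32(a.strides[0])
        ldb = Int32(b.strides[0])
        ldc = Int32(c.strides[0])
    }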
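
One possible way to handle mixed layouts without copying, sketched under assumptions: an F-contiguous m×k matrix occupies the same memory as its k×m transpose in C order, so it could be passed to a row-major `cblas_sgemm` call with the transpose flag set and the column stride as leading dimension. The names below (`Layout`, `GemmPlan`, `planRowMajorGemm`, `referenceGemm`) are hypothetical and not part of the library; the reference loop only mimics row-major GEMM indexing so the plan can be checked without BLAS.

```swift
// Hypothetical helper types; not the library's API.
enum Layout { case c, f }   // c = row-major, f = column-major

struct GemmPlan: Equatable {
    var transA: Bool   // would map to CblasTrans / CblasNoTrans
    var transB: Bool
    var lda: Int
    var ldb: Int
    var ldc: Int
}

/// Plans a row-major GEMM C(m×n) = A(m×k) · B(k×n) that reads A and B in place.
/// An F-contiguous operand is treated as the transpose of a C-contiguous one.
func planRowMajorGemm(m: Int, n: Int, k: Int, a: Layout, b: Layout) -> GemmPlan {
    GemmPlan(
        transA: a == .f,
        transB: b == .f,
        lda: a == .c ? k : m,   // C: stride between rows; F: stride between columns
        ldb: b == .c ? n : k,
        ldc: n
    )
}

/// Naive loop using the same indexing rules as a row-major GEMM with transpose flags.
/// Only meant for checking the plan, not for performance.
func referenceGemm(m: Int, n: Int, k: Int,
                   a: [Float], b: [Float], plan: GemmPlan) -> [Float] {
    var c = [Float](repeating: 0, count: m * plan.ldc)
    for i in 0..<m {
        for j in 0..<n {
            var sum: Float = 0
            for p in 0..<k {
                let aip = plan.transA ? a[p * plan.lda + i] : a[i * plan.lda + p]
                let bpj = plan.transB ? b[j * plan.ldb + p] : b[p * plan.ldb + j]
                sum += aip * bpj
            }
            c[i * plan.ldc + j] = sum
        }
    }
    return c
}

// A = [[1, 2], [3, 4]] stored F-contiguous, B = [[5, 6], [7, 8]] stored C-contiguous.
let aF: [Float] = [1, 3, 2, 4]
let bC: [Float] = [5, 6, 7, 8]
let plan = planRowMajorGemm(m: 2, n: 2, k: 2, a: .f, b: .c)
let product = referenceGemm(m: 2, n: 2, k: 2, a: aF, b: bC, plan: plan)
print(product)   // [19.0, 22.0, 43.0, 50.0]
```

With such a plan, only matrices that are neither C- nor F-contiguous would still need to be copied into a contiguous layout first.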

Labels: enhancement (New feature or request)
