Fastor V0.6.1
Fastor V0.6.1 is an incremental change over the V0.6 release, which introduced a significant overhaul of Fastor's internal design and exposed API. This release includes:

- `lu` function introduced for LU decomposition of 2D tensors. Multiple variants of LU decomposition are available, including no pivoting, partial pivoting with a permutation vector, and partial pivoting with a permutation matrix. This is perhaps the most performant implementation of the LU decomposition available today for small matrices of up to `64x64`. If no pivoting is used, the performance is unbeaten for all sizes up to the stack limit. However, the implementation is based on compile-time loop recursion for sizes up to `32x32`, and beyond that it uses block recursion, which in turn uses block-triangular inversion, so compilation would be quite time consuming for bigger sizes.
- `ut_inverse` and `lut_inverse` for fast triangular inversion of upper and unit lower matrices using block-wise inversion.
- `tmatmul` function, equivalent to BLAS's `TRMM` function, for triangular matrix-matrix (or vector) multiplication, which allows either or both operands to be upper/lower triangular. Which matrix is lower/upper can be specified at compile time, like `tmatmul<matrix_type::lower_tri,matrix_type::general>(A,B)`. A proper 2X speed-up over `matmul` when one operand is triangular, and 4X when both are triangular, can be achieved for bigger sizes.
- `det/determinant` can now be computed for all sizes using the LU decomposition [default for matrix sizes bigger than `4x4`].
- `inv/inverse` and `solve` can be performed with any variant of the LU decomposition.
- There is now a unified interface for choosing the computation type of linear algebra functions, for instance `det<DetCompType::BlockLU>(A)` or `inv<InvCompType::SimpleLUPiv>(A)` or `solve<SolveCompType::BlockLUPiv>` etc.
- `tril/triu` functions added for getting the lower/upper part of a 2D tensor.
- Comprehensive unit tests and benchmarks are added and are available for these newly added (and some old) routines.
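
For readers unfamiliar with the "partial pivoting with a permutation vector" variant mentioned above, and with how a determinant falls out of an LU decomposition, the idea can be sketched in plain C++ using only the standard library. This is an illustrative sketch, not Fastor's implementation: the names `Matrix`, `LUResult`, `lu_partial_pivot` and `det_from_lu` are hypothetical, and the code uses simple runtime loops rather than the compile-time and block recursion Fastor relies on.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>

// Hypothetical fixed-size matrix type for this sketch (not a Fastor type).
template <std::size_t N>
using Matrix = std::array<std::array<double, N>, N>;

// Hypothetical result type: L and U packed in one matrix plus pivot info.
template <std::size_t N>
struct LUResult {
    Matrix<N> lu;                     // U on/above the diagonal, L below (unit diagonal implied)
    std::array<std::size_t, N> perm;  // perm[i] = index of the original row now at position i
    int sign;                         // +1 or -1: parity of the row swaps performed
    bool singular;                    // true if an exactly zero pivot column was found
};

// In-place Gaussian elimination with partial (row) pivoting.
template <std::size_t N>
LUResult<N> lu_partial_pivot(Matrix<N> a) {
    LUResult<N> r{a, {}, 1, false};
    for (std::size_t i = 0; i < N; ++i) r.perm[i] = i;

    for (std::size_t k = 0; k < N; ++k) {
        // Choose the row with the largest absolute entry in column k.
        std::size_t p = k;
        for (std::size_t i = k + 1; i < N; ++i)
            if (std::fabs(r.lu[i][k]) > std::fabs(r.lu[p][k])) p = i;

        if (r.lu[p][k] == 0.0) { r.singular = true; continue; }

        if (p != k) {
            std::swap(r.lu[p], r.lu[k]);      // swap whole rows
            std::swap(r.perm[p], r.perm[k]);  // record the swap in the permutation vector
            r.sign = -r.sign;
        }

        // Eliminate below the pivot, storing the multipliers as entries of L.
        for (std::size_t i = k + 1; i < N; ++i) {
            r.lu[i][k] /= r.lu[k][k];
            for (std::size_t j = k + 1; j < N; ++j)
                r.lu[i][j] -= r.lu[i][k] * r.lu[k][j];
        }
    }
    return r;
}

// det(A) = sign * product of the diagonal of U.
template <std::size_t N>
double det_from_lu(const LUResult<N>& r) {
    assert(r.sign == 1 || r.sign == -1);
    if (r.singular) return 0.0;
    double d = static_cast<double>(r.sign);
    for (std::size_t i = 0; i < N; ++i) d *= r.lu[i][i];
    return d;
}
```

With this sketch, decomposing the 2x2 matrix with rows (4, 3) and (6, 3) swaps the two rows, so the permutation vector becomes {1, 0}, and `det_from_lu` returns -6 up to rounding. Storing the pivoting as a vector rather than a full permutation matrix avoids an extra matrix product when the permutation is applied later, which is presumably why both variants are offered.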