We now know that inline code is faster than BLAS, at least for the 2x2 case. Inlining probably also helps for matrix*vector and backslash.
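For concreteness, here is a rough sketch of what "inline code" could look like for the three operations mentioned, assuming it means writing out the 2x2 arithmetic by hand instead of calling a general library routine. It is written in Python purely for illustration; the function names (`matmul2x2`, `matvec2`, `solve2x2`) are hypothetical, and this is not the code that was benchmarked.

```python
# Illustrative sketch only: hand-unrolled 2x2 kernels (hypothetical names).
# Matrices are nested tuples ((a11, a12), (a21, a22)); vectors are (x1, x2).

def matmul2x2(a, b):
    """Product of two 2x2 matrices with every term written out."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    return (
        (a11 * b11 + a12 * b21, a11 * b12 + a12 * b22),
        (a21 * b11 + a22 * b21, a21 * b12 + a22 * b22),
    )


def matvec2(a, x):
    """Product of a 2x2 matrix and a length-2 vector."""
    (a11, a12), (a21, a22) = a
    x1, x2 = x
    return (a11 * x1 + a12 * x2, a21 * x1 + a22 * x2)


def solve2x2(a, b):
    """Solve a * x = b (the 'backslash' case) with the closed-form 2x2 formula.

    Assumption: Cramer's rule stands in for a general solver here; it is
    less numerically robust than a pivoted factorization.
    """
    (a11, a12), (a21, a22) = a
    b1, b2 = b
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ZeroDivisionError("matrix is singular")
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)
```

The point of the sketch is only that each operation reduces to a handful of multiplies and adds with no loops or library dispatch; whether that wins for matrix*vector and backslash would still need to be measured.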