Optimization_4x4_13
Jianyu Huang edited this page Aug 11, 2016 · 4 revisions
Copy the contents of file MMult_4x4_12.c into a file named MMult_4x4_13.c and change the contents as described below.
Change the first lines in the makefile to
OLD := MMult_4x4_12
NEW := MMult_4x4_13
Then execute
make run
octave:3> PlotAll % this will create the plot
This time the performance graph will look something like

This version saves the packed blocks of A, so that after the first iteration of the outer loop of InnerKernel (the loop over j) the saved copies are reused instead of being packed again. The performance gain is noticeable! The only change from the last version is the addition of if ( j == 0 ):
if ( j == 0 ) PackMatrixA( k, &A( i, 0 ), lda, &packedA[ i*k ] );
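To see where that guard sits, here is a minimal, self-contained sketch of the pack-once pattern. It is an illustration under stated assumptions, not the contents of MMult_4x4_13.c: AddDot4x4Packed is a simplified plain-C stand-in for the real AddDot4x4 kernel, pack_calls is a hypothetical counter added only to make the saving visible, and m and n are assumed to be multiples of 4.

```c
#include <assert.h>

/* Column-major indexing macros, in the style of the MMult series. */
#define A(i, j) a[(j) * lda + (i)]
#define B(i, j) b[(j) * ldb + (i)]
#define C(i, j) c[(j) * ldc + (i)]

/* Hypothetical counter (not in the original code): number of packing calls. */
static int pack_calls = 0;

/* Copy a 4 x k sliver of A into contiguous memory, one column at a time. */
static void PackMatrixA(int k, const double *a, int lda, double *a_to)
{
    pack_calls++;
    for (int j = 0; j < k; j++) {
        const double *a_ij = &A(0, j);
        *a_to++ = a_ij[0];
        *a_to++ = a_ij[1];
        *a_to++ = a_ij[2];
        *a_to++ = a_ij[3];
    }
}

/* Simplified stand-in for AddDot4x4 (assumption: plain C, no vector
   intrinsics). Updates a 4 x 4 block of C from a packed 4 x k sliver
   of A and a k x 4 panel of B. */
static void AddDot4x4Packed(int k, const double *pa,
                            const double *b, int ldb,
                            double *c, int ldc)
{
    for (int p = 0; p < k; p++)
        for (int j = 0; j < 4; j++)
            for (int i = 0; i < 4; i++)
                C(i, j) += pa[p * 4 + i] * B(p, j);
}

void InnerKernel(int m, int n, int k, const double *a, int lda,
                 const double *b, int ldb, double *c, int ldc)
{
    assert(m > 0 && k > 0 && m % 4 == 0 && n % 4 == 0);
    double packedA[m * k];   /* one saved 4 x k sliver per block row of A */

    for (int j = 0; j < n; j += 4) {        /* columns of C, 4 at a time */
        for (int i = 0; i < m; i += 4) {    /* rows of C, 4 at a time */
            /* Pack each sliver only on the first pass over j;
               later passes reuse the saved copy at packedA[i*k]. */
            if (j == 0) PackMatrixA(k, &A(i, 0), lda, &packedA[i * k]);
            AddDot4x4Packed(k, &packedA[i * k], &B(0, j), ldb, &C(i, j), ldc);
        }
    }
}
```

With the guard in place, PackMatrixA runs m/4 times per call to InnerKernel; without it, the same slivers would be repacked on every pass over j, for (m/4)*(n/4) calls in total.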