I see about a 2-10% performance degradation in the OpenBLAS/gemm.c benchmark on a single core (also seen on multicore) of a Graviton3 machine.
This appears to be because the P*Q used for NEOVERSEV1 is small relative to its L2 cache size (which is 1 MB).
From the OpenBLAS FAQ: "A general rule of thumb for selecting a starting point seems to be that PxQ is about half the size of L2 cache."
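To make the rule of thumb concrete, here is a rough sketch of the arithmetic. It assumes the P x Q block is stored as contiguous elements of a fixed size; the helper names `block_bytes` and `suggest_p` and the values in the usage comment are hypothetical illustrations, not code or parameters taken from OpenBLAS.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes occupied by a P x Q block of elements of elem_size bytes each. */
static size_t block_bytes(size_t p, size_t q, size_t elem_size) {
    return p * q * elem_size;
}

/*
 * Largest P, rounded down to a multiple of unroll_m, such that
 * P * Q * elem_size does not exceed half of the L2 cache.
 * This only encodes the FAQ rule of thumb as a starting point;
 * real tuning still needs benchmarking.
 */
static size_t suggest_p(size_t l2_bytes, size_t q, size_t elem_size,
                        size_t unroll_m) {
    assert(q > 0 && elem_size > 0 && unroll_m > 0);
    size_t p = (l2_bytes / 2) / (q * elem_size);
    return p - (p % unroll_m);
}

/*
 * Example usage (hypothetical values, NOT the actual NEOVERSEV1 settings):
 *
 *   size_t l2 = 1024 * 1024;                     // 1 MB L2
 *   size_t used = block_bytes(128, 256, 8);      // double precision
 *   size_t p = suggest_p(l2, 256, 8, 8);         // candidate P for Q = 256
 */
```

With these illustrative numbers, a 128 x 256 block of doubles occupies 256 KB, which is a quarter of a 1 MB L2, so the rule of thumb would suggest roughly doubling P for the same Q.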
So maybe we should update the NEOVERSEV1 P*Q parameters to match the L2 cache size.
Or can anyone help me understand why the NEOVERSEV1 P*Q is kept that low?