dgemm performance degradation on ARM NEOVERSEV1 with lower P*Q  #4323

@darshanp4

I see about 2-10% performance degradation in the OpenBLAS/gemm.c benchmark on a single core (also seen on multicore) of a Graviton3 machine.
This appears to be because the P*Q used for NEOVERSEV1 is low relative to its L2 cache size (which is 1MB).
From the OpenBLAS FAQ: "A general rule of thumb for selecting a starting point seems to be that PxQ is about half the size of L2 cache."
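
To make the rule of thumb concrete, here is a rough sizing sketch. It assumes the P*Q block is counted in double-precision elements of 8 bytes each (dgemm) and that "half the size of L2" refers to the block's footprint in bytes. The function names and the P, Q values are hypothetical illustrations, not the actual NEOVERSEV1 settings in OpenBLAS.

```python
# Rough sizing sketch for the "P*Q is about half of L2" rule of thumb.
# Assumptions (not from OpenBLAS itself): the block is P*Q doubles of
# 8 bytes each, and the helper names below are made up for illustration.

L2_BYTES = 1024 * 1024  # 1 MB L2, as stated in this issue


def pq_footprint_bytes(p, q, elem_size=8):
    """Bytes occupied by a P x Q block of elements."""
    return p * q * elem_size


def l2_fraction(p, q, l2_bytes=L2_BYTES, elem_size=8):
    """Fraction of the L2 cache that a P x Q block would occupy."""
    return pq_footprint_bytes(p, q, elem_size) / l2_bytes


def target_pq_elements(l2_bytes=L2_BYTES, elem_size=8, fraction=0.5):
    """Number of elements (P*Q) that fills `fraction` of the L2 cache."""
    return int(l2_bytes * fraction) // elem_size


if __name__ == "__main__":
    # Target P*Q for half of a 1 MB L2 with 8-byte elements.
    print(target_pq_elements())
    # Hypothetical blockings, shown only to compare footprints.
    for p, q in [(128, 128), (256, 256)]:
        print(p, q, l2_fraction(p, q))
```

Under these assumptions, any P and Q whose product is around 65536 would hit the half-of-L2 starting point; the best split between P and Q would still need to be found by benchmarking.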

So maybe we should update the NEOVERSEV1 P*Q parameters to match the L2 cache size.

Or can anyone help me understand why the NEOVERSEV1 P*Q is kept that low?
