performance issue with dnrm2 on zen2 processor (AMD ryzen7 3700x) #4188

Open
@elc42

Hello,
We see unexpectedly low performance from dnrm2 compared to ddot: in the runs below, dnrm2 reports perf values of about 4,340, against roughly 16,700 to 25,000 for ddot at the same sizes.
This is observed with v0.3.23 and v0.3.18; other versions were not tested.
Two source files that reproduce the problem are attached below (sorry for the .txt extension).
Rename cxxopts.txt and TestOpenBLASNrm2Anomaly.txt to cxxopts.hpp and TestOpenBLASNrm2Anomaly.cpp.
Before building, you should change the value in:
const double HighResTimer::m_sys_freq_mhz = 3600;
with the correct frequency for your system.
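
For readers who would rather not hard-code a clock frequency, here is a minimal sketch of the same kind of measurement using std::chrono::steady_clock. It is not the attached reproducer: naive_dot and measure_mflops are hypothetical names, the plain loop stands in for the BLAS call, and the MFLOP/s unit (2*n flops per call) is an assumption, since the unit of the perf column is not stated here.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Plain dot product; a stand-in for the BLAS call being timed (hypothetical helper).
double naive_dot(const std::vector<double>& x, const std::vector<double>& y) {
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) sum += x[i] * y[i];
    return sum;
}

// Times ntest calls on vectors of length n with a wall-clock timer.
// Returns MFLOP/s under the assumption of 2*n flops per call.
double measure_mflops(std::size_t n, std::size_t ntest) {
    std::vector<double> x(n, 1.0), y(n, 2.0);
    volatile double sink = 0.0;  // keeps the result observable to the compiler
    auto start = std::chrono::steady_clock::now();
    for (std::size_t t = 0; t < ntest; ++t) sink = naive_dot(x, y);
    auto stop = std::chrono::steady_clock::now();
    (void)sink;
    double seconds = std::chrono::duration<double>(stop - start).count();
    return 2.0 * static_cast<double>(n) * static_cast<double>(ntest) / (seconds * 1e6);
}
```

Swapping naive_dot for cblas_ddot or cblas_dnrm2 (and adjusting the flop count) would give numbers that do not depend on the value of m_sys_freq_mhz.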

cxxopts.txt
TestOpenBLASNrm2Anomaly.txt

Example:
./a.out --size-range 1000,10000,1000 --ntest 10000

op = ddot

number of test = 10000

n;perf

1000;25045.4
2000;23645.3
3000;16745.6
4000;16813.4
5000;16731.8
6000;17073.6
7000;16958.8
8000;17054.1
9000;16946.9
10000;17089.1

./a.out --nrm2 --size-range 1000,10000,1000 --ntest 10000

op = nrm2

number of test = 10000

n;perf

1000;4342.59
2000;4336.29
3000;4338.41
4000;4342.47
5000;4343.73
6000;4336.11
7000;4340.86
8000;4341.84
9000;4339.63
10000;4337.51
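
As a point of comparison only, the sketch below contrasts two ways of forming a Euclidean norm: the square root of a dot product, and a scaled sum of squares in the style of older reference BLAS dnrm2. The function names are hypothetical, and nothing in this report establishes that the extra scaling work is what slows down the OpenBLAS kernel on zen2; the sketch only shows that a norm routine may do more per element than a dot product in exchange for overflow safety.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Norm as sqrt(x . x): cheap, but the squares can overflow or underflow.
double norm_via_dot(const std::vector<double>& x) {
    double sum = 0.0;
    for (double v : x) sum += v * v;
    return std::sqrt(sum);
}

// Scaled sum of squares: tracks the largest magnitude seen so far (scale)
// and accumulates squares of ratios (ssq), so intermediates stay in range.
double norm_scaled(const std::vector<double>& x) {
    double scale = 0.0, ssq = 1.0;
    for (double v : x) {
        if (v != 0.0) {
            double a = std::fabs(v);
            if (scale < a) {
                ssq = 1.0 + ssq * (scale / a) * (scale / a);
                scale = a;
            } else {
                ssq += (a / scale) * (a / scale);
            }
        }
    }
    return scale * std::sqrt(ssq);
}
```

For well-scaled data both functions agree, so timing sqrt(ddot(x, x)) next to dnrm2 on the same input is one way to check how much of the gap comes from the norm kernel itself.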
