Description
Hello,
We see unexpectedly low performance from dnrm2 compared to ddot: in the runs below, the perf figure reported for dnrm2 is roughly 4 to 6 times lower than for ddot.
This is observed with v0.3.23 and v0.3.18; other versions weren't tested.
Below are 2 source files that reproduce the problem (sorry for the .txt extension).
Rename cxxopts.txt and TestOpenBLASNrm2Anomaly.txt to cxxopts.hpp and TestOpenBLASNrm2Anomaly.cpp.
You should replace the value in:
const double HighResTimer::m_sys_freq_mhz = 3600;
with the correct frequency for your system.
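If editing the frequency constant is inconvenient, a portable timer is one alternative. The sketch below assumes that HighResTimer uses m_sys_freq_mhz to convert cycle counts into time, which is a guess based on the name. ChronoTimer and elapsed_us are hypothetical names and are not part of the attached files.

```cpp
#include <chrono>

// Hypothetical stand-in for a cycle-counter timer (not from the attachments).
// It measures wall time with std::chrono::steady_clock, so no CPU frequency
// constant has to be configured.
class ChronoTimer {
    using clock = std::chrono::steady_clock;
    clock::time_point m_start;

public:
    ChronoTimer() : m_start(clock::now()) {}

    // Restart the measurement from the current instant.
    void reset() { m_start = clock::now(); }

    // Microseconds elapsed since construction or the last reset().
    double elapsed_us() const {
        return std::chrono::duration<double, std::micro>(clock::now() - m_start).count();
    }
};
```

Usage would be to construct the timer (or call reset()) just before the loop of BLAS calls and read elapsed_us() right after it.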
cxxopts.txt
TestOpenBLASNrm2Anomaly.txt
Example runs:
./a.out --size-range 1000,10000,1000 --ntest 10000
op = ddot
number of test = 10000
n;perf
1000;25045.4
2000;23645.3
3000;16745.6
4000;16813.4
5000;16731.8
6000;17073.6
7000;16958.8
8000;17054.1
9000;16946.9
10000;17089.1
./a.out --nrm2 --size-range 1000,10000,1000 --ntest 10000
op = nrm2
number of test = 10000
n;perf
1000;4342.59
2000;4336.29
3000;4338.41
4000;4342.47
5000;4343.73
6000;4336.11
7000;4340.86
8000;4341.84
9000;4339.63
10000;4337.51
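For context on why the gap looks surprising, here is a minimal sketch of what the two operations compute mathematically. These are plain reference loops with hypothetical names (ref_dot, ref_nrm2), not OpenBLAS's kernels. The dot product reads two vectors of n doubles, while the Euclidean norm reads only one, and both do one multiply-add per element.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Reference dot product: sum of x[i] * y[i]. Reads 2n doubles.
// Assumes x and y have the same length.
double ref_dot(const std::vector<double>& x, const std::vector<double>& y) {
    double acc = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        acc += x[i] * y[i];
    }
    return acc;
}

// Reference Euclidean norm: sqrt of the sum of x[i]^2. Reads n doubles.
// This unscaled form can overflow or underflow for extreme values, which a
// library implementation may guard against.
double ref_nrm2(const std::vector<double>& x) {
    double acc = 0.0;
    for (double v : x) {
        acc += v * v;
    }
    return std::sqrt(acc);
}
```

Timing these two loops with the same sizes as above could serve as a rough baseline for comparison with the library results.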