Closed
Description
Clang does not generate sdot/udot (depending on the signedness of the types) for the following test when the trip count is variable, or when the loop is not unrolled (the cutoff is at 60 trips, though it may differ per target).
#include <stdint.h>
int32_t f(int8_t * restrict x, int8_t * restrict y, int n)
{
    int32_t r = 0;
    for (int j = 0; j < n; ++j) {
        r += x[j] * y[j];
    }
    return r;
}
clang (does not generate sdot): https://godbolt.org/z/KznKr1Kh8
gcc (generates sdot): https://godbolt.org/z/h1xqM1xMc
If you replace 'n' with a constant value < 60, the sdot instructions are generated. The same problem occurs with unsigned types (udot). This appears to affect neoverse-n1, neoverse-v1 and neoverse-v2.
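For reference, the two other variants mentioned above could be written as follows. This is only a sketch: the names f_unsigned, f_fixed and TRIPS are made up for illustration, and 32 is an arbitrary trip count chosen to sit below the reported cutoff. Per the report, f_unsigned should show the same missing udot as f does for sdot, while f_fixed should get the dot-product instructions.

```c
#include <stdint.h>

/* Unsigned counterpart of f(): variable trip count, so this is the
   case where udot is reported missing. */
uint32_t f_unsigned(const uint8_t * restrict x, const uint8_t * restrict y, int n)
{
    uint32_t r = 0;
    for (int j = 0; j < n; ++j) {
        r += x[j] * y[j];
    }
    return r;
}

/* Hypothetical constant trip count below the reported cutoff of 60. */
#define TRIPS 32

/* Same loop as f(), but with a compile-time trip count. */
int32_t f_fixed(const int8_t * restrict x, const int8_t * restrict y)
{
    int32_t r = 0;
    for (int j = 0; j < TRIPS; ++j) {
        r += x[j] * y[j];
    }
    return r;
}
```

Compiling these with the same options as in the godbolt links above and comparing the output against f should make the difference between the cases visible.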