reduced code: https://godbolt.org/z/3MYzE1v7M
long arr[20];

void f() {
    for (int i = 0; i < 20; i += 1) {
        arr[i] = 1;
    }
}
clang -O3:
f():
        mov     qword ptr [rip + arr], 1
        mov     qword ptr [rip + arr+8], 1
        mov     qword ptr [rip + arr+16], 1
        mov     qword ptr [rip + arr+24], 1
        mov     qword ptr [rip + arr+32], 1
        mov     qword ptr [rip + arr+40], 1
        mov     qword ptr [rip + arr+48], 1
        mov     qword ptr [rip + arr+56], 1
        mov     qword ptr [rip + arr+64], 1
        mov     qword ptr [rip + arr+72], 1
        mov     qword ptr [rip + arr+80], 1
        mov     qword ptr [rip + arr+88], 1
        mov     qword ptr [rip + arr+96], 1
        mov     qword ptr [rip + arr+104], 1
        mov     qword ptr [rip + arr+112], 1
        mov     qword ptr [rip + arr+120], 1
        mov     qword ptr [rip + arr+128], 1
        mov     qword ptr [rip + arr+136], 1
        mov     qword ptr [rip + arr+144], 1
        mov     qword ptr [rip + arr+152], 1
        ret
expected code (clang 15 or gcc):
f():
        movaps  xmm0, xmmword ptr [rip + .LCPI0_0]
        movaps  xmmword ptr [rip + arr], xmm0
        movaps  xmmword ptr [rip + arr+16], xmm0
        movaps  xmmword ptr [rip + arr+32], xmm0
        movaps  xmmword ptr [rip + arr+48], xmm0
        movaps  xmmword ptr [rip + arr+64], xmm0
        movaps  xmmword ptr [rip + arr+80], xmm0
        movaps  xmmword ptr [rip + arr+96], xmm0
        movaps  xmmword ptr [rip + arr+112], xmm0
        movaps  xmmword ptr [rip + arr+128], xmm0
        movaps  xmmword ptr [rip + arr+144], xmm0
        ret
Looks like it was fully unrolled before the loop vectorizer saw it. Need to check why SLP didn't catch it.
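For reference, here is a rough hand-written sketch of the pairing that SLP would be expected to find in the unrolled stores: two adjacent 8-byte elements covered by one 16-byte store, ten times over. It relies on the GCC/Clang `vector_size` extension. The names `v2i64`, `f_scalar`, and `f_paired` are made up for illustration, and `long long` stands in for `long` so the lanes are 64 bits on any target. It is not what the vectorizer emits, only a source-level picture of the same transformation.

```c
#include <string.h>

/* 16-byte vector holding two 64-bit lanes (GCC/Clang vector extension). */
typedef long long v2i64 __attribute__((vector_size(16)));

/* Aligned to 16 bytes so an aligned vector store would be legal. */
long long arr[20] __attribute__((aligned(16)));

/* Scalar form from the report: one 8-byte store per element. */
void f_scalar(void) {
    for (int i = 0; i < 20; i += 1) {
        arr[i] = 1;
    }
}

/* Hand-paired form: ten 16-byte stores, each covering two adjacent elements.
   memcpy is used for the store to stay clear of aliasing questions. */
void f_paired(void) {
    const v2i64 ones = {1, 1};
    for (int i = 0; i < 20; i += 2) {
        memcpy(&arr[i], &ones, sizeof ones);
    }
}
```

Both functions leave `arr` in the same state; the only difference is the store width.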
SLP does vectorize using 256-bit vectors with -mavx. So SLP is capable of vectorizing it. Maybe a cost model issue?
CC: @RKSimon @alexey-bataev
The cost model decides it is not profitable; it does get vectorized with -slp-threshold=-1.
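To make the threshold effect concrete, here is a toy model of the profitability check, assuming the pass accepts a candidate only when its estimated cost is strictly below the negated threshold (my understanding of the SLP vectorizer, not a copy of its code). The function name `slp_is_profitable` and the integer costs are hypothetical. Under that assumption, a tree that is rejected at the default threshold of 0 but accepted at -1 would have an estimated cost of exactly 0, that is, break-even.

```c
#include <stdbool.h>

/* Simplified, assumed model of the check: accept only when the estimated
   cost is strictly less than the negated threshold. */
bool slp_is_profitable(int cost, int threshold) {
    return cost < -threshold;
}
```

With this model, `slp_is_profitable(0, 0)` is false and `slp_is_profitable(0, -1)` is true, which matches the behaviour reported here.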