Closed as not planned
Description
(This should be an enhancement request).
pub fn foo(a: &mut [u8; 25]) {
    for i in 0 .. a.len() {
        if i % 5 > 0 { a[i - 1] = 0; }
        if i % 5 < 4 { a[i + 1] = 0; }
    }
}
With rustc 1.58.0-nightly (cc946fc 2021-11-18), compiling in release mode (even with aggressive optimization flags) produces assembly like this:
foo:
        push    rax
        mov     rax, rdi
        mov     edi, 1
        xor     edx, edx
        jmp     .LBB0_1
.LBB0_4:
        add     rdi, 1
        add     dl, 1
        cmp     rdi, 26
        je      .LBB0_5
.LBB0_1:
        movzx   ecx, dl
        imul    ecx, ecx, 205
        shr     ecx, 10
        lea     ecx, [rcx + 4*rcx]
        neg     ecx
        movzx   esi, cl
        lea     rcx, [rdi + rsi]
        cmp     cl, 1
        je      .LBB0_7
        lea     rcx, [rdi - 2]
        cmp     rcx, 25
        jae     .LBB0_3
        lea     rcx, [rdi + rsi]
        add     rcx, -1
        mov     byte ptr [rax + rdi - 2], 0
        cmp     cl, 4
        jae     .LBB0_4
.LBB0_7:
        lea     rcx, [rdi - 1]
        cmp     rcx, 23
        ja      .LBB0_9
        mov     byte ptr [rax + rdi], 0
        jmp     .LBB0_4
.LBB0_5:
        pop     rax
        ret
.LBB0_9:
        lea     rdx, [rip + .L__unnamed_1]
        mov     esi, 25
        call    qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
        ud2
.LBB0_3:
        lea     rdx, [rip + .L__unnamed_2]
        mov     esi, 25
        mov     rdi, rcx
        call    qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
        ud2
LLVM doesn't remove the two array bounds checks, even though both accesses are always in bounds when the array length is divisible by 5.
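The in-bounds claim can be checked by brute force. The sketch below is only an illustration: `accesses_in_bounds` is a hypothetical helper (not part of the original report) that replays the two guarded accesses of `foo` for a given length and reports whether any of them would fall outside the array.

```rust
/// Hypothetical helper: returns true if every index that `foo` would touch
/// is in bounds for an array of length `len`.
fn accesses_in_bounds(len: usize) -> bool {
    for i in 0..len {
        // First access: a[i - 1], guarded by i % 5 > 0.
        // The guard implies i >= 1, so `i - 1` cannot underflow.
        if i % 5 > 0 && i - 1 >= len {
            return false;
        }
        // Second access: a[i + 1], guarded by i % 5 < 4.
        if i % 5 < 4 && i + 1 >= len {
            return false;
        }
    }
    true
}
```

For a length of 25 the helper returns true. For a length of 26 it returns false, because the last iteration (i = 25) passes the `i % 5 < 4` guard and would access index 26.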
Currently LLVM is able to remove those checks only if it fully unrolls the loop, for example when compiling with -O -C llvm-args=-unroll-threshold=500:
foo:
        mov     word ptr [rdi], 0
        mov     byte ptr [rdi + 2], 0
        mov     dword ptr [rdi + 3], 0
        mov     byte ptr [rdi + 7], 0
        mov     word ptr [rdi + 8], 0
        mov     dword ptr [rdi + 10], 0
        mov     byte ptr [rdi + 14], 0
        mov     dword ptr [rdi + 15], 0
        mov     word ptr [rdi + 18], 0
        mov     word ptr [rdi + 21], 0
        mov     byte ptr [rdi + 20], 0
        mov     word ptr [rdi + 22], 0
        mov     byte ptr [rdi + 24], 0
        ret
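I'd like LLVM to remove both bounds checks, especially the first one, because i - 1 is in bounds whenever i % 5 > 0 && a.len() > 0: the condition i % 5 > 0 implies i >= 1, so i - 1 cannot underflow, and i < a.len() gives i - 1 < a.len().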
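Until the optimizer handles this pattern, one possible source-level restructuring is to iterate over 5-element chunks so that every index is relative to a chunk. This is a sketch under assumptions: `foo_chunked` is a hypothetical name, and whether this form actually compiles without bounds checks has not been verified here and may depend on the compiler version. It only shows that the same result can be expressed without the `i % 5` arithmetic.

```rust
/// The function from the report, repeated so the two versions can be compared.
pub fn foo(a: &mut [u8; 25]) {
    for i in 0..a.len() {
        if i % 5 > 0 { a[i - 1] = 0; }
        if i % 5 < 4 { a[i + 1] = 0; }
    }
}

/// Hypothetical rewrite: same stores, but indices are local to each
/// 5-element chunk, so no modulo is needed to relate index and bound.
pub fn foo_chunked(a: &mut [u8; 25]) {
    for chunk in a.chunks_exact_mut(5) {
        for j in 0..5 {
            if j > 0 { chunk[j - 1] = 0; }
            if j < 4 { chunk[j + 1] = 0; }
        }
    }
}
```

Both functions write zero to every element of the array, which matches the fully unrolled output shown above.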
Labels
Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
Area: `[T; N]`
Area: Code generation
Category: An issue proposing an enhancement or a PR with one.
Issue: Problems and improvements with respect to performance of generated code.
Relevant to the compiler team, which will review and decide on the PR/issue.