Open
Description
After reading this blog post recently, I was very happily surprised with the quality of the code generation for the following function :
fn middle(xs: &[u32]) -> Option<&u32> {
match xs {
[_, inner @ .., _] => middle(inner),
[x] => Some(x),
[] => None,
}
}
example::middle:
cmp rsi, 2
jb .LBB0_2
add rsi, -2
mov rax, rsi
and rax, -2
lea rdi, [rdi + 2*rax]
add rdi, 4
and esi, 1
.LBB0_2:
xor eax, eax
cmp rsi, 1
cmove rax, rdi
ret
It is amazing what the compiler is able to achieve !
But then, I tried this very slight variation :
pub fn middle(xs: &[u32]) -> Option<u32> {
match xs {
[_, inner @ .., _] => middle(inner),
[x] => Some(*x),
[] => None,
}
}
(the only difference is that we are returning an u32
instead of an &u32
)
And the generated code changes dramatically:
example::middle:
push rax
cmp rsi, 1
jbe .LBB0_1
add rdi, 4
add rsi, -2
call qword ptr [rip + example::middle@GOTPCREL]
pop rcx
ret
.LBB0_1:
cmp rsi, 1
jne .LBB0_2
mov edx, dword ptr [rdi]
mov eax, 1
pop rcx
ret
.LBB0_2:
xor eax, eax
pop rcx
ret
Is there something making the optimization harder to apply in the second case, or is it a bug somewhere in the compiler ?
Metadata
Metadata
Assignees
Labels
Area: Code generationArea: Slice patterns, https://github.com/rust-lang/rust/issues/23121Category: An issue proposing an enhancement or a PR with one.Category: An issue highlighting optimization opportunities or PRs implementing suchIssue: Problems and improvements with respect to performance of generated code.Relevant to the compiler team, which will review and decide on the PR/issue.Working group: LLVM backend code generation