Closed
PR #85706 causes massive performance issues for non-inlined leaf functions: the frame pointer and the link register are now always saved and restored at the beginning and end of a function, even if the function, for example, just returns a constant. Contrary to the PR description, this is not what clang does on macOS aarch64. The clang default on macOS is "frame-pointer"="non-leaf" instead.
E.g. clang compiles
unsigned long test(unsigned long num) {
    return num % 64;
}
to this LLVM IR (with clang -c test.c -S -emit-llvm -O3)
[...]
; Function Attrs: norecurse nounwind readnone ssp uwtable
define i64 @test(i64 %0) local_unnamed_addr #0 {
  %2 = and i64 %0, 63
  ret i64 %2
}
attributes #0 = { norecurse nounwind readnone ssp uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="non-leaf" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "probe-stack"="__chkstk_darwin" "stack-protector-buffer-size"="8" "target-cpu"="apple-a12" "target-features"="+aes,+crc,+crypto,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rcpc,+rdm,+sha2,+v8.3a,+zcm,+zcz" "unsafe-fp-math"="false" "use-soft-float"="false" }
[...]
which leads to this assembly:
_test: ; @test
.cfi_startproc
; %bb.0:
and x0, x0, #0x3f
ret
.cfi_endproc
Since the PR landed, Rust nightly sets "frame-pointer"="all", which causes this assembly to be generated:
_test: ; @test
.cfi_startproc
; %bb.0:
stp x29, x30, [sp, #-16]! ; 16-byte Folded Spill
mov x29, sp
.cfi_def_cfa w29, 16
.cfi_offset w30, -8
.cfi_offset w29, -16
and x0, x0, #0x3f
ldp x29, x30, [sp], #16 ; 16-byte Folded Reload
ret
.cfi_endproc
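The issue does not show the Rust source behind the second listing. A minimal sketch of an equivalent function might look like the following. The name `mod64` is hypothetical, and `#[inline(never)]` is an assumption added here so that the function stays a separate, non-inlined leaf function as described above.

```rust
// Hypothetical Rust counterpart of the C `test` function above.
// `u64` matches `unsigned long` on aarch64 macOS.
// `#[inline(never)]` keeps this a standalone leaf function so that its
// prologue and epilogue (if any) are visible in the generated assembly.
#[inline(never)]
pub fn mod64(num: u64) -> u64 {
    // With optimizations enabled this is expected to lower to a single
    // `and` with 63, as in the clang output.
    num % 64
}
```

Assuming the snippet is saved as `mod64.rs` (a made-up file name), something like `rustc -O --crate-type=lib --emit asm mod64.rs` should write the assembly to a `.s` file, where one can check whether the `stp`/`ldp` pair around the `and` instruction is present.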
@jrmuizel: FYI
Labels
Area: Code generation
Category: This is a bug.
Issue: Problems and improvements with respect to performance of generated code.
Operating system: iOS
Operating system: macOS
High priority
Relevant to the compiler team, which will review and decide on the PR/issue.
Performance or correctness regression from stable to nightly.