Apple platforms: Disabled frame pointer elimination causes perf issues and is not in line with what clang does #86196

Closed
@hkratz

Description

PR #85706 causes massive performance issues for non-inlined leaf functions: the frame pointer and the link register are now always saved at the beginning and restored at the end of a function, even if the function just returns a constant, for example. Contrary to the PR description, this is not what clang does on macOS aarch64. The clang default on macOS is "frame-pointer"="non-leaf" instead.

For example, clang compiles

unsigned long test(unsigned long num) {
    return num % 64;
}

to this LLVM IR (with clang -c test.c -S -emit-llvm -O3)

[...]
; Function Attrs: norecurse nounwind readnone ssp uwtable
define i64 @test(i64 %0) local_unnamed_addr #0 {
  %2 = and i64 %0, 63
  ret i64 %2
}

attributes #0 = { norecurse nounwind readnone ssp uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="non-leaf" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "probe-stack"="__chkstk_darwin" "stack-protector-buffer-size"="8" "target-cpu"="apple-a12" "target-features"="+aes,+crc,+crypto,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rcpc,+rdm,+sha2,+v8.3a,+zcm,+zcz" "unsafe-fp-math"="false" "use-soft-float"="false" }
[...]

which leads to this assembly:

_test:                                  ; @test
	.cfi_startproc
; %bb.0:
	and	x0, x0, #0x3f
	ret
	.cfi_endproc

Since that PR was merged, Rust nightly sets "frame-pointer"="all", which causes this assembly to be generated for the equivalent function:

_test:                                  ; @test
   .cfi_startproc
; %bb.0:
   stp	x29, x30, [sp, #-16]!           ; 16-byte Folded Spill
   mov	x29, sp
   .cfi_def_cfa w29, 16
   .cfi_offset w30, -8
   .cfi_offset w29, -16
   and	x0, x0, #0x3f
   ldp	x29, x30, [sp], #16             ; 16-byte Folded Reload
   ret
   .cfi_endproc
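
For anyone who wants to reproduce the comparison from the Rust side, here is a minimal sketch of an equivalent test case. The function names `mod64` and `sum_mod64` are hypothetical and not taken from the issue. `mod64` is a leaf function like the C `test` above, while `sum_mod64` makes real calls and is therefore non-leaf, so it would be expected to keep its frame setup under either "frame-pointer"="non-leaf" or "frame-pointer"="all".

```rust
// Leaf function: makes no calls. Under "frame-pointer"="non-leaf" it needs
// no frame setup; under "frame-pointer"="all" it saves and restores x29/x30.
// `#[inline(never)]` keeps it as a standalone function in the output.
#[inline(never)]
pub fn mod64(num: u64) -> u64 {
    num % 64
}

// Non-leaf function (hypothetical helper): it calls `mod64` twice, so the
// link register has to be preserved across the calls in any case.
#[inline(never)]
pub fn sum_mod64(a: u64, b: u64) -> u64 {
    mod64(a) + mod64(b)
}
```

Assuming the file is saved as test.rs, something like `rustc -O --crate-type lib --emit asm test.rs` on an aarch64 macOS host should produce a test.s to inspect. The symbol names will be mangled, and the exact output depends on the toolchain version.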

@jrmuizel: FYI

Labels

A-codegen (Area: Code generation)
C-bug (Category: This is a bug.)
I-slow (Issue: Problems and improvements with respect to performance of generated code.)
O-ios (Operating system: iOS)
O-macos (Operating system: macOS)
P-high (High priority)
T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
regression-from-stable-to-nightly (Performance or correctness regression from stable to nightly.)
