Suboptimal inlining decisions #49541

Open

I suspect this can happen in more cases, but here is how I observed this:

pub fn foo() -> Box<[u8]> {
    vec![0].into_boxed_slice()
}

This compiles to:

  sub rsp, 56
  lea rdx, [rsp + 8]
  mov edi, 1
  mov esi, 1
  call __rust_alloc@PLT
  test rax, rax
  je .LBB2_1
  mov byte ptr [rax], 0
  mov edx, 1
  add rsp, 56
  ret
.LBB2_1:
  (snip oom handling)

This is pretty much to the point.

Now duplicate the function so that two functions call into_boxed_slice(), and the compiler stops inlining it altogether. This:

  • adds the full-blown Vec::into_boxed_slice implementation (63 lines of assembly)
  • adds ptr::drop_in_place
  • and changes the function above to:
  sub rsp, 56
  lea rdx, [rsp + 8]
  mov edi, 1
  mov esi, 1
  call __rust_alloc@PLT
  test rax, rax
  je .LBB4_1
  mov byte ptr [rax], 0
  mov qword ptr [rsp + 8], rax
  mov qword ptr [rsp + 16], 1
  mov qword ptr [rsp + 24], 1
  lea rdi, [rsp + 8]
  call <alloc::vec::Vec<T>>::into_boxed_slice
  add rsp, 56
  ret
.LBB4_1:
  (snip oom handling)
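
For reference, the duplicated setup described above could look like the following minimal sketch. The name `bar` is hypothetical (the report does not name the second function), and the exact assembly will likely vary with the compiler and LLVM version.

```rust
// Original function from the report.
pub fn foo() -> Box<[u8]> {
    vec![0].into_boxed_slice()
}

// Hypothetical duplicate: a second caller of `into_boxed_slice`.
// With two call sites, the report observes that the call is no
// longer inlined into either function.
pub fn bar() -> Box<[u8]> {
    vec![0].into_boxed_slice()
}
```

Compiling this as a library with something like `rustc -O --crate-type lib --emit asm` should make it possible to compare the output against the listings above.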

The threshold at which inlining stops seems pretty low for this particular case. Even if it makes sense for some calls across the codebase not to be inlined, it would be good if the compiler still inlined the calls where the result is clearly beneficial.

Labels: A-codegen (Area: Code generation), C-enhancement (Category: An issue proposing an enhancement or a PR with one), T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue), WG-llvm (Working group: LLVM backend code generation)