Redundant memset in convolution kernel #142469

Open
@yugr

Description

This is probably a minor issue, but I thought I'd still report it.

This code

pub fn conv(v: &[i32], kernel: &[i32]) -> Vec<i32> {
    assert!(v.len() >= kernel.len());
    let n = v.len() - kernel.len() + 1;
    let mut ans = vec![0; n];
    for i in 0..n {
        let mut acc = 0;
        for j in 0..kernel.len() {
            acc += v[i + j] * kernel[j];
        }
        ans[i] = acc;
    }
    ans
}

(inspired by a URLO post) allocates the resulting vector using __rust_alloc_zeroed:

$ rustc +nightly --crate-type=rlib -O -C target-cpu=native repro.rs
$ objdump -rd librepro.rlib | c++filt
...
  63:   48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 6a <repro::conv::he33c9a47571c7bab+0x6a>
                        66: R_X86_64_GOTPCREL   __rust_no_alloc_shim_is_unstable-0x4
  6a:   0f b6 00                movzbl (%rax),%eax
  6d:   be 04 00 00 00          mov    $0x4,%esi
  72:   4c 89 c7                mov    %r8,%rdi
  75:   4d 89 c4                mov    %r8,%r12
  78:   ff 15 00 00 00 00       callq  *0x0(%rip)        # 7e <repro::conv::he33c9a47571c7bab+0x7e>
                        7a: R_X86_64_GOTPCREL   _RNvCs9khzRxRKOtI_7___rustc19___rust_alloc_zeroed-0x4

but then also calls memset in the kernel.len() == 0 case:

 1d1:   4c 89 e7                mov    %r12,%rdi
 1d4:   31 f6                   xor    %esi,%esi
 1d6:   4c 89 c2                mov    %r8,%rdx
 1d9:   ff 15 00 00 00 00       callq  *0x0(%rip)        # 1df <repro::conv::he33c9a47571c7bab+0x1df>
                        1db: R_X86_64_GOTPCREL  memset-0x4

This memset is redundant because memory returned by __rust_alloc_zeroed is guaranteed to be already zero.
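
A reduced sketch of what the function degenerates to when kernel is empty may help show where the memset comes from. This is an assumption about the cause, not something confirmed from the LLVM IR: with an empty kernel the inner loop never runs, so the outer loop stores acc == 0 into every element, and a zero-filling loop like that is the kind of pattern an optimizer may turn into a memset call. The name conv_empty_kernel is hypothetical and not part of the original repro.

```rust
/// What `conv` reduces to when `kernel.len() == 0`: the inner loop body
/// never executes, so every element is overwritten with `acc == 0`.
/// (Hypothetical helper for illustration; not part of the original repro.)
pub fn conv_empty_kernel(v: &[i32]) -> Vec<i32> {
    // v.len() - kernel.len() + 1 with kernel.len() == 0
    let n = v.len() + 1;
    // The buffer is requested as zeroed memory here.
    let mut ans = vec![0; n];
    for i in 0..n {
        // Stores zeros over memory that is already zero; presumably this
        // loop is what ends up as the memset call in the disassembly.
        ans[i] = 0;
    }
    ans
}
```

The result is a vector of v.len() + 1 zeros, which is exactly what the zeroed allocation already provides.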

Tested with nightly:

$ rustc +nightly --version
rustc 1.89.0-nightly (6ccd44760 2025-06-08)
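
For comparison, here is a sketch of the same computation written without the pre-zeroed buffer, building the output with collect instead of vec![0; n] followed by indexed stores. The name conv_collect is hypothetical, and the generated code for this variant has not been checked, so treat it as a possible source-level workaround rather than a confirmed fix.

```rust
/// Same result as `conv`, but the output vector is built by `collect`
/// rather than being zero-initialized and then overwritten.
/// (Hypothetical variant; its codegen has not been verified.)
pub fn conv_collect(v: &[i32], kernel: &[i32]) -> Vec<i32> {
    assert!(v.len() >= kernel.len());
    let n = v.len() - kernel.len() + 1;
    (0..n)
        .map(|i| {
            // Dot product of the kernel with the window starting at `i`.
            // For an empty kernel the window is empty and the sum is 0.
            kernel
                .iter()
                .zip(&v[i..i + kernel.len()])
                .map(|(k, x)| k * x)
                .sum::<i32>()
        })
        .collect()
}
```

With an empty kernel this still returns v.len() + 1 zeros, matching the behavior of the original function.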

Metadata

    Labels

    A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
    C-optimization: Category: An issue highlighting optimization opportunities or PRs implementing such
    I-slow: Issue: Problems and improvements with respect to performance of generated code.
    T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
