Initialization of stack-based arrays does not get automatically vectorized. #112459

Open
@TDecking

Consider the following functions:

use std::arch::x86_64::*;

const C: usize = 1024;
const VAL: u64 = 1337;

#[target_feature(enable = "avx2")]
pub unsafe fn very_fast(x: fn(&[__m256i; C / 2])) {
  let slice = [_mm256_set_epi64x(0, 1337, 0, 1337); C / 2];
  x(&slice);
}

#[target_feature(enable = "avx2")]
pub unsafe fn eqfast(x: fn(&[__m128i; C])) {
  let slice = [_mm_set_epi64x(0, 1337); C];
  x(&slice);
}

pub unsafe fn fast(x: fn(&[__m128i; C])) {
  let slice = [_mm_set_epi64x(0, 1337); C];
  x(&slice);
}

pub fn slow(x: fn(&[u128; C])) {
  let slice = [VAL as u128; C];
  x(&slice);
}

godbolt link

Currently (Rust 1.70.0), the following can be observed:

  • slow uses scalar instructions. It could use SSE2 instructions, but doesn't.
  • fast and eqfast use SSE2 instructions. eqfast has AVX2 enabled and could use AVX instructions, but doesn't.
  • very_fast uses AVX instructions.

It would likely be better if the following were true instead:

  • slow and fast use SSE2 instructions.
  • eqfast and very_fast use AVX instructions.
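
The expectation that slow could be compiled like fast rests on both initializers producing the same bytes. The following is a minimal sketch of that equivalence, assuming an x86_64 target (where SSE2 is part of the baseline). The helper names `vector_init_as_u128`, `scalar_init` and `inits_match` are hypothetical and exist only for this illustration. It says nothing about which instructions the compiler emits; it only checks that the two arrays hold the same contents.

```rust
use std::arch::x86_64::{__m128i, _mm_set_epi64x};
use std::mem::transmute;

const C: usize = 1024;
const VAL: u64 = 1337;

/// Builds the array the way `fast` does, then reinterprets it as `[u128; C]`.
/// (Hypothetical helper, not part of the original report.)
pub fn vector_init_as_u128() -> [u128; C] {
    // SAFETY: SSE2 is always available on x86_64, which this sketch assumes.
    // `_mm_set_epi64x(hi, lo)` places `lo` in the low 64 bits of the vector.
    let lanes: [__m128i; C] = [unsafe { _mm_set_epi64x(0, VAL as i64) }; C];
    // SAFETY: `__m128i` and `u128` are both 16 bytes wide, and every bit
    // pattern is a valid value of either type.
    unsafe { transmute::<[__m128i; C], [u128; C]>(lanes) }
}

/// Builds the array the way `slow` does.
pub fn scalar_init() -> [u128; C] {
    [VAL as u128; C]
}

/// Returns true when both initializers yield identical contents.
pub fn inits_match() -> bool {
    vector_init_as_u128() == scalar_init()
}
```

Under these assumptions, `inits_match()` returns true, so a store sequence that is valid for fast would fill the buffer in slow with the same data.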

Labels

  • A-autovectorization (Area: Autovectorization, which can impact perf or code size)
  • C-bug (Category: This is a bug.)
  • I-slow (Issue: Problems and improvements with respect to performance of generated code.)
  • T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
