Open
Labels
A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
A-SIMD: Area: SIMD (Single Instruction Multiple Data)
C-bug: Category: This is a bug.
O-PowerPC: Target: PowerPC processors
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
Description
#![feature(repr_simd, powerpc_target_feature)]
#![allow(non_camel_case_types)]

#[repr(simd)]
pub struct u32x4(u32, u32, u32, u32);

impl u32x4 {
    #[inline]
    // #[inline(always)]
    fn splat(x: u32) -> Self {
        u32x4(x, x, x, x)
    }
}

#[target_feature(enable = "altivec")]
pub unsafe fn splat_u32x4(x: u32) -> u32x4 {
    u32x4::splat(x)
}

With #[inline], this code produces a function call to u32x4::splat inside splat_u32x4 (b example::u32x4::splat). The call is not eliminated, even though the method is module-private. With #[inline(always)], u32x4::splat is inlined into splat_u32x4, and no code is generated for u32x4::splat.
#[inline] should not be needed here, much less #[inline(always)], yet without #[inline(always)] this produces bad codegen.
Removing the #[target_feature] attribute from splat_u32x4 fixes the issue, with no #[inline] necessary (godbolt). So there must be some interaction between inlining and target features going on here.
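For reference, the #[inline(always)] workaround described above can be sketched in a form that builds on stable Rust. This is only an illustration of the code shape, not a reproduction of the PowerPC codegen: the array-backed U32x4 type is a stand-in for the #[repr(simd)] struct, and "sse2" on x86_64 is substituted for "altivec" purely so the sketch compiles on a common host. Both substitutions are assumptions and do not appear in the report.

```rust
// Stand-in for the report's #[repr(simd)] type so the sketch builds on
// stable Rust. The array wrapper is an assumption, not the original type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct U32x4([u32; 4]);

impl U32x4 {
    // Workaround from the report: force inlining of the private helper
    // instead of relying on plain #[inline].
    #[inline(always)]
    fn splat(x: u32) -> Self {
        U32x4([x; 4])
    }
}

// The report enables "altivec" on PowerPC. "sse2" on x86_64 is used here
// only so the sketch compiles on a common host; on other architectures
// the attribute is simply dropped.
#[cfg_attr(target_arch = "x86_64", target_feature(enable = "sse2"))]
pub unsafe fn splat_u32x4(x: u32) -> U32x4 {
    U32x4::splat(x)
}
```

Calling `unsafe { splat_u32x4(7) }` returns a value with all four lanes set to 7. Whether the helper is actually inlined on a given target still has to be checked in the generated assembly, as was done for the original report.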