LLVM pointer range loop / autovectorization regression part two #37276

Open

Description

This is a follow-up to #35662. The optimizer regression that caused that bug is still not fixed.

That issue produced a simplified test case, and the fix addressed that case. That's good. However, I have not been able to remove my workaround, so the problem still persists in the original code.

The issue appears for an 8x8 kernel and disappears if the kernel is shrunk to 4x4, so it seems to be related to the sheer size of the function, or to how far loop unrolling goes.

Preamble for the code that produces the desired codegen:

let mut ab: [[f32; 8]; 8];
ab = ::std::mem::uninitialized();
loop8!(i, loop8!(j, ab[i][j] = 0.));

The loop8 macro expands its body statically, once for each index 0 through 7, so the nested invocation above corresponds to 64 individual assignments.
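For readers who have not seen the macro, here is a minimal sketch of how such a static expansion could be written. The definition below is an assumption for illustration and is not copied from the original code. The helper `zero_unrolled` is likewise hypothetical, and it takes an already initialized array so that the sketch avoids `mem::uninitialized`.

```rust
// Hypothetical definition of `loop8`: binds `$i` to each constant 0..=7
// in turn and repeats the expression `$e` after each binding.
macro_rules! loop8 {
    ($i:ident, $e:expr) => {{
        let $i = 0; $e;
        let $i = 1; $e;
        let $i = 2; $e;
        let $i = 3; $e;
        let $i = 4; $e;
        let $i = 5; $e;
        let $i = 6; $e;
        let $i = 7; $e;
    }};
}

/// Zeroes an 8x8 array through the nested macro, which expands to
/// 8 * 8 = 64 assignment statements with constant indices.
pub fn zero_unrolled(ab: &mut [[f32; 8]; 8]) {
    loop8!(i, loop8!(j, ab[i][j] = 0.));
}
```

Because the caller passes the identifier in, the `let $i = ...` bindings are visible to the expression that the caller also wrote, which is why `ab[i][j]` resolves inside the expansion.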

Initialization for the code that does not optimize well:

let mut ab: [[f32; 8]; 8];
ab = [[0.; 8]; 8];

Another variant that does not optimize well:

let mut ab: [[f32; 8]; 8];
ab = ::std::mem::uninitialized();
for i in 0..8 {
    for j in 0..8 {
        ab[i][j] = 0.;
    }
}
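To compare the generated code for these initialization variants in isolation, one option is to wrap each of them in its own non-inlined function and inspect the output of something like `rustc -O --emit=asm`. The sketch below is only an illustration: the function names are hypothetical, and `MaybeUninit` is swapped in for `mem::uninitialized`, so the resulting codegen may differ from what the original snippets produce.

```rust
use std::mem::MaybeUninit;

/// Array-literal initialization, as in the first poorly optimizing snippet.
#[inline(never)]
pub fn init_literal() -> [[f32; 8]; 8] {
    [[0.; 8]; 8]
}

/// Nested-loop initialization over uninitialized storage.
/// Assumption: MaybeUninit stands in for the report's mem::uninitialized.
#[inline(never)]
pub fn init_loops() -> [[f32; 8]; 8] {
    let mut ab = MaybeUninit::<[[f32; 8]; 8]>::uninit();
    // View the 8x8 array as 64 contiguous f32 values.
    let base = ab.as_mut_ptr().cast::<f32>();
    for i in 0..8 {
        for j in 0..8 {
            // SAFETY: i * 8 + j < 64, so the write stays inside the array.
            unsafe { base.add(i * 8 + j).write(0.) };
        }
    }
    // SAFETY: all 64 elements were written in the loops above.
    unsafe { ab.assume_init() }
}
```

Both functions return the same all-zero array, so any difference between them is purely a matter of how the stores are emitted.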

Full reproducer in the next comment.

Labels

A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
A-autovectorization: Area: Autovectorization, which can impact perf or code size
C-enhancement: Category: An issue proposing an enhancement or a PR with one.
I-slow: Issue: Problems and improvements with respect to performance of generated code.
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
