Description
I have the following simple SIMD-powered function which tests whether points are inside one of the given bounding boxes:
use std::arch::x86_64::*;

// Number of __m256i vectors per coordinate array; the concrete value here is
// only an example, the issue is not specific to it.
const N: usize = 8;

pub unsafe fn foo(
    x: &[__m256i; N],
    y: &[__m256i; N],
    z: &[__m256i; N],
    bboxes: &[[__m256i; 6]],
) -> [__m256i; N] {
    let mut res = [_mm256_setzero_si256(); N];
    for bbox in bboxes {
        for i in 0..N {
            // Inside the box along x: bbox[0] < x[i] < bbox[1]
            let tx = _mm256_and_si256(
                _mm256_cmpgt_epi32(x[i], bbox[0]),
                _mm256_cmpgt_epi32(bbox[1], x[i]),
            );
            // Inside along y: bbox[2] < y[i] < bbox[3]
            let ty = _mm256_and_si256(
                _mm256_cmpgt_epi32(y[i], bbox[2]),
                _mm256_cmpgt_epi32(bbox[3], y[i]),
            );
            let t = _mm256_and_si256(tx, ty);
            // Inside along z: bbox[4] < z[i] < bbox[5]
            let tz = _mm256_and_si256(
                _mm256_cmpgt_epi32(z[i], bbox[4]),
                _mm256_cmpgt_epi32(bbox[5], z[i]),
            );
            let t = _mm256_and_si256(t, tz);
            // A point counts as a hit if it is inside any of the boxes.
            res[i] = _mm256_or_si256(res[i], t);
        }
    }
    res
}
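For completeness, a call site might look roughly like the sketch below; the wrapper name is made up, and it assumes either that the crate is built with AVX2 enabled (e.g. -C target-feature=+avx2) or that foo is annotated with #[target_feature(enable = "avx2")]:

// Illustrative call site (the wrapper name is hypothetical): check for AVX2 at
// runtime before calling the unsafe intrinsic-based function.
fn classify(
    x: &[__m256i; N],
    y: &[__m256i; N],
    z: &[__m256i; N],
    bboxes: &[[__m256i; 6]],
) -> [__m256i; N] {
    assert!(std::is_x86_feature_detected!("avx2"));
    unsafe { foo(x, y, z, bboxes) }
}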
By inspecting the generated assembly we can see that, for some reason, the function copies the coordinates to the stack and reads them back from there on each iteration instead of loading them through the input pointers. The same behavior can be observed for a function which processes coordinate slices. This caching looks quite redundant, especially considering that noalias is enabled (i.e. the compiler should know that the memory holding the coordinates cannot change during execution of the function).
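The slice-processing variant mentioned above might look roughly like this (this is only a sketch, not the exact code behind the second link; it reuses the import and N from the snippet above, and foo_slices is an illustrative name):

// Hypothetical slice-based variant: same containment test, but the
// coordinates come in as plain &[__m256i] slices instead of fixed-size arrays.
pub unsafe fn foo_slices(
    x: &[__m256i],
    y: &[__m256i],
    z: &[__m256i],
    bboxes: &[[__m256i; 6]],
    res: &mut [__m256i],
) {
    for bbox in bboxes {
        for i in 0..res.len() {
            let tx = _mm256_and_si256(
                _mm256_cmpgt_epi32(x[i], bbox[0]),
                _mm256_cmpgt_epi32(bbox[1], x[i]),
            );
            let ty = _mm256_and_si256(
                _mm256_cmpgt_epi32(y[i], bbox[2]),
                _mm256_cmpgt_epi32(bbox[3], y[i]),
            );
            let tz = _mm256_and_si256(
                _mm256_cmpgt_epi32(z[i], bbox[4]),
                _mm256_cmpgt_epi32(bbox[5], z[i]),
            );
            let t = _mm256_and_si256(_mm256_and_si256(tx, ty), tz);
            res[i] = _mm256_or_si256(res[i], t);
        }
    }
}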
It looks like LLVM correctly hoists the coordinate loads out of the inner loop using its unlimited supply of virtual registers. That is exactly the behavior we want when there are enough physical registers, but when there are not, it spills the virtual register values to the stack instead of re-reading them from their original locations.
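A hand-written variant that expresses the "re-read from the original location" behavior could look like the sketch below. This is only an illustration of the desired codegen, not a verified workaround: foo_reload is not from the issue, it reuses the import and N from the first snippet, and nothing guarantees the optimizer will actually keep the loads inside the inner loop:

// Hypothetical variant that loads the coordinates through the original
// pointers on every inner-loop iteration. Since the inputs are noalias and
// immutable, re-reading them is what we would like the spilled version to do
// instead of round-tripping through the stack.
pub unsafe fn foo_reload(
    x: &[__m256i; N],
    y: &[__m256i; N],
    z: &[__m256i; N],
    bboxes: &[[__m256i; 6]],
) -> [__m256i; N] {
    let mut res = [_mm256_setzero_si256(); N];
    for bbox in bboxes {
        for i in 0..N {
            // Explicit (unaligned) loads from the caller's buffers.
            let xi = _mm256_loadu_si256(x.as_ptr().add(i));
            let yi = _mm256_loadu_si256(y.as_ptr().add(i));
            let zi = _mm256_loadu_si256(z.as_ptr().add(i));
            let tx = _mm256_and_si256(
                _mm256_cmpgt_epi32(xi, bbox[0]),
                _mm256_cmpgt_epi32(bbox[1], xi),
            );
            let ty = _mm256_and_si256(
                _mm256_cmpgt_epi32(yi, bbox[2]),
                _mm256_cmpgt_epi32(bbox[3], yi),
            );
            let tz = _mm256_and_si256(
                _mm256_cmpgt_epi32(zi, bbox[4]),
                _mm256_cmpgt_epi32(bbox[5], zi),
            );
            let t = _mm256_and_si256(_mm256_and_si256(tx, ty), tz);
            res[i] = _mm256_or_si256(res[i], t);
        }
    }
    res
}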
On Rust 1.51 the code from the first link does not have this issue, but the code from the second link still does.
UPD: See this comment for an additional example affecting cryptographic code.