Saturating truncation produces extra instructions

See: https://llvm.godbolt.org/z/4KdejfEsG

The following two functions:

```
declare <4 x i16> @llvm.smax.v4i16(<4 x i16>, <4 x i16>)
declare <4 x i16> @llvm.smin.v4i16(<4 x i16>, <4 x i16>)
declare <8 x i16> @llvm.smax.v8i16(<8 x i16>, <8 x i16>)
declare <8 x i16> @llvm.smin.v8i16(<8 x i16>, <8 x i16>)

define <4 x i8> @saturate4(<4 x i16> %x) {
  %1 = tail call <4 x i16> @llvm.smax.v4i16(<4 x i16> %x, <4 x i16> zeroinitializer)
  %2 = tail call <4 x i16> @llvm.smin.v4i16(<4 x i16> %1, <4 x i16> <i16 255, i16 255, i16 255, i16 255>)
  %3 = trunc <4 x i16> %2 to <4 x i8>
  ret <4 x i8> %3
}

define <8 x i8> @saturate8(<8 x i16> %x) {
  %1 = tail call <8 x i16> @llvm.smax.v8i16(<8 x i16> %x, <8 x i16> zeroinitializer)
  %2 = tail call <8 x i16> @llvm.smin.v8i16(<8 x i16> %1, <8 x i16> <i16 255, i16 255, i16 255, i16 255, i16 255, i16 255, i16 255, i16 255>)
  %3 = trunc <8 x i16> %2 to <8 x i8>
  ret <8 x i8> %3
}
```

produce the following x86 assembly:

```
.LCPI0_0:
        .short  255                             # 0xff
        .short  255                             # 0xff
        .short  255                             # 0xff
        .short  255                             # 0xff
        .zero   2
        .zero   2
        .zero   2
        .zero   2
saturate4:                              # @saturate4
        pxor    xmm1, xmm1
        pmaxsw  xmm0, xmm1
        pminsw  xmm0, xmmword ptr [rip + .LCPI0_0]
        packuswb        xmm0, xmm0
        ret
saturate8:                              # @saturate8
        packuswb        xmm0, xmm0
        ret
```

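The `saturate4` function emits a redundant `pxor`/`pmaxsw`/`pminsw` sequence (plus a constant-pool load), even though `packuswb` already saturates each signed 16-bit lane to the range 0 to 255, as the `saturate8` output shows. I believe the `trunc` followed by `shufflevector` is being optimized before the saturating truncation can be detected.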
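To make the redundancy concrete, here is a minimal scalar sketch of one lane of each code path. It is only a model: the function names (`ir_lane`, `packus_lane`, `saturate4_lane`, `saturate8_lane`, `count_mismatches`) are hypothetical and not part of LLVM or portable-simd, and `packus_lane` assumes the documented per-lane behaviour of `packuswb` (signed 16-bit input, unsigned 8-bit saturated output).

```rust
/// One lane of the IR above: smax(x, 0), smin(_, 255), then trunc to 8 bits.
fn ir_lane(x: i16) -> u8 {
    let clamped = x.max(0).min(255);
    clamped as u8 // `as` keeps the low 8 bits, like `trunc`
}

/// Assumed per-lane model of `packuswb`: signed i16 to u8 with unsigned saturation.
fn packus_lane(x: i16) -> u8 {
    if x < 0 {
        0
    } else if x > 255 {
        255
    } else {
        x as u8
    }
}

/// One lane of the emitted `saturate4` code: pmaxsw, pminsw, then packuswb.
fn saturate4_lane(x: i16) -> u8 {
    packus_lane(x.max(0).min(255))
}

/// One lane of the emitted `saturate8` code: packuswb only.
fn saturate8_lane(x: i16) -> u8 {
    packus_lane(x)
}

/// Counts the i16 inputs for which the three lane models disagree.
fn count_mismatches() -> usize {
    (i16::MIN..=i16::MAX)
        .filter(|&x| {
            let reference = ir_lane(x);
            saturate4_lane(x) != reference || saturate8_lane(x) != reference
        })
        .count()
}
```

Under these assumptions, `count_mismatches()` checks every possible 16-bit lane value, so a result of zero means the explicit clamp in `saturate4` adds nothing over the bare `packuswb` used for `saturate8`.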

Discovered in https://github.com/rust-lang/portable-simd/issues/369#issuecomment-1751589313

Saturating truncation produces extra instructions #68466
