`match` an `std::cmp::Ordering` generates less optimized code in nightly

I'm interested in this comment:

https://github.com/rust-lang/rust/blob/3824017f8e57ce9dac6d69a1ce07f41c7034f82d/library/core/src/slice/mod.rs#L2204-L2217

So I did some testing:

---

The following code snippet is the "bad" one (`match` an `std::cmp::Ordering`) in above comment, but with `f` manually inlined.

```rust
use std::cmp::Ordering;
use std::cmp::Ordering::*;

pub fn custom_binary_search_1<'a>(slice: &'a [u8], x: &u8) -> Result<usize, usize> {
    let s = slice;
    let mut left = 0;
    let mut right = s.len();
    while left < right {
        // never overflow because `slice::len()` max is `isize::MAX`.
        let mid = (left + right) / 2;
        // SAFETY: the call is made safe by the following invariants:
        // - `mid >= 0`
        // - `mid < size`: `mid` is limited by `[left; right)` bound.
        let cmp = unsafe{ s.get_unchecked(mid).cmp(x) };
        match cmp {
            Less => left = mid + 1,
            Greater => right = mid,
            Equal => return Ok(mid),
        }
    }
    Err(left)
}
```

---

In version 1.53.0-beta.12 (2021-06-12 e7a67cc91667fd2a6679), release mode, it generates the following x86 assembly (related parts only):

```asm
        jmp	.LBB0_3
.LBB0_7:
	add	rdx, 1                     # mid += 1
	mov	rcx, rdx                   # left = mid
	cmp	rcx, rsi                   # while left < right
	jae	.LBB0_9                    # (end loop)
.LBB0_3:
	lea	rdx, [rcx + rsi]           # mid = left + right
	shr	rdx                        # mid /= 2
	cmp	byte ptr [rdi + rdx], r8b  # match s[mid].cmp(x)
	jb	.LBB0_7                    # Less => ...
	je	.LBB0_6                    # Equal => ...
	mov	rsi, rdx                   # Greater => right = mid
	cmp	rcx, rsi                   # while left < right
	jb	.LBB0_3                    # (continue loop)
.LBB0_6:
	xor	eax, eax                   # return (0, mid)
	ret
```

(Which is identical to manually expanding to `if..else if..else`, however it's not the point of this issue)

---

In version 1.55.0-nightly (2021-06-20 e82b65026dcadf0998e3), release mode, the generated assembly code is:

```asm
	mov	r9, -1                       # constant of std::cmp::Ordering::Less
	jmp	.LBB0_3

.LBB0_5:                                #   in Loop: Header=BB0_3 Depth=1
	mov	rcx, rdx                     # left = mid         
	add	rcx, 1                       # left += 1
	mov	rdx, rsi                     # temp = right

.LBB0_6:                                #   in Loop: Header=BB0_3 Depth=1
	mov	rsi, rdx                     # right = mid
	cmp	rcx, rdx                     # while left < right
	jae	.LBB0_7                      # (end loop)

.LBB0_3:                                # =>This Inner Loop Header: Depth=1
	lea	rdx, [rcx + rsi]             # mid = left + right
	shr	rdx                          # mid /= 2
	xor	eax, eax                     # cmp = 0
	cmp	byte ptr [rdi + rdx], r8b
	setne	al                           # cmp = s[mid] == x ? 0 : 1
	cmovb	rax, r9                      # cmp = s[mid] < x ? -1 : temp
	cmp	rax, -1                      # if cmp == Less
	je	.LBB0_5                      # Less => ...
# %bb.4:                                #   in Loop: Header=BB0_3 Depth=1
	cmp	rax, 1                       # if cmp == Greater
	je	.LBB0_6                      # Greater => ...
# %bb.9:
	ret
```

It's same as the "bad" result in original comment, and obviously much unoptimized.

I think `match` an `Ordering` should be a quite common use case, so shouldn't be deoptimized.

---

### Version it worked on

Rust Playground: 1.53.0

Rust Playground: 1.53.0-beta.12 (2021-06-12 e7a67cc91667fd2a6679)

### Version with regression

a55748ffe94e71f841c7b1d752779b0db138b342

	let cmp = f(unsafe { self.get_unchecked(mid) });

	// The reason why we use if/else control flow rather than match
	// is because match reorders comparison operations, which is perf sensitive.
	// This is x86 asm for u8: https://rust.godbolt.org/z/8Y8Pra.
	if cmp == Less {
	left = mid + 1;
	} else if cmp == Greater {
	right = mid;
	} else {
	// SAFETY: same as the `get_unchecked` above
	unsafe { crate::intrinsics::assume(mid < self.len()) };
	return Ok(mid);
	}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`match` an `std::cmp::Ordering` generates less optimized code in nightly #86511

Version it worked on

Version with regression

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

match an std::cmp::Ordering generates less optimized code in nightly #86511

Description

Version it worked on

Version with regression

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

`match` an `std::cmp::Ordering` generates less optimized code in nightly #86511