Description
I'm interested in this comment:
rust/library/core/src/slice/mod.rs
Lines 2204 to 2217 in 3824017
So I did some testing:
The following code snippet is the "bad" one (match
an std::cmp::Ordering
) in above comment, but with f
manually inlined.
use std::cmp::Ordering;
use std::cmp::Ordering::*;
pub fn custom_binary_search_1<'a>(slice: &'a [u8], x: &u8) -> Result<usize, usize> {
let s = slice;
let mut left = 0;
let mut right = s.len();
while left < right {
// never overflow because `slice::len()` max is `isize::MAX`.
let mid = (left + right) / 2;
// SAFETY: the call is made safe by the following invariants:
// - `mid >= 0`
// - `mid < size`: `mid` is limited by `[left; right)` bound.
let cmp = unsafe{ s.get_unchecked(mid).cmp(x) };
match cmp {
Less => left = mid + 1,
Greater => right = mid,
Equal => return Ok(mid),
}
}
Err(left)
}
In version 1.53.0-beta.12 (2021-06-12 e7a67cc), release mode, it generates the following x86 assembly (related parts only):
jmp .LBB0_3
.LBB0_7:
add rdx, 1 # mid += 1
mov rcx, rdx # left = mid
cmp rcx, rsi # while left < right
jae .LBB0_9 # (end loop)
.LBB0_3:
lea rdx, [rcx + rsi] # mid = left + right
shr rdx # mid /= 2
cmp byte ptr [rdi + rdx], r8b # match s[mid].cmp(x)
jb .LBB0_7 # Less => ...
je .LBB0_6 # Equal => ...
mov rsi, rdx # Greater => right = mid
cmp rcx, rsi # while left < right
jb .LBB0_3 # (continue loop)
.LBB0_6:
xor eax, eax # return (0, mid)
ret
(Which is identical to manually expanding to if..else if..else
, however it's not the point of this issue)
In version 1.55.0-nightly (2021-06-20 e82b650), release mode, the generated assembly code is:
mov r9, -1 # constant of std::cmp::Ordering::Less
jmp .LBB0_3
.LBB0_5: # in Loop: Header=BB0_3 Depth=1
mov rcx, rdx # left = mid
add rcx, 1 # left += 1
mov rdx, rsi # temp = right
.LBB0_6: # in Loop: Header=BB0_3 Depth=1
mov rsi, rdx # right = mid
cmp rcx, rdx # while left < right
jae .LBB0_7 # (end loop)
.LBB0_3: # =>This Inner Loop Header: Depth=1
lea rdx, [rcx + rsi] # mid = left + right
shr rdx # mid /= 2
xor eax, eax # cmp = 0
cmp byte ptr [rdi + rdx], r8b
setne al # cmp = s[mid] == x ? 0 : 1
cmovb rax, r9 # cmp = s[mid] < x ? -1 : temp
cmp rax, -1 # if cmp == Less
je .LBB0_5 # Less => ...
# %bb.4: # in Loop: Header=BB0_3 Depth=1
cmp rax, 1 # if cmp == Greater
je .LBB0_6 # Greater => ...
# %bb.9:
ret
It's same as the "bad" result in original comment, and obviously much unoptimized.
I think match
an Ordering
should be a quite common use case, so shouldn't be deoptimized.
Version it worked on
Rust Playground: 1.53.0
Rust Playground: 1.53.0-beta.12 (2021-06-12 e7a67cc)