I tried this code:
#![no_std]
fn edit(d: &mut [u8]) {
    d[40] = 1;
    d[80] = 1;
    d[120] = 1;
    d[160] = 1;
}

// Returning the array is fine
pub fn array_good() -> [u8; 200] {
    let mut d = [0u8; 200];
    edit(&mut d);
    d
}

// Setting the bytes within the option is also fine
pub fn option_good() -> Option<[u8; 200]> {
    let mut o = Some([0u8; 200]);
    if let Some(ref mut d) = o {
        edit(d);
    }
    o
}

// When returning an initialized array in an Option,
// the initialization gets split into multiple separate memset/memclr calls,
// just to optimize away a few redundant byte clears.
pub fn option_bad() -> Option<[u8; 200]> {
    Some(array_good())
}
On godbolt: https://godbolt.org/z/e8z3ornos
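To make clear that the three functions differ only in generated code and not in behavior, here is a minimal sketch that checks they return the same data. It assumes an ordinary `std` binary, so `#![no_std]` is dropped, and `expected` is a hypothetical helper that is not part of the original reproduction.

```rust
// Assumption: built as a normal `std` binary, so `#![no_std]` is omitted here.

fn edit(d: &mut [u8]) {
    d[40] = 1;
    d[80] = 1;
    d[120] = 1;
    d[160] = 1;
}

pub fn array_good() -> [u8; 200] {
    let mut d = [0u8; 200];
    edit(&mut d);
    d
}

pub fn option_good() -> Option<[u8; 200]> {
    let mut o = Some([0u8; 200]);
    if let Some(ref mut d) = o {
        edit(d);
    }
    o
}

pub fn option_bad() -> Option<[u8; 200]> {
    Some(array_good())
}

/// Hypothetical helper: the array contents every variant should produce.
fn expected() -> [u8; 200] {
    let mut e = [0u8; 200];
    for i in [40, 80, 120, 160] {
        e[i] = 1;
    }
    e
}

fn main() {
    let e = expected();
    // All three variants must agree; only their codegen differs.
    assert_eq!(array_good(), e);
    assert_eq!(option_good(), Some(e));
    assert_eq!(option_bad(), Some(e));
    println!("all three functions return identical data");
}
```

Since the results are identical, any difference in the assembly is purely an optimization artifact.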
I expected to see this happen: In all three cases I expected roughly similar code, with a single call to memclr followed by four byte stores (plus one store for the discriminant in the Option cases).
Instead, this happened: The third implementation (option_bad) splits the initialization into five separate calls to memclr. This is unnecessary and inefficient.
example::array_good:
    push {r4, r6, r7, lr}
    add r7, sp, #8
    movs r1, #200
    mov r4, r0
    bl __aeabi_memclr
    movs r0, #1
    strb.w r0, [r4, #160]
    strb.w r0, [r4, #120]
    strb.w r0, [r4, #80]
    strb.w r0, [r4, #40]
    pop {r4, r6, r7, pc}
example::option_good:
    push {r4, r6, r7, lr}
    add r7, sp, #8
    mov r4, r0
    adds r0, #1
    movs r1, #200
    bl __aeabi_memclr
    movs r0, #1
    strb.w r0, [r4, #161]
    strb.w r0, [r4, #121]
    strb.w r0, [r4, #81]
    strb.w r0, [r4, #41]
    strb r0, [r4]
    pop {r4, r6, r7, pc}
example::option_bad:
    push {r4, r6, r7, lr}
    add r7, sp, #8
    mov r4, r0
    adds r0, #1
    movs r1, #40
    bl __aeabi_memclr
    add.w r0, r4, #42
    movs r1, #39
    bl __aeabi_memclr
    add.w r0, r4, #82
    movs r1, #39
    bl __aeabi_memclr
    add.w r0, r4, #122
    movs r1, #39
    bl __aeabi_memclr
    add.w r0, r4, #162
    movs r1, #39
    bl __aeabi_memclr
    movs r0, #1
    strb.w r0, [r4, #161]
    strb.w r0, [r4, #121]
    strb.w r0, [r4, #81]
    strb.w r0, [r4, #41]
    strb r0, [r4]
    pop {r4, r6, r7, pc}
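A note on reading the offsets: in the two Option variants the stores land at #41, #81, #121 and #161 instead of #40, #80, #120 and #160, and there is an extra `strb r0, [r4]`. That is consistent with the discriminant sitting in byte 0 and the array payload starting at byte 1. The layout of `Option<[u8; 200]>` is not guaranteed by the language, so the following is only a sketch for inspecting what a given compiler actually does; `payload_offset` is a hypothetical helper introduced for illustration.

```rust
use core::mem::{align_of, size_of};

/// Hypothetical helper: byte offset of the array payload inside a `Some`
/// value, measured from the start of the `Option`. Returns `None` for `None`.
fn payload_offset(o: &Option<[u8; 200]>) -> Option<usize> {
    let base = o as *const Option<[u8; 200]> as usize;
    o.as_ref().map(|d| d.as_ptr() as usize - base)
}

fn main() {
    let o: Option<[u8; 200]> = Some([0u8; 200]);
    // The printed values are implementation details, not language guarantees.
    println!("size  = {}", size_of::<Option<[u8; 200]>>());
    println!("align = {}", align_of::<Option<[u8; 200]>>());
    println!("payload offset = {:?}", payload_offset(&o));
}
```

If the payload offset is 1, the five memclr calls in option_bad cover bytes 1..=40, 42..=80, 82..=120, 122..=160 and 162..=200, which skips exactly the four bytes that are later overwritten with 1.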
The behavior is not target-specific: it is the same on x86. For larger arrays and initialized segments it will also call memset.
-C opt-level=1 doesn't inline and thus suppresses the issue.
Meta
rustc --version --verbose:
rustc 1.53.0-nightly (b84932674 2021-04-21)
This behavior is present between 1.45.0 and current nightly. Before 1.45.0 all three implementations split the initialization.
I noticed this when looking at #83022 (comment)