Description
Summary
In one of my recent projects I was in dilemma about declaring constant lookup table (for CRC) as const or static. The table has 256
elements of u8
/u16
/u32
depending on checksum size. The app was meant to run on embedded micro-controller (thumbv6m-none-eabi
) platform with constrained resources (eg. 4KB of RAM memory).
The code usually looks as follows:
static/const LUT: [u16; 256] = [...];
then for each byte of data, the following code is executed:
digest = (digest >> 8) ^ LUT[(digest as u8 ^ input) as usize];
While inspecting the generated assembly I realized that when LUT
is declared as const
, there is a terrible performance overhead on whole operation. For each input byte, the space for whole LUT
is allocated on stack (512 bytes), all contents is copied from origin in .text/.rodata
with __aeabi_memcpy
, single item is taken and the copy is removed. And this for EVERY BYTE.
When the table is static
, the code behaves in optimal way. Base LUT
pointer in .text/.rodata
is advanced with offset and single byte is read from original source. No need to allocate 512 bytes on stack and no need to copy it.
There is also a third option utilizing rvalue_static_promotion
- const LUT: &[u16; 256] = &[...];
which generates the same assembly as static
version.
I strongly believe that the behavior is correct in terms of rust documentation - const
values are something like templates
that will be initialized in place when used.
So this is when the clippy parts starts. There is a lint called large_const_arrays
which should detect such situations, however it has an unexpectedly high threshold of 512KB (128 time more then my available RAM). So it didn't fire for obvious performance upgrade opportunity.
While I understand that for small arrays the initialization could be probably unrolled, I can't see any advantage of using const arrays instead of static arrays. The latter have an advantage of having own address, but the const version also has it (implicitly, pointing to source for copy).
Maybe it will be okay to decrease the limit to some small value, like a couple of cpu words?
Lint Name
large_const_arrays
Reproducer
No response
Version
rustc 1.65.0-nightly (748038961 2022-08-25)
binary: rustc
commit-hash: 7480389611f9d04bd34adf41a2b3029be4eb815e
commit-date: 2022-08-25
host: x86_64-pc-windows-msvc
release: 1.65.0-nightly
LLVM version: 15.0.0