Description
I'll preface this by saying that the compiler generated much worse code before 1.52, with up to 4 memcpy calls where there should have been one. So the LLVM 12 upgrade was great :)
Here's the sample:
https://godbolt.org/z/TdPdWqq6q
As you can see, both allocate_naive and allocate_ptr_write generate two memcpy calls (and significant stack usage). Manual unsafe optimization with allocate_separate_write generates a single memcpy. The same can be observed on x86_64: https://godbolt.org/z/6Mq9vKP3s
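For readers who can't open the links, here is a rough sketch of what the three functions might look like. It is a reconstruction from the function names only, not the exact Godbolt source: the `header` field, the field layout of MyStruct, and the function bodies are all assumptions.

```rust
use std::cell::RefCell;
use std::mem::MaybeUninit;
use std::ptr;

pub type Payload = RefCell<[u8; 1000]>;

// Hypothetical layout: the `header` field is an assumption for illustration.
pub struct MyStruct<T> {
    pub header: u32,
    pub payload: T,
}

// Build the struct by value, then move it into the Box.
pub fn allocate_naive(payload: Payload) -> Box<MyStruct<Payload>> {
    Box::new(MyStruct { header: 0, payload })
}

// Allocate uninitialized memory first, then write the whole struct through a raw pointer.
pub fn allocate_ptr_write(payload: Payload) -> Box<MyStruct<Payload>> {
    let mut b = Box::new(MaybeUninit::<MyStruct<Payload>>::uninit());
    unsafe {
        b.as_mut_ptr().write(MyStruct { header: 0, payload });
        // MaybeUninit<T> has the same layout as T, and the value is now fully initialized.
        Box::from_raw(Box::into_raw(b) as *mut MyStruct<Payload>)
    }
}

// Allocate uninitialized memory first, then write each field separately,
// so no temporary MyStruct value is ever constructed.
pub fn allocate_separate_write(payload: Payload) -> Box<MyStruct<Payload>> {
    let mut b = Box::new(MaybeUninit::<MyStruct<Payload>>::uninit());
    unsafe {
        let p = b.as_mut_ptr();
        ptr::addr_of_mut!((*p).header).write(0);
        ptr::addr_of_mut!((*p).payload).write(payload);
        // Every field has been written, so the value is fully initialized.
        Box::from_raw(Box::into_raw(b) as *mut MyStruct<Payload>)
    }
}
```

All three should be observably equivalent; the only difference is how many intermediate copies of the 1000-byte payload the optimizer has to eliminate.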
If I replace pub type Payload = RefCell<[u8; 1000]>; with just [u8; 1000], all three functions seem to generate good code with a single memcpy call: https://godbolt.org/z/8K7s8vbPa
Here's the weirdest part: with the original Payload, if I manually inline the MyStruct type parameter and turn MyStruct into a simple non-generic struct, it still generates bad code on wasm, but on x86_64 all functions are correctly optimized into a single memcpy: https://godbolt.org/z/jsdrbGExW
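To make "manually inline the type parameter" concrete, the two struct definitions being compared might look roughly like the sketch below. The names `MyStructGeneric` and `MyStructInlined` and the `header` field are hypothetical, chosen only so both variants can sit side by side; the actual definitions are in the Godbolt links.

```rust
use std::cell::RefCell;

pub type Payload = RefCell<[u8; 1000]>;

// Generic variant (hypothetical name): the payload type is a parameter,
// instantiated as MyStructGeneric<Payload> at the use site.
pub struct MyStructGeneric<T> {
    pub header: u32, // assumed field, for illustration only
    pub payload: T,
}

// Non-generic variant (hypothetical name): the same fields,
// with Payload substituted for T by hand.
pub struct MyStructInlined {
    pub header: u32, // assumed field, for illustration only
    pub payload: Payload,
}

// The same allocation written against each variant.
pub fn allocate_generic(payload: Payload) -> Box<MyStructGeneric<Payload>> {
    Box::new(MyStructGeneric { header: 0, payload })
}

pub fn allocate_inlined(payload: Payload) -> Box<MyStructInlined> {
    Box::new(MyStructInlined { header: 0, payload })
}
```

After monomorphization the two types hold the same fields, which is why the difference in generated code is surprising.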
Why does a seemingly equivalent generic struct appear to generate worse code than a non-generic one?