In #13382 I'm applying an optimization where array.fill for i8-element arrays is lowered to a memset on the host. This is relatively easy to do because memory.fill already has the host-side infrastructure for this and array.fill just reuses it. The intended benefit is that array.fill gets to use the host's vectorized routines instead of a per-byte loop within CLIF. In theory this benefit also applies to elements of other sizes (all the way up to 128 bits). Implementing that, however, would require new libcalls on the host, for example memory.fill{16,32,64,128}.
This is doable without too much effort, but it was left out of #13382 because it's not clear whether it's worth it. It'd likely be useful to investigate peer compilers to see what they do for array.fill, or similar operations, on element types wider than 8 bits.
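To make the idea concrete, here is a rough sketch of what host-side helpers for wider elements could look like. Everything in it is an assumption for illustration: the names fill_pattern and fill16/fill32/fill64/fill128 are hypothetical, they are not Wasmtime's actual libcall signatures, and the sketch assumes elements are stored little-endian in the destination buffer. It also shows one cheap trick: when every byte of the element is identical, the fill degenerates to the existing memset path.

```rust
/// Hypothetical helper: fill `dst` with repeated copies of an N-byte pattern.
/// Illustrative only; this is not an existing Wasmtime libcall.
fn fill_pattern<const N: usize>(dst: &mut [u8], pattern: [u8; N]) {
    assert!(N > 0, "pattern must be non-empty");
    assert!(dst.len() % N == 0, "destination must hold whole elements");

    // If all pattern bytes are equal (e.g. 0 or 0xFFFF), a plain byte fill
    // suffices, which the standard library can lower to memset.
    if pattern.iter().all(|b| *b == pattern[0]) {
        dst.fill(pattern[0]);
        return;
    }

    // Otherwise copy the pattern element by element. The fixed-size copy
    // gives the host compiler a chance to vectorize the loop.
    for chunk in dst.chunks_exact_mut(N) {
        chunk.copy_from_slice(&pattern);
    }
}

// Hypothetical per-width entry points, mirroring memory.fill{16,32,64,128}.
// Little-endian element layout is an assumption of this sketch.
fn fill16(dst: &mut [u8], value: u16) {
    fill_pattern(dst, value.to_le_bytes());
}

fn fill32(dst: &mut [u8], value: u32) {
    fill_pattern(dst, value.to_le_bytes());
}

fn fill64(dst: &mut [u8], value: u64) {
    fill_pattern(dst, value.to_le_bytes());
}

fn fill128(dst: &mut [u8], value: u128) {
    fill_pattern(dst, value.to_le_bytes());
}
```

Whether a loop like this actually beats the per-element loop emitted in CLIF is exactly the open question here, so it would need benchmarking before committing to new libcalls.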