Description
The decompressor rewrite in #498 was also aimed at enabling assembly rewrites of the important parts. Splitting those parts into separate functions was meant to make them easy to optimize in assembly.
The major one is the sequence execution: https://github.com/klauspost/compress/blob/master/zstd/seqdec.go#L244
A bit of a harder task is the sequence decoding: https://github.com/klauspost/compress/blob/master/zstd/seqdec.go#L102
Right now these two functions split execution time approximately 50/50, so both would need to get faster to make a significant difference.
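For anyone not familiar with what sequence execution does, here is a rough pure-Go sketch of the idea. It is not the code in `seqdec.go`: the `seq` type, `executeSeqs`, and `errCorrupt` are hypothetical names made up for illustration, and it ignores history buffers, dictionaries, and window limits. Each sequence copies some literals, then copies a match from earlier in the output, and the match may overlap the bytes being written.

```go
package main

import "errors"

// seq is a hypothetical, simplified sequence (not the library's type):
// copy litLen literal bytes, then copy matchLen bytes starting
// offset bytes back from the current end of the output.
type seq struct {
	litLen, matchLen, offset int
}

var errCorrupt = errors.New("corrupt sequence")

// executeSeqs appends the result of applying seqs to dst, taking
// literals from lits. Literals left over after the last sequence
// are appended at the end.
func executeSeqs(dst, lits []byte, seqs []seq) ([]byte, error) {
	for _, s := range seqs {
		if s.litLen < 0 || s.matchLen < 0 || s.litLen > len(lits) {
			return nil, errCorrupt
		}
		// 1. Copy the literals.
		dst = append(dst, lits[:s.litLen]...)
		lits = lits[s.litLen:]
		if s.matchLen == 0 {
			continue
		}
		if s.offset <= 0 || s.offset > len(dst) {
			return nil, errCorrupt
		}
		start := len(dst) - s.offset
		if s.offset >= s.matchLen {
			// 2a. Source and destination do not overlap: bulk copy.
			dst = append(dst, dst[start:start+s.matchLen]...)
			continue
		}
		// 2b. Overlapping match: copy byte by byte so that bytes
		// written earlier in this match can be read again.
		for i := 0; i < s.matchLen; i++ {
			dst = append(dst, dst[start+i])
		}
	}
	return append(dst, lits...), nil
}
```

The two copy steps are where the time goes in this loop, so they are the obvious targets for wide, unrolled copies in assembly.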
The "combined" version of this will impact single-threaded decoding: https://github.com/klauspost/compress/blob/master/zstd/seqdec.go#L341
My hope was that writing each of the above as modular avo code would allow me to combine them into the sync decoder as well. The sync decoder is basically the two functions above fused together, without any intermediate state between them.
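To illustrate what "without any intermediate state" means, here is a minimal sketch of a fused loop. Everything in it is an assumption for illustration: `decodeSync`, the `seq` type, and the `next` callback are hypothetical, and `next` merely stands in for the real FSE/bitstream sequence decoding. The point is only that each sequence is applied as soon as it is decoded, so no slice of decoded sequences is ever built.

```go
package main

import "errors"

// seq is a hypothetical, simplified sequence (not the library's type).
type seq struct {
	litLen, matchLen, offset int
}

var errCorrupt = errors.New("corrupt sequence")

// decodeSync pulls one sequence at a time from next and executes it
// immediately. next is a stand-in for the real sequence decoder and
// returns false when there are no more sequences.
func decodeSync(dst, lits []byte, next func() (seq, bool)) ([]byte, error) {
	for {
		s, ok := next()
		if !ok {
			// No more sequences: flush the remaining literals.
			return append(dst, lits...), nil
		}
		if s.litLen < 0 || s.matchLen < 0 || s.litLen > len(lits) {
			return nil, errCorrupt
		}
		dst = append(dst, lits[:s.litLen]...)
		lits = lits[s.litLen:]
		if s.matchLen == 0 {
			continue
		}
		if s.offset <= 0 || s.offset > len(dst) {
			return nil, errCorrupt
		}
		// Byte-by-byte copy handles both overlapping and
		// non-overlapping matches.
		start := len(dst) - s.offset
		for i := 0; i < s.matchLen; i++ {
			dst = append(dst, dst[start+i])
		}
	}
}
```

In a two-pass design the `next` step would instead fill a `[]seq` that a separate execution function walks afterwards; fusing the two trades that buffer for a larger loop body.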