Description
This issue exists purely to track pending items for the zvqdotq implementation for RISCV. Its primary purpose is to serve as a reminder for me, as I suspect I'm going to need to context switch away shortly.
Current status: Most of the basic cases should work for both SLP and LV. LV can't currently generate vqdotsu.vv/vx; SLP can. LV lowering goes through generic DAG, SLP is RISCV custom.
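For readers less familiar with the instructions involved, here is a rough Python model of what a single 32-bit lane of a vqdot-style instruction computes: four 8-bit products summed into a 32-bit accumulator. The helper names (`to_signed8`, `wrap32`, `vqdot_lane`) and the `a_signed`/`b_signed` flags are made up for illustration, and which operand is treated as signed in the mixed `su` form should be checked against the Zvqdotq spec rather than taken from this sketch.

```python
def to_signed8(byte):
    """Reinterpret an unsigned byte (0..255) as a signed 8-bit value."""
    return byte - 256 if byte >= 128 else byte


def wrap32(value):
    """Wrap an arbitrary Python int to a signed 32-bit value."""
    return ((value + 2**31) % 2**32) - 2**31


def vqdot_lane(acc, a_bytes, b_bytes, a_signed, b_signed):
    """Model one 32-bit lane: acc + sum of four 8-bit products.

    a_signed / b_signed select how each operand's bytes are interpreted.
    (True, True) models the signed form, (False, False) the unsigned form,
    and a mixed pair models the signed-by-unsigned form.
    """
    assert len(a_bytes) == 4 and len(b_bytes) == 4
    total = acc
    for a, b in zip(a_bytes, b_bytes):
        a_val = to_signed8(a) if a_signed else a
        b_val = to_signed8(b) if b_signed else b
        total += a_val * b_val
    return wrap32(total)


if __name__ == "__main__":
    a = [0xFF, 1, 2, 3]
    ones = [1, 1, 1, 1]
    print(vqdot_lane(0, a, ones, True, True))    # 0xFF read as -1
    print(vqdot_lane(0, a, ones, False, False))  # 0xFF read as 255
```

The mixed-signedness case is the one that matters for the status above: it only differs from the other two forms in how each operand's bytes are extended before the multiply.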
Codegen
- Support vqdotsu via new SDAG node
- Restructure reduce rooted code to use the generic nodes
- Add partial.reduce variant of the add traversal? Or maybe do in VectorCombine instead?
- Consider a vredsum partial reduce -- is this ever better than inloop reduction?
Loop Vectorizer Support
- Basic TTI support in place, generates both scalable and fixed vectors
- Fix the register weight computation (filed separately as [LV] Maximum VF does not consider scaled reductions #141768)
- Track the work being done for reduce (zext x), and enable for RISCV once complete - see [LV] Add support for partial reductions without a binary op #133922
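The reduce (zext x) item above is a partial reduction with no multiply. One way to see why a dot-product instruction can still be relevant is that summing extended bytes is the same as a dot product against a vector of ones. The sketch below only illustrates that arithmetic identity; the function names are hypothetical, the grouping of four adjacent bytes per accumulator lane is an assumption made for the example, and it says nothing about how the vectorizer will actually represent the pattern.

```python
def dot4_unsigned(acc, a_bytes, b_bytes):
    """acc plus the dot product of two groups of four unsigned bytes."""
    assert len(a_bytes) == 4 and len(b_bytes) == 4
    return acc + sum(a * b for a, b in zip(a_bytes, b_bytes))


def partial_reduce_zext(acc_lanes, bytes_in):
    """Fold zero-extended bytes into accumulator lanes, four bytes per lane.

    Implemented as a dot product against [1, 1, 1, 1], which leaves each
    byte unchanged and therefore just sums the group.
    """
    assert len(bytes_in) == 4 * len(acc_lanes)
    ones = [1, 1, 1, 1]
    return [
        dot4_unsigned(acc, bytes_in[4 * i:4 * i + 4], ones)
        for i, acc in enumerate(acc_lanes)
    ]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8]
    lanes = partial_reduce_zext([0, 0], data)
    # The lanes together hold the same total as a full reduction of the input.
    print(lanes, sum(lanes) == sum(data))
```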
SLP Vectorizer
- Identify any work on partial.reduce (optional given llvm.reduce rooted SDAG)
Cleanup/Rework
- Consider migrating the reduce rooted version to VectorCombine
- Plumb costkind through the TTI hook
IR Optimizations
- add X, partial_reduce zero, Y -> partial_reduce X, Y
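The fold above can be sanity-checked with a small Python model of a partial reduction. This is a sketch under assumptions: `partial_reduce_add` is a stand-in written for this example, and the lane assignment it uses (input element `i` goes to lane `i % n`) is just one possible grouping. The fold holds as long as both sides use the same grouping, since addition is lane-wise.

```python
def partial_reduce_add(acc, vec):
    """Add a wide vector into a narrower accumulator.

    Input element i is added to accumulator lane i % len(acc). This is one
    illustrative grouping, chosen for the example.
    """
    n = len(acc)
    assert len(vec) % n == 0
    out = list(acc)
    for i, v in enumerate(vec):
        out[i % n] += v
    return out


def vec_add(a, b):
    """Lane-wise vector add."""
    assert len(a) == len(b)
    return [x + y for x, y in zip(a, b)]


if __name__ == "__main__":
    X = [10, 20]
    Y = [1, 2, 3, 4]
    zero = [0, 0]
    before = vec_add(X, partial_reduce_add(zero, Y))  # add X, partial_reduce zero, Y
    after = partial_reduce_add(X, Y)                  # partial_reduce X, Y
    print(before == after)
```

Starting from a zero accumulator and adding X afterwards is the same as starting from X, which is why the separate add can be absorbed into the accumulator operand.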