Add Infrastructure for SHGEMV #5485
This adds the infrastructure needed for a `shgemv` path, as well as a future `hgemm`/`hgemv` path, following the same model as the `sb` and `b` interfaces. I've also fixed a few issues around `shgemm`, which didn't build in some situations.
|
Thanks (pity about the duplicate work though) |
Yeah, sorry about that, I was thinking about something else and got a bit carried away seeing what was missing here 🙀 |
|
@martin-frbg I will leave |
|
Getting these compiler warnings.... |
|
Should probably be done like this https://stackoverflow.com/questions/42074035/how-to-deal-with-clangs-3-9-wexpansion-to-defined-warning |
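
For readers who don't follow the link, here is a minimal sketch of the pattern it describes. The warning (`-Wexpansion-to-defined`) fires when `defined` is produced by macro expansion inside an `#if`, which the C standard leaves undefined. The macro name `HAVE_HALF_KERNELS` and the helper `half_kernels_enabled` are hypothetical and chosen only for illustration; the offending macros in this PR are not shown in the thread.

```c
/* Problematic form (shown only as a comment): `defined` is hidden inside a
 * macro and later expanded in an #if, which triggers -Wexpansion-to-defined.
 *
 *   #define HAVE_HALF_KERNELS (defined(BUILD_HFLOAT16) || defined(BUILD_BFLOAT16))
 *   #if HAVE_HALF_KERNELS
 *   ...
 *   #endif
 */

/* Portable form: evaluate `defined` directly in the #if and store a plain
 * 0/1 value. HAVE_HALF_KERNELS is a hypothetical name, not one from the PR. */
#if defined(BUILD_HFLOAT16) || defined(BUILD_BFLOAT16)
#define HAVE_HALF_KERNELS 1
#else
#define HAVE_HALF_KERNELS 0
#endif

/* Hypothetical helper showing that the macro is now safe to use in #if. */
static int half_kernels_enabled(void)
{
#if HAVE_HALF_KERNELS
    return 1;
#else
    return 0;
#endif
}
```

Because the macro now expands to a literal `0` or `1`, it can be used in later `#if` lines on any compiler without relying on how `defined` behaves after expansion.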
|
I see a couple of places in test/Makefile where BUILD_BFLOAT16 has been added but I don't see the same for BUILD_HFLOAT16. It looks like we have support for SBGEMM but not SHGEMM? |
|
Your conversion of the outputs for SBGEMM/V seems wrong since you are casting from F16 to BF32 with TO_F32 |
Can you point me to the line @ChipKerchner ? The block is: |
|
He added review notes to the code changes, but weirdly I can only see them if I click on the corresponding notification in the gh web interface. As far as I can tell, these are indeed cases where the macro does not perform any actual conversion. |
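
The thread does not show how `TO_F32` is defined, so the following is only a general illustration of why a macro that performs no real conversion matters. It assumes bfloat16 values are stored as raw 16-bit patterns in a `uint16_t`; the type and function names (`bf16_bits`, `bf16_value_cast`, `bf16_to_f32`) are hypothetical and are not OpenBLAS identifiers.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical storage type: a bfloat16 held as its raw 16-bit pattern. */
typedef uint16_t bf16_bits;

/* A plain value cast treats the bit pattern as an integer, so the bfloat16
 * encoding of 1.0 (0x3F80) comes out as 16256.0f instead of 1.0f. */
static float bf16_value_cast(bf16_bits x)
{
    return (float)x;
}

/* bfloat16 is the upper 16 bits of an IEEE-754 binary32 value, so a real
 * conversion places the bits in the high half of a 32-bit word and then
 * reinterprets that word as a float. */
static float bf16_to_f32(bf16_bits x)
{
    uint32_t bits = (uint32_t)x << 16;
    float f;
    memcpy(&f, &bits, sizeof f); /* bit reinterpretation without aliasing issues */
    return f;
}
```

A conversion macro applied to the wrong type can therefore pass silently whenever it degenerates to a cast or a no-op, which matches the observation that the tests do not flag it.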
|
I see the TO_F32 conversions here: Bad SBGEMV conversion here (a 2nd one in gemv_t): Bad SBGEMM conversion here: |
|
Argh, guessing we need to add a |
|
Interestingly, this does not lead to failures in our tests - at least not on (emulated) RISCV, but AFAICT the code changes in question originate from your earlier "fix bfloat conversion for Neoverse" PR |
I don't think so, that didn't touch the generic kernels: https://github.com/OpenMathLib/OpenBLAS/pull/5483/files I'd imagine only the RISC-V CI is running the generic kernels, as the targets I was working on both have their own variants.