-
Notifications
You must be signed in to change notification settings - Fork 15.1k
Description
The ReplaceWithVeclib (replace-with-veclib) pass constructs the scalar overload name used to look up the veclib call by passing all the input argument types (converted to scalar) to Intrinsic::getName.
| ? Intrinsic::getName(IID, ScalarArgTypes, II->getModule()) |
This is incorrect. Instead, it should be using isVectorIntrinsicWithOverloadTypeAtArg to determine whether an argument type contributes to the overload, including the return argument (-1).
This mostly went unnoticed because most intrinsics supported so far only had one argument, and that argument type matched the return type. For other intrinsics, such as pow or atan2 with two arguments, this would construct a name like llvm.pow.f64.f64. This won't match the name in the table defined in VecFuncs.def, causing it to fail replacement silently.
There is a note in llvm/test/CodeGen/AArch64/replace-with-veclib-armpl.ll noting the bug above the llvm.pow checks, and the checks were written to match the input (no translation). I didn't see the note until I debugged the pass after running into the bug while trying to implement this for atan2.
llvm-project/llvm/test/CodeGen/AArch64/replace-with-veclib-armpl.ll
Lines 337 to 340 in e9b7a09
| ; There is a bug in the replace-with-veclib pass, and for intrinsics which take | |
| ; more than one arguments, but has just one overloaded type, it incorrectly | |
| ; reconstructs the scalar name, for pow specifically it is searching for: | |
| ; llvm.pow.f64.f64 and llvm.pow.f32.f32 |