Description
Our AFLplusplus fuzzing shows that the AArch64 backend fails to compile the following code with llc -mtriple=aarch64 -global-isel add.ll
It reports: unable to legalize instruction: %10:_(<32 x s16>) = G_ADD %0:_, %1:_ (in function: add_32xi16)
Godbolt: https://godbolt.org/z/7rz7WdbEx
define <32 x i16> @add_32xi16(<32 x i16> %0, <32 x i16> %1) {
  %3 = add <32 x i16> %0, %1
  ret <32 x i16> %3
}
define <64 x i8> @add_64xi8(<64 x i8> %0, <64 x i8> %1) {
  %3 = add <64 x i8> %0, %1
  ret <64 x i8> %3
}
Further investigation points to AArch64LegalizerInfo.cpp:116:
getActionDefinitionsBuilder({G_ADD, G_SUB, G_MUL, G_AND, G_OR, G_XOR})
    .legalFor({s32, s64, v2s32, v4s32, v4s16, v8s16, v16s8, v8s8})
    .scalarizeIf(
        [=](const LegalityQuery &Query) {
          return Query.Opcode == G_MUL && Query.Types[0] == v2s64;
        },
        0)
    .legalFor({v2s64})
    .widenScalarToNextPow2(0)
    .clampScalar(0, s32, s64)
    .clampNumElements(0, v2s32, v4s32)
    .clampNumElements(0, v2s64, v2s64)
    .moreElementsToNextPow2(0);
Many vector types, such as <32 x s16> and <64 x s8>, are not covered by these rules for any of the six operations.
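To illustrate why we think no rule fires for these types, here is a small standalone sketch that mimics the rule chain above. It is a toy model and not LLVM code: `Ty` and `firstMatchingRule` are hypothetical names, and the predicates reflect our reading of LegalizerInfo.h (clampScalar and widenScalarToNextPow2 match scalars only, clampNumElements matches only vectors with the given element type), which may be incomplete.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for llvm::LLT. NumElts == 1 models a scalar.
struct Ty {
  unsigned NumElts;
  unsigned EltBits;
  bool isVector() const { return NumElts > 1; }
  bool isScalar() const { return NumElts == 1; }
};

static bool isPow2(unsigned V) { return V != 0 && (V & (V - 1)) == 0; }

// Walks the rules in the order they are written in the snippet above and
// returns the name of the first one that matches, or "unsupported" if none
// does (assumed to correspond to "unable to legalize instruction").
std::string firstMatchingRule(Ty T, bool IsMul = false) {
  assert(T.NumElts >= 1 && T.EltBits >= 1 && "malformed toy type");
  auto Is = [&](unsigned N, unsigned B) {
    return T.NumElts == N && T.EltBits == B;
  };
  // .legalFor({s32, s64, v2s32, v4s32, v4s16, v8s16, v16s8, v8s8})
  if (Is(1, 32) || Is(1, 64) || Is(2, 32) || Is(4, 32) || Is(4, 16) ||
      Is(8, 16) || Is(16, 8) || Is(8, 8))
    return "legal";
  // .scalarizeIf(G_MUL on v2s64)
  if (IsMul && Is(2, 64))
    return "scalarize";
  // .legalFor({v2s64})
  if (Is(2, 64))
    return "legal";
  // .widenScalarToNextPow2(0): assumed to match scalars only
  if (T.isScalar() && !isPow2(T.EltBits))
    return "widenScalarToNextPow2";
  // .clampScalar(0, s32, s64): assumed to match scalars only
  if (T.isScalar() && (T.EltBits < 32 || T.EltBits > 64))
    return "clampScalar";
  // .clampNumElements(0, v2s32, v4s32): only vectors of s32 elements
  if (T.isVector() && T.EltBits == 32 && (T.NumElts < 2 || T.NumElts > 4))
    return "clampNumElements(s32)";
  // .clampNumElements(0, v2s64, v2s64): only vectors of s64 elements
  if (T.isVector() && T.EltBits == 64 && T.NumElts != 2)
    return "clampNumElements(s64)";
  // .moreElementsToNextPow2(0): vectors with a non-power-of-2 element count
  if (T.isVector() && !isPow2(T.NumElts))
    return "moreElementsToNextPow2";
  return "unsupported";
}
```

Under these assumptions, `firstMatchingRule({32, 16})` and `firstMatchingRule({64, 8})` both fall through every rule: the element type is neither s32 nor s64, so neither clampNumElements applies, and 32 and 64 are already powers of two. A type like `{8, 32}` is still handled, because clampNumElements(0, v2s32, v4s32) matches it.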
It seems to us that we should be using the following diff:
@@ -121,8 +121,8 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
         },
         0)
     .legalFor({v2s64})
-    .widenScalarToNextPow2(0)
-    .clampScalar(0, s32, s64)
+    .widenScalarOrEltToNextPow2(0)
+    .clampScalarOrElt(0, s32, s64)
     .clampNumElements(0, v2s32, v4s32)
     .clampNumElements(0, v2s64, v2s64)
     .moreElementsToNextPow2(0);
However, many tests failed after we made this change. It seems that many other places would also need to switch from clampScalar to clampScalarOrElt to cover vector types.
Is this intended behavior, where certain vector types are deliberately left unsupported, or is it a bug?
If it is intended, could anyone explain the rationale behind this design? If it is a bug, we can make some quick fixes.