Skip to content

[arm64] JIT: Fold "A * B + C" to MADD/MSUB #61037

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Nov 2, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 7 additions & 2 deletions src/coreclr/jit/codegenlinear.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1608,9 +1608,14 @@ void CodeGen::genConsumeRegs(GenTree* tree)
}
#endif // FEATURE_HW_INTRINSICS
#endif // TARGET_XARCH
else if (tree->OperIs(GT_BITCAST))
else if (tree->OperIs(GT_BITCAST, GT_NEG))
{
genConsumeReg(tree->gtGetOp1());
genConsumeRegs(tree->gtGetOp1());
}
else if (tree->OperIs(GT_MUL))
{
genConsumeRegs(tree->gtGetOp1());
genConsumeRegs(tree->gtGetOp2());
}
else
{
Expand Down
38 changes: 38 additions & 0 deletions src/coreclr/jit/emitarm64.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -13597,6 +13597,44 @@ regNumber emitter::emitInsTernary(instruction ins, emitAttr attr, GenTree* dst,
// src2 can only be a reg
assert(!src2->isContained());
}
else if ((src1->OperIs(GT_MUL) && src1->isContained()) || (src2->OperIs(GT_MUL) && src2->isContained()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the src1->gtGetOp1() or src1->gtGetOp2() would be the one that is marked contained in lower and that should be checked here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

madd fot GT_ADD can only be emitted when GT_MUL is contained. So "isContaind" here basically means lower approved this tree to be optimized into madd. GT_MUL can be not contained e.g. if it has gtOverflow set

{
assert(ins == INS_add);

GenTree* mul;
GenTree* c;
if (src1->OperIs(GT_MUL))
{
mul = src1;
c = src2;
}
else
{
mul = src2;
c = src1;
}

GenTree* a = mul->gtGetOp1();
GenTree* b = mul->gtGetOp2();

assert(varTypeIsIntegral(mul) && !mul->gtOverflow());

bool msub = false;
if (a->OperIs(GT_NEG) && a->isContained())
{
a = a->gtGetOp1();
msub = true;
}
if (b->OperIs(GT_NEG) && b->isContained())
{
b = b->gtGetOp1();
msub = !msub; // it's either "a * -b" or "-a * -b" which is the same as "a * b"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// it's either "a * -b" or "-a * -b" which is the same as "a * b"

This comment is little misleading...I can see how -a * -b is a * b, but then we will still use msub so probably we should not confuse by saying "same as a * b"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but then we will still use msub

actually for -a * -b this code will emit madd (msub = !msub will return false)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah...I misread this if as else if. Never mind.

}

emitIns_R_R_R_R(msub ? INS_msub : INS_madd, attr, dst->GetRegNum(), a->GetRegNum(), b->GetRegNum(),
c->GetRegNum());
return dst->GetRegNum();
}
else // not floating point
{
// src2 can be immed or reg
Expand Down
41 changes: 41 additions & 0 deletions src/coreclr/jit/lowerarmarch.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1567,6 +1567,47 @@ void Lowering::ContainCheckBinary(GenTreeOp* node)
{
// Check and make op2 contained (if it is a containable immediate)
CheckImmedAndMakeContained(node, node->gtOp2);

#ifdef TARGET_ARM64
// Find "a * b + c" or "c + a * b" in order to emit MADD/MSUB
if (comp->opts.OptimizationEnabled() && varTypeIsIntegral(node) && !node->isContained() && node->OperIs(GT_ADD) &&
!node->gtOverflow() && (node->gtGetOp1()->OperIs(GT_MUL) || node->gtGetOp2()->OperIs(GT_MUL)))
{
GenTree* mul;
GenTree* c;
if (node->gtGetOp1()->OperIs(GT_MUL))
{
mul = node->gtGetOp1();
c = node->gtGetOp2();
}
else
{
mul = node->gtGetOp2();
c = node->gtGetOp1();
}

GenTree* a = mul->gtGetOp1();
GenTree* b = mul->gtGetOp2();

if (!mul->isContained() && !mul->gtOverflow() && !a->isContained() && !b->isContained() && !c->isContained() &&
varTypeIsIntegral(mul))
{
if (a->OperIs(GT_NEG) && !a->gtGetOp1()->isContained() && !a->gtGetOp1()->IsRegOptional())
{
// "-a * b + c" to MSUB
MakeSrcContained(mul, a);
}
if (b->OperIs(GT_NEG) && !b->gtGetOp1()->isContained())
{
// "a * -b + c" to MSUB
MakeSrcContained(mul, b);
}
// If both 'a' and 'b' are GT_NEG - MADD will be emitted.

MakeSrcContained(node, mul);
}
}
#endif
}

//------------------------------------------------------------------------
Expand Down
1 change: 0 additions & 1 deletion src/coreclr/jit/lsraarm64.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -265,7 +265,6 @@ int LinearScan::BuildNode(GenTree* tree)
// everything is made explicit by adding casts.
assert(tree->gtGetOp1()->TypeGet() == tree->gtGetOp2()->TypeGet());
}

FALLTHROUGH;

case GT_AND:
Expand Down
12 changes: 12 additions & 0 deletions src/coreclr/jit/lsrabuild.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3071,6 +3071,18 @@ int LinearScan::BuildOperandUses(GenTree* node, regMaskTP candidates)
return 1;
}
#endif // FEATURE_HW_INTRINSICS
#ifdef TARGET_ARM64
if (node->OperIs(GT_MUL))
{
// Can be contained for MultiplyAdd on arm64
return BuildBinaryUses(node->AsOp(), candidates);
}
if (node->OperIs(GT_NEG))
{
// Can be contained for MultiplyAdd on arm64
return BuildOperandUses(node->gtGetOp1(), candidates);
}
#endif

return 0;
}
Expand Down