Skip to content

[VPlan] Introduce VPInstructionWithType, use instead of VPScalarCast(NFC) #129706

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Apr 10, 2025
6 changes: 3 additions & 3 deletions llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
Original file line number Diff line number Diff line change
Expand Up @@ -246,10 +246,10 @@ class VPBuilder {
new VPDerivedIVRecipe(Kind, FPBinOp, Start, Current, Step, Name));
}

VPScalarCastRecipe *createScalarCast(Instruction::CastOps Opcode, VPValue *Op,
Type *ResultTy, DebugLoc DL) {
VPInstruction *createScalarCast(Instruction::CastOps Opcode, VPValue *Op,
Type *ResultTy, DebugLoc DL) {
return tryInsertInstruction(
new VPScalarCastRecipe(Opcode, Op, ResultTy, DL));
new VPInstructionWithType(Opcode, Op, ResultTy, DL));
}

VPWidenCastRecipe *createWidenCast(Instruction::CastOps Opcode, VPValue *Op,
Expand Down
4 changes: 2 additions & 2 deletions llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4489,7 +4489,6 @@ static bool willGenerateVectors(VPlan &Plan, ElementCount VF,
switch (R.getVPDefID()) {
case VPDef::VPDerivedIVSC:
case VPDef::VPScalarIVStepsSC:
case VPDef::VPScalarCastSC:
case VPDef::VPReplicateSC:
case VPDef::VPInstructionSC:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VPInstructionSC now also holds for VPInstructionWithType recipes, instead of VPScalarCastSC.

case VPDef::VPCanonicalIVPHISC:
Expand Down Expand Up @@ -10424,8 +10423,9 @@ preparePlanForEpilogueVectorLoop(VPlan &Plan, Loop *L,
assert(all_of(IV->users(),
[](const VPUser *U) {
return isa<VPScalarIVStepsRecipe>(U) ||
isa<VPScalarCastRecipe>(U) ||
isa<VPDerivedIVRecipe>(U) ||
Instruction::isCast(
cast<VPInstruction>(U)->getOpcode()) ||
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth wrapping in VPInstruction::isCast()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done thanks!

cast<VPInstruction>(U)->getOpcode() ==
Instruction::Add;
}) &&
Expand Down
98 changes: 49 additions & 49 deletions llvm/lib/Transforms/Vectorize/VPlan.h
Original file line number Diff line number Diff line change
Expand Up @@ -533,7 +533,6 @@ class VPSingleDefRecipe : public VPRecipeBase, public VPValue {
case VPRecipeBase::VPWidenIntOrFpInductionSC:
case VPRecipeBase::VPWidenPointerInductionSC:
case VPRecipeBase::VPReductionPHISC:
case VPRecipeBase::VPScalarCastSC:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VPInstructionSC above now also holds for VPInstructionWithType recipes, instead of VPScalarCastSC.

case VPRecipeBase::VPScalarPHISC:
case VPRecipeBase::VPPartialReductionSC:
return true;
Expand Down Expand Up @@ -1026,6 +1025,55 @@ class VPInstruction : public VPRecipeWithIRFlags,
StringRef getName() const { return Name; }
};

/// A specialization of VPInstruction augmenting it with a dedicated result
/// type, to be used when the opcode and operands of the VPInstruction don't
/// directly determine the result type.
class VPInstructionWithType : public VPInstruction {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A recipe can "have" a single def, and support the interface of VPValue, by inheriting from it.
A (single def) recipe can have IRflags, by being VPRecipeWithFlags.
A (single def with IRFlags) recipe can have an opcode, by being a VPInstruction.
A (single def with IRFlags with opcode) recipe can have a Type, by being a VPInstructionWithType.
Would using multiple inheritance be better to cope with desired combinations of independent "with's", where IRFlags, opcode, Type, are provided as base classes? In particular, allowing recipes to have Type w/o having IRFlags.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the Type base class store the Type or just provide an interface for e.g. getScalarType()? Is this something that should be explored in a separate PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should split off the flags from VPRecipeWithFlags and generally move in this direction.

For the type, this would just add a type field and a ::getScalarType? I'll check that, although for now, the only user would be VPInstructionWithType, so it might be simpler to do this as follow-up, once more users arise?

Copy link
Contributor

@lukel97 lukel97 Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think VPWidenIntrinsic also could use it, although it calls ::getScalarType ::getResultType. But agreed, I think it can be done as a follow up. It would be nice to get this PR in to unblock #129508 and #118638

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if we could do something like d6eb747, but unfortunately I've not yet been able to figure out how to make this not actually crash.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I managed to sort this out now using a specialization of CastInfo, which allows isa/dyn_cast<VPWithResultType> to work, effectively turning it into a mixin/trait.

A sketch is here 5c54367, might be worth doing separately>

/// Scalar result type produced by the recipe.
Type *ResultTy;

Value *generate(VPTransformState &State);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document. May be a bit confusing being non-virtual nor override, worth a comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I folded it into execute directly, thanks


public:
VPInstructionWithType(unsigned Opcode, ArrayRef<VPValue *> Operands,
Type *ResultTy, DebugLoc DL, const Twine &Name = "")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth documenting somewhere that VPInstructionWithType does not have its own VPDef::VPInstructionWithTypeSC identifying it, as do all other recipes, but it uses VPInstructionSC instead. VPInstructions should be identified according to their opcode, whether they are actually of class VPInstruction itself or a subclass thereof.
I.e., there is no VP_CLASSOF_IMPL(VPDef::VPInstructionWithTypeSC) here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done thanks

: VPInstruction(Opcode, Operands, DL, Name), ResultTy(ResultTy) {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about FMF for, say, FPExt, or any other IRFlags supported by VPInstruction casts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, non of the cast take/use them, can be added as follow-up?


static inline bool classof(const VPRecipeBase *R) {
auto *VPI = dyn_cast<VPInstruction>(R);
return VPI && Instruction::isCast(VPI->getOpcode());
}

static inline bool classof(const VPUser *R) {
return isa<VPInstructionWithType>(cast<VPRecipeBase>(R));
}

VPInstruction *clone() override {
auto *New =
new VPInstructionWithType(getOpcode(), {getOperand(0)}, getResultType(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assumed to have a single operand?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now yes, but removed restriction, thanks!

getDebugLoc(), getName());
New->setUnderlyingValue(getUnderlyingValue());
return New;
}

void execute(VPTransformState &State) override;

/// Return the cost of this VPIRInstruction.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Return the cost of this VPIRInstruction.
/// Return the cost of this VPInstruction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed thanks

InstructionCost computeCost(ElementCount VF,
VPCostContext &Ctx) const override {
// TODO: Compute accurate cost after retiring the legacy cost model.
return 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return actual cost here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now this only represents scalar casts, which are not corresponding to any input IR, hence are not yet costed (Added a TODO). It will be fleshed out in #129712.

}

Type *getResultType() const { return ResultTy; }

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
/// Print the recipe.
void print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const override;
#endif
};

/// A recipe to wrap on original IR instruction not to be modified during
/// execution, execept for PHIs. For PHIs, a single VPValue operand is allowed,
/// and it is used to add a new incoming value for the single predecessor VPBB.
Expand Down Expand Up @@ -1183,54 +1231,6 @@ class VPWidenCastRecipe : public VPRecipeWithIRFlags {
Type *getResultType() const { return ResultTy; }
};

/// VPScalarCastRecipe is a recipe to create scalar cast instructions.
class VPScalarCastRecipe : public VPSingleDefRecipe {
Instruction::CastOps Opcode;

Type *ResultTy;

Value *generate(VPTransformState &State);

public:
VPScalarCastRecipe(Instruction::CastOps Opcode, VPValue *Op, Type *ResultTy,
DebugLoc DL)
: VPSingleDefRecipe(VPDef::VPScalarCastSC, {Op}, DL), Opcode(Opcode),
ResultTy(ResultTy) {}

~VPScalarCastRecipe() override = default;

VPScalarCastRecipe *clone() override {
return new VPScalarCastRecipe(Opcode, getOperand(0), ResultTy,
getDebugLoc());
}

VP_CLASSOF_IMPL(VPDef::VPScalarCastSC)

void execute(VPTransformState &State) override;

/// Return the cost of this VPScalarCastRecipe.
InstructionCost computeCost(ElementCount VF,
VPCostContext &Ctx) const override {
// TODO: Compute accurate cost after retiring the legacy cost model.
return 0;
}

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
void print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const override;
#endif

/// Returns the result type of the cast.
Type *getResultType() const { return ResultTy; }

bool onlyFirstLaneUsed(const VPValue *Op) const override {
// At the moment, only uniform codegen is implemented.
assert(is_contained(operands(), Op) &&
"Op must be an operand of the recipe");
return true;
}
};

/// A recipe for widening vector intrinsics.
class VPWidenIntrinsicRecipe : public VPRecipeWithIRFlags {
/// ID of the vector intrinsic to widen.
Expand Down
7 changes: 2 additions & 5 deletions llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -252,20 +252,17 @@ Type *VPTypeAnalysis::inferScalarType(const VPValue *V) {
VPPartialReductionRecipe>([this](const VPRecipeBase *R) {
return inferScalarType(R->getOperand(0));
})
.Case<VPInstructionWithType, VPWidenIntrinsicRecipe>(
[](const auto *R) { return R->getResultType(); })
Comment on lines +265 to +266
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth noting that this case should appear before the next to catch VPInstructionWithType before its parent VPInstruction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

.Case<VPBlendRecipe, VPInstruction, VPWidenRecipe, VPReplicateRecipe,
VPWidenCallRecipe, VPWidenMemoryRecipe, VPWidenSelectRecipe>(
[this](const auto *R) { return inferScalarTypeForRecipe(R); })
.Case<VPWidenIntrinsicRecipe>([](const VPWidenIntrinsicRecipe *R) {
return R->getResultType();
})
.Case<VPInterleaveRecipe>([V](const VPInterleaveRecipe *R) {
// TODO: Use info from interleave group.
return V->getUnderlyingValue()->getType();
})
.Case<VPWidenCastRecipe>(
[](const VPWidenCastRecipe *R) { return R->getResultType(); })
.Case<VPScalarCastRecipe>(
[](const VPScalarCastRecipe *R) { return R->getResultType(); })
.Case<VPExpandSCEVRecipe>([](const VPExpandSCEVRecipe *R) {
return R->getSCEV()->getType();
})
Expand Down
72 changes: 36 additions & 36 deletions llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -147,7 +147,6 @@ bool VPRecipeBase::mayHaveSideEffects() const {
switch (getVPDefID()) {
case VPDerivedIVSC:
case VPPredInstPHISC:
case VPScalarCastSC:
case VPReverseVectorPointerSC:
return false;
case VPInstructionSC:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which now takes care of VPInstructionWithType recipes as well, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep

Expand Down Expand Up @@ -412,7 +411,7 @@ bool VPInstruction::doesGeneratePerAllLanes() const {
}

bool VPInstruction::canGenerateScalarForFirstLane() const {
if (Instruction::isBinaryOp(getOpcode()))
if (Instruction::isBinaryOp(getOpcode()) || Instruction::isCast(getOpcode()))
return true;
if (isSingleScalar() || isVectorToScalar())
return true;
Expand Down Expand Up @@ -821,7 +820,7 @@ void VPInstruction::execute(VPTransformState &State) {
}

bool VPInstruction::opcodeMayReadOrWriteFromMemory() const {
if (Instruction::isBinaryOp(getOpcode()))
if (Instruction::isBinaryOp(getOpcode()) || Instruction::isCast(getOpcode()))
return false;
switch (getOpcode()) {
case Instruction::ICmp:
Expand All @@ -843,7 +842,7 @@ bool VPInstruction::opcodeMayReadOrWriteFromMemory() const {

bool VPInstruction::onlyFirstLaneUsed(const VPValue *Op) const {
assert(is_contained(operands(), Op) && "Op must be an operand of the recipe");
if (Instruction::isBinaryOp(getOpcode()))
if (Instruction::isBinaryOp(getOpcode()) || Instruction::isCast(getOpcode()))
return vputils::onlyFirstLaneUsed(this);

switch (getOpcode()) {
Expand Down Expand Up @@ -972,6 +971,39 @@ void VPInstruction::print(raw_ostream &O, const Twine &Indent,
}
#endif

Value *VPInstructionWithType::generate(VPTransformState &State) {
State.setDebugLocFrom(getDebugLoc());
assert(vputils::onlyFirstLaneUsed(this) &&
"Codegen only implemented for first lane.");
switch (getOpcode()) {
case Instruction::SExt:
case Instruction::ZExt:
case Instruction::Trunc: {
// Note: SExt/ZExt not used yet.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then leave them as default until they are, to avoid dead code? Admittedly unrelated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Limited to Trunc and ZExt, which both are used, thanks

Value *Op = State.get(getOperand(0), VPLane(0));
return State.Builder.CreateCast(Instruction::CastOps(getOpcode()), Op,
ResultTy);
}
default:
llvm_unreachable("opcode not implemented yet");
}
}

void VPInstructionWithType::execute(VPTransformState &State) {
State.set(this, generate(State), VPLane(0));
}

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
void VPInstructionWithType::print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const {
O << Indent << "EMIT ";
printAsOperand(O, SlotTracker);
O << " = " << Instruction::getOpcodeName(getOpcode()) << " ";
printOperands(O, SlotTracker);
O << " to " << *ResultTy;
}
#endif

void VPIRInstruction::execute(VPTransformState &State) {
assert((isa<PHINode>(&I) || getNumOperands() == 0) &&
"Only PHINodes can have extra operands");
Expand Down Expand Up @@ -2445,38 +2477,6 @@ void VPReplicateRecipe::print(raw_ostream &O, const Twine &Indent,
}
#endif

Value *VPScalarCastRecipe ::generate(VPTransformState &State) {
State.setDebugLocFrom(getDebugLoc());
assert(vputils::onlyFirstLaneUsed(this) &&
"Codegen only implemented for first lane.");
switch (Opcode) {
case Instruction::SExt:
case Instruction::ZExt:
case Instruction::Trunc: {
// Note: SExt/ZExt not used yet.
Value *Op = State.get(getOperand(0), VPLane(0));
return State.Builder.CreateCast(Instruction::CastOps(Opcode), Op, ResultTy);
}
default:
llvm_unreachable("opcode not implemented yet");
}
}

void VPScalarCastRecipe ::execute(VPTransformState &State) {
State.set(this, generate(State), VPLane(0));
}

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
void VPScalarCastRecipe ::print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const {
O << Indent << "SCALAR-CAST ";
printAsOperand(O, SlotTracker);
O << " = " << Instruction::getOpcodeName(Opcode) << " ";
printOperands(O, SlotTracker);
O << " to " << *ResultTy;
}
#endif

void VPBranchOnMaskRecipe::execute(VPTransformState &State) {
State.setDebugLocFrom(getDebugLoc());
assert(State.Lane && "Branch on Mask works only on single instance.");
Expand Down
7 changes: 6 additions & 1 deletion llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -113,7 +113,12 @@ bool vputils::isUniformAcrossVFsAndUFs(VPValue *V) {
all_of(R->operands(),
[](VPValue *Op) { return isUniformAcrossVFsAndUFs(Op); });
})
.Case<VPScalarCastRecipe, VPWidenCastRecipe>([](const auto *R) {
.Case<VPInstruction>([](const auto *VPI) {
return Instruction::isCast(VPI->getOpcode())
? all_of(VPI->operands(), isUniformAcrossVFsAndUFs)
: false;
})
.Case<VPWidenCastRecipe>([](const auto *R) {
// A cast is uniform according to its operand.
return isUniformAcrossVFsAndUFs(R->getOperand(0));
})
Expand Down
1 change: 0 additions & 1 deletion llvm/lib/Transforms/Vectorize/VPlanValue.h
Original file line number Diff line number Diff line change
Expand Up @@ -332,7 +332,6 @@ class VPDef {
VPReductionSC,
VPPartialReductionSC,
VPReplicateSC,
VPScalarCastSC,
VPScalarIVStepsSC,
VPVectorPointerSC,
VPReverseVectorPointerSC,
Expand Down
4 changes: 2 additions & 2 deletions llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -146,8 +146,8 @@ bool VPlanVerifier::verifyEVLRecipe(const VPInstruction &EVL) const {
.Case<VPWidenLoadEVLRecipe, VPReverseVectorPointerRecipe,
VPScalarPHIRecipe>(
[&](const VPRecipeBase *R) { return VerifyEVLUse(*R, 1); })
.Case<VPScalarCastRecipe>(
[&](const VPScalarCastRecipe *S) { return VerifyEVLUse(*S, 0); })
.Case<VPInstructionWithType>(
[&](const auto *S) { return VerifyEVLUse(*S, 0); })
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[&](const auto *S) { return VerifyEVLUse(*S, 0); })
[&](const VPRecipeBase *S) { return VerifyEVLUse(*S, 0); })

consistency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to VPInstructionWithType,. thanks

.Case<VPInstruction>([&](const VPInstruction *I) {
if (I->getOpcode() != Instruction::Add) {
errs() << "EVL is used as an operand in non-VPInstruction::Add\n";
Expand Down
Loading
Loading