Skip to content

[VPlan] Introduce VPInstructionWithType, use instead of VPScalarCast(NFC) #129706

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Apr 10, 2025
6 changes: 3 additions & 3 deletions llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
Original file line number Diff line number Diff line change
Expand Up @@ -249,10 +249,10 @@ class VPBuilder {
new VPDerivedIVRecipe(Kind, FPBinOp, Start, Current, Step, Name));
}

VPScalarCastRecipe *createScalarCast(Instruction::CastOps Opcode, VPValue *Op,
Type *ResultTy, DebugLoc DL) {
VPInstruction *createScalarCast(Instruction::CastOps Opcode, VPValue *Op,
Type *ResultTy, DebugLoc DL) {
return tryInsertInstruction(
new VPScalarCastRecipe(Opcode, Op, ResultTy, DL));
new VPInstructionWithType(Opcode, Op, ResultTy, DL));
}

VPWidenCastRecipe *createWidenCast(Instruction::CastOps Opcode, VPValue *Op,
Expand Down
3 changes: 1 addition & 2 deletions llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4459,7 +4459,6 @@ static bool willGenerateVectors(VPlan &Plan, ElementCount VF,
switch (R.getVPDefID()) {
case VPDef::VPDerivedIVSC:
case VPDef::VPScalarIVStepsSC:
case VPDef::VPScalarCastSC:
case VPDef::VPReplicateSC:
case VPDef::VPInstructionSC:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VPInstructionSC now also holds for VPInstructionWithType recipes, instead of VPScalarCastSC.

case VPDef::VPCanonicalIVPHISC:
Expand Down Expand Up @@ -10457,8 +10456,8 @@ preparePlanForEpilogueVectorLoop(VPlan &Plan, Loop *L,
assert(all_of(IV->users(),
[](const VPUser *U) {
return isa<VPScalarIVStepsRecipe>(U) ||
isa<VPScalarCastRecipe>(U) ||
isa<VPDerivedIVRecipe>(U) ||
VPInstruction::isCast(U) ||
cast<VPInstruction>(U)->getOpcode() ==
Instruction::Add;
}) &&
Expand Down
99 changes: 50 additions & 49 deletions llvm/lib/Transforms/Vectorize/VPlan.h
Original file line number Diff line number Diff line change
Expand Up @@ -531,7 +531,6 @@ class VPSingleDefRecipe : public VPRecipeBase, public VPValue {
case VPRecipeBase::VPWidenIntOrFpInductionSC:
case VPRecipeBase::VPWidenPointerInductionSC:
case VPRecipeBase::VPReductionPHISC:
case VPRecipeBase::VPScalarCastSC:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VPInstructionSC above now also holds for VPInstructionWithType recipes, instead of VPScalarCastSC.

case VPRecipeBase::VPPartialReductionSC:
return true;
case VPRecipeBase::VPBranchOnMaskSC:
Expand Down Expand Up @@ -1023,6 +1022,56 @@ class VPInstruction : public VPRecipeWithIRFlags,

/// Returns the symbolic name assigned to the VPInstruction.
StringRef getName() const { return Name; }

/// Return true if \p U is a cast.
static bool isCast(const VPUser *U) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems more natural to have VPUser::isCast() or VPRecipeBase::isCast(), analogous to Instruction::isCast()?
Rather than the static Instruction::isCast(opcode) which takes an opcode as parameter rather than a [VP]User.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added VPRecipeBase::isScalarCast, thanks

auto *VPI = dyn_cast<VPInstruction>(U);
return VPI && Instruction::isCast(VPI->getOpcode());
}
};

/// A specialization of VPInstruction augmenting it with a dedicated result
/// type, to be used when the opcode and operands of the VPInstruction don't
/// directly determine the result type.
class VPInstructionWithType : public VPInstruction {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A recipe can "have" a single def, and support the interface of VPValue, by inheriting from it.
A (single def) recipe can have IRflags, by being VPRecipeWithFlags.
A (single def with IRFlags) recipe can have an opcode, by being a VPInstruction.
A (single def with IRFlags with opcode) recipe can have a Type, by being a VPInstructionWithType.
Would using multiple inheritance be better to cope with desired combinations of independent "with's", where IRFlags, opcode, Type, are provided as base classes? In particular, allowing recipes to have Type w/o having IRFlags.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the Type base class store the Type or just provide an interface for e.g. getScalarType()? Is this something that should be explored in a separate PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should split off the flags from VPRecipeWithFlags and generally move in this direction.

For the type, this would just add a type field and a ::getScalarType? I'll check that, although for now, the only user would be VPInstructionWithType, so it might be simpler to do this as follow-up, once more users arise?

Copy link
Contributor

@lukel97 lukel97 Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think VPWidenIntrinsic also could use it, although it calls ::getScalarType ::getResultType. But agreed, I think it can be done as a follow up. It would be nice to get this PR in to unblock #129508 and #118638

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if we could do something like d6eb747, but unfortunately I've not yet been able to figure out how to make this not actually crash.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I managed to sort this out now using a specialization of CastInfo, which allows isa/dyn_cast<VPWithResultType> to work, effectively turning it into a mixin/trait.

A sketch is here 5c54367, might be worth doing separately>

/// Scalar result type produced by the recipe.
Type *ResultTy;

public:
VPInstructionWithType(unsigned Opcode, ArrayRef<VPValue *> Operands,
Type *ResultTy, DebugLoc DL, const Twine &Name = "")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth documenting somewhere that VPInstructionWithType does not have its own VPDef::VPInstructionWithTypeSC identifying it, as do all other recipes, but it uses VPInstructionSC instead. VPInstructions should be identified according to their opcode, whether they are actually of class VPInstruction itself or a subclass thereof.
I.e., there is no VP_CLASSOF_IMPL(VPDef::VPInstructionWithTypeSC) here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done thanks

: VPInstruction(Opcode, Operands, DL, Name), ResultTy(ResultTy) {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about FMF for, say, FPExt, or any other IRFlags supported by VPInstruction casts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, non of the cast take/use them, can be added as follow-up?


static inline bool classof(const VPRecipeBase *R) { return isCast(R); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for splitting out isCast. I needed to do something similar to make VPInstruction::StepVector a VPInstructionWithType in #129508

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently VPInstructionWithType is used for cast recipes only, but if/when that changes the return isCast(R) here should be updated. Some comment should be added to explain that all VPInstructionWithType recipes should be identified here based on their opcode, rather than by checking VPDefID, because the latter is VPInstructionSC rather than VPInstructionWithTypeSC.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done thanks


static inline bool classof(const VPUser *R) {
return isa<VPInstructionWithType>(cast<VPRecipeBase>(R));
}

VPInstruction *clone() override {
SmallVector<VPValue *, 2> Operands(operands());
auto *New = new VPInstructionWithType(
getOpcode(), Operands, getResultType(), getDebugLoc(), getName());
New->setUnderlyingValue(getUnderlyingValue());
return New;
}

void execute(VPTransformState &State) override;

/// Return the cost of this VPInstruction.
InstructionCost computeCost(ElementCount VF,
VPCostContext &Ctx) const override {
// TODO: Compute accurate cost after retiring the legacy cost model.
return 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return actual cost here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now this only represents scalar casts, which are not corresponding to any input IR, hence are not yet costed (Added a TODO). It will be fleshed out in #129712.

}

Type *getResultType() const { return ResultTy; }

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
/// Print the recipe.
void print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const override;
#endif
};

/// A recipe to wrap on original IR instruction not to be modified during
Expand Down Expand Up @@ -1211,54 +1260,6 @@ class VPWidenCastRecipe : public VPRecipeWithIRFlags {
Type *getResultType() const { return ResultTy; }
};

/// VPScalarCastRecipe is a recipe to create scalar cast instructions.
class VPScalarCastRecipe : public VPSingleDefRecipe {
Instruction::CastOps Opcode;

Type *ResultTy;

Value *generate(VPTransformState &State);

public:
VPScalarCastRecipe(Instruction::CastOps Opcode, VPValue *Op, Type *ResultTy,
DebugLoc DL)
: VPSingleDefRecipe(VPDef::VPScalarCastSC, {Op}, DL), Opcode(Opcode),
ResultTy(ResultTy) {}

~VPScalarCastRecipe() override = default;

VPScalarCastRecipe *clone() override {
return new VPScalarCastRecipe(Opcode, getOperand(0), ResultTy,
getDebugLoc());
}

VP_CLASSOF_IMPL(VPDef::VPScalarCastSC)

void execute(VPTransformState &State) override;

/// Return the cost of this VPScalarCastRecipe.
InstructionCost computeCost(ElementCount VF,
VPCostContext &Ctx) const override {
// TODO: Compute accurate cost after retiring the legacy cost model.
return 0;
}

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
void print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const override;
#endif

/// Returns the result type of the cast.
Type *getResultType() const { return ResultTy; }

bool onlyFirstLaneUsed(const VPValue *Op) const override {
// At the moment, only uniform codegen is implemented.
assert(is_contained(operands(), Op) &&
"Op must be an operand of the recipe");
return true;
}
};

/// A recipe for widening vector intrinsics.
class VPWidenIntrinsicRecipe : public VPRecipeWithIRFlags {
/// ID of the vector intrinsic to widen.
Expand Down
8 changes: 3 additions & 5 deletions llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -261,20 +261,18 @@ Type *VPTypeAnalysis::inferScalarType(const VPValue *V) {
VPPartialReductionRecipe>([this](const VPRecipeBase *R) {
return inferScalarType(R->getOperand(0));
})
// VPInstructionWithType must be handled before VPInstruction.
.Case<VPInstructionWithType, VPWidenIntrinsicRecipe>(
[](const auto *R) { return R->getResultType(); })
Comment on lines +265 to +266
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth noting that this case should appear before the next to catch VPInstructionWithType before its parent VPInstruction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

.Case<VPBlendRecipe, VPInstruction, VPWidenRecipe, VPReplicateRecipe,
VPWidenCallRecipe, VPWidenMemoryRecipe, VPWidenSelectRecipe>(
[this](const auto *R) { return inferScalarTypeForRecipe(R); })
.Case<VPWidenIntrinsicRecipe>([](const VPWidenIntrinsicRecipe *R) {
return R->getResultType();
})
.Case<VPInterleaveRecipe>([V](const VPInterleaveRecipe *R) {
// TODO: Use info from interleave group.
return V->getUnderlyingValue()->getType();
})
.Case<VPWidenCastRecipe>(
[](const VPWidenCastRecipe *R) { return R->getResultType(); })
.Case<VPScalarCastRecipe>(
[](const VPScalarCastRecipe *R) { return R->getResultType(); })
.Case<VPExpandSCEVRecipe>([](const VPExpandSCEVRecipe *R) {
return R->getSCEV()->getType();
})
Expand Down
67 changes: 32 additions & 35 deletions llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -142,7 +142,6 @@ bool VPRecipeBase::mayHaveSideEffects() const {
switch (getVPDefID()) {
case VPDerivedIVSC:
case VPPredInstPHISC:
case VPScalarCastSC:
case VPVectorEndPointerSC:
return false;
case VPInstructionSC:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which now takes care of VPInstructionWithType recipes as well, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep

Expand Down Expand Up @@ -417,7 +416,7 @@ bool VPInstruction::doesGeneratePerAllLanes() const {
}

bool VPInstruction::canGenerateScalarForFirstLane() const {
if (Instruction::isBinaryOp(getOpcode()))
if (Instruction::isBinaryOp(getOpcode()) || Instruction::isCast(getOpcode()))
return true;
if (isSingleScalar() || isVectorToScalar())
return true;
Expand Down Expand Up @@ -908,7 +907,7 @@ void VPInstruction::execute(VPTransformState &State) {
}

bool VPInstruction::opcodeMayReadOrWriteFromMemory() const {
if (Instruction::isBinaryOp(getOpcode()))
if (Instruction::isBinaryOp(getOpcode()) || Instruction::isCast(getOpcode()))
return false;
switch (getOpcode()) {
case Instruction::ExtractElement:
Expand All @@ -932,7 +931,7 @@ bool VPInstruction::opcodeMayReadOrWriteFromMemory() const {

bool VPInstruction::onlyFirstLaneUsed(const VPValue *Op) const {
assert(is_contained(operands(), Op) && "Op must be an operand of the recipe");
if (Instruction::isBinaryOp(getOpcode()))
if (Instruction::isBinaryOp(getOpcode()) || Instruction::isCast(getOpcode()))
return vputils::onlyFirstLaneUsed(this);

switch (getOpcode()) {
Expand Down Expand Up @@ -1070,6 +1069,35 @@ void VPInstruction::print(raw_ostream &O, const Twine &Indent,
}
#endif

void VPInstructionWithType::execute(VPTransformState &State) {
State.setDebugLocFrom(getDebugLoc());
assert(vputils::onlyFirstLaneUsed(this) &&
"Codegen only implemented for first lane.");
switch (getOpcode()) {
case Instruction::ZExt:
case Instruction::Trunc: {
Value *Op = State.get(getOperand(0), VPLane(0));
Value *Cast = State.Builder.CreateCast(Instruction::CastOps(getOpcode()),
Op, ResultTy);
State.set(this, Cast, VPLane(0));
break;
}
default:
llvm_unreachable("opcode not implemented yet");
}
}

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
void VPInstructionWithType::print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const {
O << Indent << "EMIT ";
printAsOperand(O, SlotTracker);
O << " = " << Instruction::getOpcodeName(getOpcode()) << " ";
printOperands(O, SlotTracker);
O << " to " << *ResultTy;
}
#endif

VPIRInstruction *VPIRInstruction ::create(Instruction &I) {
if (auto *Phi = dyn_cast<PHINode>(&I))
return new VPIRPhi(*Phi);
Expand Down Expand Up @@ -2555,37 +2583,6 @@ void VPReplicateRecipe::print(raw_ostream &O, const Twine &Indent,
}
#endif

Value *VPScalarCastRecipe ::generate(VPTransformState &State) {
assert(vputils::onlyFirstLaneUsed(this) &&
"Codegen only implemented for first lane.");
switch (Opcode) {
case Instruction::SExt:
case Instruction::ZExt:
case Instruction::Trunc: {
// Note: SExt/ZExt not used yet.
Value *Op = State.get(getOperand(0), VPLane(0));
return State.Builder.CreateCast(Instruction::CastOps(Opcode), Op, ResultTy);
}
default:
llvm_unreachable("opcode not implemented yet");
}
}

void VPScalarCastRecipe ::execute(VPTransformState &State) {
State.set(this, generate(State), VPLane(0));
}

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
void VPScalarCastRecipe ::print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const {
O << Indent << "SCALAR-CAST ";
printAsOperand(O, SlotTracker);
O << " = " << Instruction::getOpcodeName(Opcode) << " ";
printOperands(O, SlotTracker);
O << " to " << *ResultTy;
}
#endif

void VPBranchOnMaskRecipe::execute(VPTransformState &State) {
assert(State.Lane && "Branch on Mask works only on single instance.");

Expand Down
6 changes: 5 additions & 1 deletion llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -111,7 +111,11 @@ bool vputils::isUniformAcrossVFsAndUFs(VPValue *V) {
(isa<LoadInst, StoreInst>(R->getUnderlyingValue())) &&
all_of(R->operands(), isUniformAcrossVFsAndUFs);
})
.Case<VPScalarCastRecipe, VPWidenCastRecipe>([](const auto *R) {
.Case<VPInstruction>([](const auto *VPI) {
return Instruction::isCast(VPI->getOpcode()) &&
all_of(VPI->operands(), isUniformAcrossVFsAndUFs);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is Cast, suffice to check first operand only, as before (and assert it has a single operand).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done , thanks

})
.Case<VPWidenCastRecipe>([](const auto *R) {
// A cast is uniform according to its operand.
return isUniformAcrossVFsAndUFs(R->getOperand(0));
})
Expand Down
3 changes: 1 addition & 2 deletions llvm/lib/Transforms/Vectorize/VPlanUtils.h
Original file line number Diff line number Diff line change
Expand Up @@ -45,8 +45,7 @@ inline bool isUniformAfterVectorization(const VPValue *VPV) {
return true;
if (auto *Rep = dyn_cast<VPReplicateRecipe>(VPV))
return Rep->isUniform();
if (isa<VPWidenGEPRecipe, VPDerivedIVRecipe, VPScalarCastRecipe,
VPBlendRecipe>(VPV))
if (isa<VPWidenGEPRecipe, VPDerivedIVRecipe, VPBlendRecipe>(VPV))
return all_of(VPV->getDefiningRecipe()->operands(),
isUniformAfterVectorization);
if (auto *VPI = dyn_cast<VPInstruction>(VPV))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now also covers the VPInstructionWithType case, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that's one of the main advantages of sharing the VPInstruction base class

Expand Down
1 change: 0 additions & 1 deletion llvm/lib/Transforms/Vectorize/VPlanValue.h
Original file line number Diff line number Diff line change
Expand Up @@ -332,7 +332,6 @@ class VPDef {
VPReductionSC,
VPPartialReductionSC,
VPReplicateSC,
VPScalarCastSC,
VPScalarIVStepsSC,
VPVectorPointerSC,
VPVectorEndPointerSC,
Expand Down
4 changes: 2 additions & 2 deletions llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -147,8 +147,8 @@ bool VPlanVerifier::verifyEVLRecipe(const VPInstruction &EVL) const {
[&](const VPRecipeBase *S) { return VerifyEVLUse(*S, 2); })
.Case<VPWidenLoadEVLRecipe, VPVectorEndPointerRecipe>(
[&](const VPRecipeBase *R) { return VerifyEVLUse(*R, 1); })
.Case<VPScalarCastRecipe>(
[&](const VPScalarCastRecipe *S) { return VerifyEVLUse(*S, 0); })
.Case<VPInstructionWithType>(
[&](const auto *S) { return VerifyEVLUse(*S, 0); })
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[&](const auto *S) { return VerifyEVLUse(*S, 0); })
[&](const VPRecipeBase *S) { return VerifyEVLUse(*S, 0); })

consistency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to VPInstructionWithType,. thanks

.Case<VPInstruction>([&](const VPInstruction *I) {
if (I->getOpcode() == Instruction::PHI)
return VerifyEVLUse(*I, 1);
Expand Down
Loading