Skip to content

Commit

Permalink
[RISC-V][LoongArch64] JIT: pass structs according to floating-point c…
Browse files Browse the repository at this point in the history
…alling convention properly (#104237)

* Replace StructFloatFieldInfoFlags with FpStructInRegistersInfo which carries also exact field sizes and offsets

* Replace StructFloatFieldInfoFlags with FpStruct::Flags in profiler

* Remove FpStructInRegistersInfo::FromOldFlags()

* Fix duplicating types in HandleInlineArray

* Remove signedness from FpStruct::IntKind because most probably we won't need it

* Remove old StructFloatFieldInfoFlags calculating routine

* Typo in TARGET_LOONGARCH64

* Remove m_returnedFpFieldOffsets from ArgIterator

* Add missing ENREGISTERED_PARAMTYPE_MAXSIZE condition to C# version of FpStruct info calculation

* Rename RISCV64PassStructInRegister to match settled casing for RiscV in class names

* Update hardcoded flags for float and double in ArgIteratorTemplate::ComputeReturnFlags()

This fixes JIT/HardwareIntrinsics/General/Vector* tests.

* Fix build on other platforms

* Update LoongArch to use FpStructInRegistersInfo

* Remove unused old flag masks

* LoongArch64 typo

Co-authored-by: Qiao Pengcheng <qiaopengcheng@loongson.cn>

* Missing FpStruct namespace

Co-authored-by: Qiao Pengcheng <qiaopengcheng@loongson.cn>

* Missing FpStruct namespace

Co-authored-by: Qiao Pengcheng <qiaopengcheng@loongson.cn>

* Missing FpStruct namespace

Co-authored-by: Qiao Pengcheng <qiaopengcheng@loongson.cn>

* Use FpStruct namespace everywhere in JIT

* Add tests

* Fix passing FP structs in JIT

* Remove CallArgABIInformation.StructFloatFieldType and Offset because they were write-only. Replace with offset from new ABI info

* Remove CallArgABIInformation.StructDesc on System V because it was write-only

* Remove unused local vars from ABIPassingInformation::Classify

* Pass variables after the tested struct to detect potential register/stack slot over-allocation

* Add test for two-field but one-slot struct. Fix assertion bug that it uncovered

* Add sanity check for empty struct passing, disable on RISC-V and System V

* Format

* JIT review

* Update StructFloatFieldInfoFlags description

* Revert to hitherto instruction set order as it's not the point of this PR

* Disable Test_Empty_Sanity on arm32 due to bug in Clang

* Add ActiveIssue on arm and arm64

* Exclude arm64 from Test_PackedEmptyFloatLong*_RiscV

* Unify get{LoongArch,RiscV}64PassFpStructInRegistersInfo JIT interfaces

* Use JIT_TO_EE_TRANSITION instead of _LEAF because MethodTable::GetFpStructInRegistersInfo may throw

* Remove FpStruct::IntKind, we should have similar info in ClassLayout in JIT

* Change JIT interface to return a struct similar to CORINFO_SWIFT_LOWERING to facilitate code unification in the future

* Change JIT to use new Swift-like getFpStructLowering

* Cache CORINFO_FPSTRUCT_LOWERING

* Update LoongArch classifier to use CORINFO_FPSTRUCT_LOWERING

* Update StructFloatInfoFlags doc comment on C#

* Add arm32 clang issue

* Move StructFloatFieldInfoFlags and FpStructInRegistersInfo out of the JIT interface

* Merge LoongArch and RISC-V AOT calculation of FpStructInRegistersInfo because they were identical. Move it to Common\Internal/Runtime because it's no longer exposed in JIT interface.

* Don't zero-initialize CORINFO_FPSTRUCT_LOWERING

* Update LoongArch classifier

* Fix assert for two-field structs smaller than 8 bytes, add test
  • Loading branch information
tomeksowi authored Aug 1, 2024
1 parent 03b4c79 commit bc9cd27
Show file tree
Hide file tree
Showing 11 changed files with 1,468 additions and 154 deletions.
35 changes: 24 additions & 11 deletions src/coreclr/jit/codegencommon.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3201,6 +3201,13 @@ var_types CodeGen::genParamStackType(LclVarDsc* dsc, const ABIPassingSegment& se
// can always use the full register size here. This allows us to
// use stp more often.
return TYP_I_IMPL;
#elif defined(TARGET_RISCV64) || defined(TARGET_LOONGARCH64)
// On RISC-V/LoongArch structs passed according to floating-point calling convention are enregistered one
// field per register regardless of the field layout in memory, so the small int load/store instructions
// must not be upsized to 4 bytes, otherwise for example:
// * struct { struct{} e1,e2,e3; byte b; float f; } -- 4-byte store for 'b' would trash 'f'
// * struct { float f; struct{} e1,e2,e3; byte b; } -- 4-byte store for 'b' would trash adjacent stack slot
return seg.GetRegisterType();
#else
return genActualType(seg.GetRegisterType());
#endif
Expand Down Expand Up @@ -7390,18 +7397,19 @@ void CodeGen::genStructReturn(GenTree* treeNode)
assert(varDsc->lvIsMultiRegRet);

#if defined(TARGET_LOONGARCH64) || defined(TARGET_RISCV64)
// On LoongArch64, for a struct like "{ int, double }", "retTypeDesc" will be "{ TYP_INT, TYP_DOUBLE }",
// i. e. not include the padding for the first field, and so the general loop below won't work.
var_types type = retTypeDesc.GetReturnRegType(0);
regNumber toReg = retTypeDesc.GetABIReturnReg(0, compiler->info.compCallConv);
GetEmitter()->emitIns_R_S(ins_Load(type), emitTypeSize(type), toReg, lclNode->GetLclNum(), 0);
var_types type = retTypeDesc.GetReturnRegType(0);
unsigned offset = retTypeDesc.GetReturnFieldOffset(0);
regNumber toReg = retTypeDesc.GetABIReturnReg(0, compiler->info.compCallConv);

GetEmitter()->emitIns_R_S(ins_Load(type), emitTypeSize(type), toReg, lclNode->GetLclNum(), offset);
if (regCount > 1)
{
assert(regCount == 2);
int offset = genTypeSize(type);
type = retTypeDesc.GetReturnRegType(1);
offset = (int)((unsigned int)offset < genTypeSize(type) ? genTypeSize(type) : offset);
toReg = retTypeDesc.GetABIReturnReg(1, compiler->info.compCallConv);
assert(offset + genTypeSize(type) <= retTypeDesc.GetReturnFieldOffset(1));
type = retTypeDesc.GetReturnRegType(1);
offset = retTypeDesc.GetReturnFieldOffset(1);
toReg = retTypeDesc.GetABIReturnReg(1, compiler->info.compCallConv);

GetEmitter()->emitIns_R_S(ins_Load(type), emitTypeSize(type), toReg, lclNode->GetLclNum(), offset);
}
#else // !TARGET_LOONGARCH64 && !TARGET_RISCV64
Expand Down Expand Up @@ -7779,6 +7787,11 @@ void CodeGen::genMultiRegStoreToLocal(GenTreeLclVar* lclNode)
assert(regCount == varDsc->lvFieldCnt);
}

#if defined(TARGET_RISCV64) || defined(TARGET_LOONGARCH64)
// genMultiRegStoreToLocal is only used for calls on RISC-V and LoongArch
const ReturnTypeDesc* returnTypeDesc = actualOp1->AsCall()->GetReturnTypeDesc();
#endif

#ifdef SWIFT_SUPPORT
const uint32_t* offsets = nullptr;
if (actualOp1->IsCall() && (actualOp1->AsCall()->GetUnmanagedCallConv() == CorInfoCallConvExtension::Swift))
Expand Down Expand Up @@ -7829,8 +7842,8 @@ void CodeGen::genMultiRegStoreToLocal(GenTreeLclVar* lclNode)
else
{
#if defined(TARGET_LOONGARCH64) || defined(TARGET_RISCV64)
// should consider the padding field within a struct.
offset = (offset % genTypeSize(srcType)) ? AlignUp(offset, genTypeSize(srcType)) : offset;
// Should consider the padding, empty struct fields, etc within a struct.
offset = returnTypeDesc->GetReturnFieldOffset(i);
#endif
#ifdef SWIFT_SUPPORT
if (offsets != nullptr)
Expand Down
2 changes: 1 addition & 1 deletion src/coreclr/jit/compiler.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -743,7 +743,7 @@ var_types Compiler::getArgTypeForStruct(CORINFO_CLASS_HANDLE clsHnd,

// Otherwise we pass this struct by value on the stack
// setup wbPassType and useType indicate that this is passed by value according to the X86/ARM32 ABI
// On LOONGARCH64 struct that is 1-16 bytes is passed by value in one/two register(s)
// On LOONGARCH64 and RISCV64 struct that is 1-16 bytes is passed by value in one/two register(s)
howToPassStruct = SPK_ByValue;
useType = TYP_STRUCT;

Expand Down
32 changes: 20 additions & 12 deletions src/coreclr/jit/gentree.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -29489,6 +29489,15 @@ void ReturnTypeDesc::InitializeStructReturnType(Compiler* comp,
assert(returnType != TYP_UNKNOWN);
assert(returnType != TYP_STRUCT);
m_regType[0] = returnType;

#if defined(TARGET_RISCV64) || defined(TARGET_LOONGARCH64)
const CORINFO_FPSTRUCT_LOWERING* lowering = comp->GetFpStructLowering(retClsHnd);
if (!lowering->byIntegerCallConv)
{
assert(lowering->numLoweredElements == 1);
m_fieldOffset[0] = lowering->offsets[0];
}
#endif // defined(TARGET_RISCV64) || defined(TARGET_LOONGARCH64)
break;
}

Expand Down Expand Up @@ -29554,37 +29563,36 @@ void ReturnTypeDesc::InitializeStructReturnType(Compiler* comp,
}

#elif defined(TARGET_LOONGARCH64) || defined(TARGET_RISCV64)
assert((structSize >= TARGET_POINTER_SIZE) && (structSize <= (2 * TARGET_POINTER_SIZE)));
assert(structSize > sizeof(float));
assert(structSize <= (2 * TARGET_POINTER_SIZE));
BYTE gcPtrs[2] = {TYPE_GC_NONE, TYPE_GC_NONE};
comp->info.compCompHnd->getClassGClayout(retClsHnd, &gcPtrs[0]);
const CORINFO_FPSTRUCT_LOWERING* lowering = comp->GetFpStructLowering(retClsHnd);
if (!lowering->byIntegerCallConv)
{
comp->compFloatingPointUsed = true;
assert(lowering->numLoweredElements == MAX_RET_REG_COUNT);
var_types types[MAX_RET_REG_COUNT] = {JITtype2varType(lowering->loweredElements[0]),
JITtype2varType(lowering->loweredElements[1])};
assert(varTypeIsFloating(types[0]) || varTypeIsFloating(types[1]));
assert((structSize > 8) == ((genTypeSize(types[0]) == 8) || (genTypeSize(types[1]) == 8)));
static_assert(MAX_RET_REG_COUNT == MAX_FPSTRUCT_LOWERED_ELEMENTS, "");
for (unsigned i = 0; i < MAX_RET_REG_COUNT; ++i)
{
if (varTypeIsFloating(types[i]))
{
m_regType[i] = types[i];
}
else
m_regType[i] = JITtype2varType(lowering->loweredElements[i]);
m_fieldOffset[i] = lowering->offsets[i];
if (m_regType[i] == TYP_LONG && ((m_fieldOffset[i] % TARGET_POINTER_SIZE) == 0))
{
assert(varTypeIsIntegralOrI(types[i]));
m_regType[i] = (genTypeSize(types[i]) == 8) ? comp->getJitGCType(gcPtrs[i]) : TYP_INT;
unsigned slot = m_fieldOffset[i] / TARGET_POINTER_SIZE;
m_regType[i] = comp->getJitGCType(gcPtrs[slot]);
}
}
assert(varTypeIsFloating(m_regType[0]) || varTypeIsFloating(m_regType[1]));
}
else
{
for (unsigned i = 0; i < 2; ++i)
{
m_regType[i] = comp->getJitGCType(gcPtrs[i]);
}
m_fieldOffset[0] = 0;
m_fieldOffset[1] = TARGET_POINTER_SIZE;
}

#elif defined(TARGET_X86)
Expand Down
49 changes: 33 additions & 16 deletions src/coreclr/jit/gentree.h
Original file line number Diff line number Diff line change
Expand Up @@ -4275,6 +4275,13 @@ struct ReturnTypeDesc
private:
var_types m_regType[MAX_RET_REG_COUNT];

#if defined(TARGET_RISCV64) || defined(TARGET_LOONGARCH64)
// Structs according to hardware floating-point calling convention are passed as two logical fields, each in
// separate register, disregarding struct layout such as packing, custom alignment, padding with empty structs, etc.
// We need size (can be derived from m_regType) & offset of each field for memory load/stores
unsigned m_fieldOffset[MAX_RET_REG_COUNT];
#endif

#ifdef DEBUG
bool m_inited;
#endif
Expand Down Expand Up @@ -4306,6 +4313,9 @@ struct ReturnTypeDesc
for (unsigned i = 0; i < MAX_RET_REG_COUNT; ++i)
{
m_regType[i] = TYP_UNKNOWN;
#if defined(TARGET_RISCV64) || defined(TARGET_LOONGARCH64)
m_fieldOffset[i] = 0;
#endif
}
#ifdef DEBUG
m_inited = false;
Expand Down Expand Up @@ -4351,8 +4361,11 @@ struct ReturnTypeDesc
for (unsigned i = regCount + 1; i < MAX_RET_REG_COUNT; ++i)
{
assert(m_regType[i] == TYP_UNKNOWN);
}
#if defined(TARGET_RISCV64) || defined(TARGET_LOONGARCH64)
assert(m_fieldOffset[i] == 0);
#endif
}
#endif // DEBUG

return regCount;
}
Expand Down Expand Up @@ -4401,6 +4414,25 @@ struct ReturnTypeDesc
return result;
}

unsigned GetSingleReturnFieldOffset() const
{
assert(!IsMultiRegRetType());
assert(m_regType[0] != TYP_UNKNOWN);
#if defined(TARGET_RISCV64) || defined(TARGET_LOONGARCH64)
return m_fieldOffset[0];
#else
return 0;
#endif
}

#if defined(TARGET_RISCV64) || defined(TARGET_LOONGARCH64)
unsigned GetReturnFieldOffset(unsigned index) const
{
assert(m_regType[index] != TYP_UNKNOWN);
return m_fieldOffset[index];
}
#endif

// Get i'th ABI return register
regNumber GetABIReturnReg(unsigned idx, CorInfoCallConvExtension callConv) const;

Expand Down Expand Up @@ -4508,9 +4540,6 @@ struct CallArgABIInformation
: NumRegs(0)
, ByteOffset(0)
, ByteSize(0)
#if defined(TARGET_LOONGARCH64) || defined(TARGET_RISCV64)
, StructFloatFieldType()
#endif
, ArgType(TYP_UNDEF)
, PassedByRef(false)
#if FEATURE_ARG_SPLIT
Expand All @@ -4537,18 +4566,6 @@ struct CallArgABIInformation
unsigned NumRegs;
unsigned ByteOffset;
unsigned ByteSize;
#if defined(UNIX_AMD64_ABI)
// Unix amd64 will split floating point types and integer types in structs
// between floating point and general purpose registers. Keep track of that
// information so we do not need to recompute it later.
SYSTEMV_AMD64_CORINFO_STRUCT_REG_PASSING_DESCRIPTOR StructDesc;
#endif // UNIX_AMD64_ABI
#if defined(TARGET_LOONGARCH64) || defined(TARGET_RISCV64)
// For LoongArch64's and RISC-V 64's ABI, the struct which has float field(s) and no more than two fields
// may be passed by float register(s).
// e.g `struct {int a; float b;}` passed by an integer register and a float register.
var_types StructFloatFieldType[2];
#endif
// The type used to pass this argument. This is generally the original
// argument type, but when a struct is passed as a scalar type, this is
// that type. Note that if a struct is passed by reference, this will still
Expand Down
41 changes: 19 additions & 22 deletions src/coreclr/jit/lclvars.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -914,40 +914,37 @@ void Compiler::lvaInitUserArgs(InitVarDscInfo* varDscInfo, unsigned skipArgs, un
if ((lowering != nullptr) && !lowering->byIntegerCallConv)
{
assert(varTypeIsStruct(argType));
int floatNum = 0;
assert((lowering->numLoweredElements == 1) || (lowering->numLoweredElements == 2));
if (lowering->numLoweredElements == 1)
{
assert(argSize <= 8);
assert(varDsc->lvExactSize() <= argSize);

floatNum = 1;
canPassArgInRegisters = varDscInfo->canEnreg(TYP_DOUBLE, 1);
cSlotsToEnregister = lowering->numLoweredElements;
argRegTypeInStruct1 = JITtype2varType(lowering->loweredElements[0]);
if (lowering->numLoweredElements == 2)
argRegTypeInStruct2 = JITtype2varType(lowering->loweredElements[1]);

argRegTypeInStruct1 = JITtype2varType(lowering->loweredElements[0]);
assert(varTypeIsFloating(argRegTypeInStruct1));
}
else
int floatNum = (int)varTypeIsFloating(argRegTypeInStruct1) + (int)varTypeIsFloating(argRegTypeInStruct2);
assert(floatNum > 0);

canPassArgInRegisters = varDscInfo->canEnreg(TYP_DOUBLE, floatNum);
if (canPassArgInRegisters && (floatNum < lowering->numLoweredElements))
{
assert(floatNum == 1);
assert(lowering->numLoweredElements == 2);
argRegTypeInStruct1 = genActualType(JITtype2varType(lowering->loweredElements[0]));
argRegTypeInStruct2 = genActualType(JITtype2varType(lowering->loweredElements[1]));
floatNum = (int)varTypeIsFloating(argRegTypeInStruct1) + (int)varTypeIsFloating(argRegTypeInStruct2);
canPassArgInRegisters = varDscInfo->canEnreg(TYP_DOUBLE, floatNum);
if (floatNum == 1)
canPassArgInRegisters = canPassArgInRegisters && varDscInfo->canEnreg(TYP_I_IMPL, 1);
assert(varTypeIsIntegralOrI(argRegTypeInStruct1) || varTypeIsIntegralOrI(argRegTypeInStruct2));
canPassArgInRegisters = varDscInfo->canEnreg(TYP_I_IMPL, 1);
}

assert((floatNum == 1) || (floatNum == 2));

if (!canPassArgInRegisters)
{
// On LoongArch64 and RISCV64, if there aren't any remaining floating-point registers to pass the
// argument, integer registers (if any) are used instead.
canPassArgInRegisters = varDscInfo->canEnreg(argType, cSlotsToEnregister);

// If a struct eligible for passing according to floating-point calling convention cannot be fully
// enregistered, it is passed according to integer calling convention -- in up to two integer registers
// and/or stack slots, as a lump of bits laid out like in memory.
cSlotsToEnregister = cSlots;
argRegTypeInStruct1 = TYP_UNKNOWN;
argRegTypeInStruct2 = TYP_UNKNOWN;

canPassArgInRegisters = varDscInfo->canEnreg(argType, cSlotsToEnregister);
if (cSlotsToEnregister == 2)
{
if (!canPassArgInRegisters && varDscInfo->canEnreg(TYP_I_IMPL, 1))
Expand Down Expand Up @@ -1082,7 +1079,7 @@ void Compiler::lvaInitUserArgs(InitVarDscInfo* varDscInfo, unsigned skipArgs, un
varDsc->SetOtherArgReg(
genMapRegArgNumToRegNum(secondAllocatedRegArgNum, argRegTypeInStruct2, info.compCallConv));
}
else if (cSlots > 1)
else if (cSlotsToEnregister > 1)
{
// Here a struct-arg which needs two registers but only one integer register available,
// it has to be split. But we reserved extra 8-bytes for the whole struct.
Expand Down
8 changes: 6 additions & 2 deletions src/coreclr/jit/lower.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4840,12 +4840,14 @@ GenTree* Lowering::LowerStoreLocCommon(GenTreeLclVarCommon* lclStore)
// x86 uses it only for long return type, not for structs.
assert(slotCount == 1);
assert(lclRegType != TYP_UNDEF);
#else // !TARGET_XARCH || UNIX_AMD64_ABI
#else // !TARGET_XARCH || UNIX_AMD64_ABI
if (!comp->IsHfa(layout->GetClassHandle()))
{
if (slotCount > 1)
{
#if !defined(TARGET_RISCV64) && !defined(TARGET_LOONGARCH64)
assert(call->HasMultiRegRetVal());
#endif
}
else
{
Expand Down Expand Up @@ -5151,6 +5153,7 @@ void Lowering::LowerRetSingleRegStructLclVar(GenTreeUnOp* ret)
// Otherwise we don't mind that we leave the upper bits undefined.
lclVar->ChangeType(ret->TypeGet());
}
lclVar->AsLclFld()->SetLclOffs(comp->compRetTypeDesc.GetSingleReturnFieldOffset());
}
else
{
Expand Down Expand Up @@ -5339,7 +5342,8 @@ GenTreeLclVar* Lowering::SpillStructCallResult(GenTreeCall* call) const
comp->lvaSetVarDoNotEnregister(spillNum DEBUGARG(DoNotEnregisterReason::LocalField));
CORINFO_CLASS_HANDLE retClsHnd = call->gtRetClsHnd;
comp->lvaSetStruct(spillNum, retClsHnd, false);
GenTreeLclFld* spill = comp->gtNewStoreLclFldNode(spillNum, call->TypeGet(), 0, call);
unsigned offset = call->GetReturnTypeDesc()->GetSingleReturnFieldOffset();
GenTreeLclFld* spill = comp->gtNewStoreLclFldNode(spillNum, call->TypeGet(), offset, call);

BlockRange().InsertAfter(call, spill);
ContainCheckStoreLoc(spill);
Expand Down
Loading

0 comments on commit bc9cd27

Please sign in to comment.