-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Conversation
@sivarv @CarolEidt @pgavlin PTAL |
c419a84
to
2845988
Compare
2845988
to
d783a6c
Compare
src/jit/lsra.cpp
Outdated
@@ -8616,7 +8616,15 @@ void LinearScan::resolveRegisters() | |||
printf("-----------------------\n"); | |||
|
|||
printf("Resolution Candidates: "); | |||
dumpConvertedVarSet(compiler, resolutionCandidateVars); | |||
if (compiler->lvaCount == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still needed after #8840?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume not. I'll have to pull and rebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't be needed.
c15918e
to
6216fc4
Compare
e9a7e57
to
817ec7f
Compare
This is ready to be reviewed again. I've run 44 stress flavors of the SIMD tests locally and those are clean. @sivarv @CarolEidt PTAL |
src/jit/gentree.cpp
Outdated
else if ((op1->OperGet() == GT_OBJ) && (op1->gtOp.gtOp1->OperGet() == GT_ADDR) && | ||
op1->gtOp.gtOp1->gtOp.gtOp1->OperIsLocal()) | ||
{ | ||
setLclRelatedToSIMDIntrinsic(op1->gtOp.gtOp1->gtOp.gtOp1); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this logic is repeated 3 times. How about abstracting it in a routine SetLclRelatedToSMDIntrinsic()?
The caller would have to call
SetLclRelatedToSMDIntrinsic(op1);
if (op2 != nullptr)
{
SetLclRelatedToSMDIntrinsic(op2);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/jit/gentree.cpp
Outdated
@@ -16885,21 +16884,28 @@ GenTreeSIMD* Compiler::gtNewSIMDNode( | |||
GenTreeSIMD* Compiler::gtNewSIMDNode( | |||
var_types type, GenTreePtr op1, GenTreePtr op2, SIMDIntrinsicID simdIntrinsicID, var_types baseType, unsigned size) | |||
{ | |||
// TODO-CQ: An operand may be a GT_OBJ(GT_ADDR(GT_LCL_VAR))), in which case it should be | |||
// marked lvUsedInSIMDIntrinsic. | |||
assert(op1 != nullptr); | |||
if (op1->OperIsLocal()) | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On line 16871 we are checking whether op1->OperGet() == GT_LCL_VAR whereas here using OperIsLocal(). I think on line 16871 too we should use op1->OperIsLocal()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't really want to touch that if there was a reason, but with the refactoring you suggested above, I did it.
src/jit/gschecks.cpp
Outdated
lvaTable[shadowVar].lvRegStruct = varDsc->lvRegStruct; | ||
lvaTable[shadowVar].lvExactSize = varDsc->lvExactSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lvaTable[shadowVar].lvExactSize = varDsc->lvExactSize; [](start = 7, length = 55)
This is already under above under #ifdef FEATURE_SIMD.
In fact initializing lvExactSize for shadow vars other than those known to be SIMD has caused a gc-hole.
Why do we need this change, btw?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that Pat added the line you refer to long after I added mine (when I started working on this), so it's probably not needed now.
src/jit/lowerxarch.cpp
Outdated
info->internalFloatCount = 1; | ||
info->setInternalCandidates(lsra, lsra->allSIMDRegs()); | ||
info->isInternalRegDelayFree = true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
else { info->srcCount = 0 } ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/jit/lowerxarch.cpp
Outdated
else if (tree->OperGet() == GT_SIMD) | ||
{ | ||
GenTreeSIMD* simdNode = tree->AsSIMD(); | ||
var_types baseType = simdNode->gtSIMDBaseType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var_types baseType = simdNode->gtSIMDBaseType; [](start = 8, length = 49)
Minor: this variable definition could be moved closer to the point of its use in the block starting at line 4518
src/jit/lowerxarch.cpp
Outdated
{ | ||
assert(baseSize == 2); | ||
ZeroOrSignExtnReqd = (baseType == TYP_SHORT); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be a good idea to add comments to explain the rationale behind this logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/jit/decomposelongs.cpp
Outdated
case SIMDIntrinsicInit: | ||
case SIMDIntrinsicInitN: | ||
case SIMDIntrinsicInitArray: | ||
NYI("SIMDIntrinsicInit*"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NYI("SIMDIntrinsicInit*"); [](start = 12, length = 26)
You might want to file a tracking git issue to address this NYI later
src/jit/decomposelongs.cpp
Outdated
// return: GT_LONG(lowResult, highResult) | ||
// | ||
// This isn't optimal codegen, since SIMDIntrinsicGetItem sometimes requires | ||
// temps that could be shared, for example. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is incomplete?
LIR::Use op2(Range(), &simdTree->gtOp.gtOp2, simdTree); | ||
unsigned indexTmpVarNum = op2.ReplaceWithLclVar(m_compiler, m_blockWeight); | ||
JITDUMP("[DecomposeSimdGetItem]: Saving op2 tree to a temp var:\n"); | ||
DISPTREERANGE(Range(), op2.Def()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the index is known to be a constant, we can avoid introducing temps.
If we intend to address it in a separate PR, please file a tracking issue and a TODO comment here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with some (minor, optional) suggestions.
@@ -562,13 +562,13 @@ CONFIG_DWORD_INFO_EX(INTERNAL_JitLoopHoistStats, W("JitLoopHoistStats"), 0, "Dis | |||
CONFIG_DWORD_INFO_EX(INTERNAL_JitDebugLogLoopCloning, W("JitDebugLogLoopCloning"), 0, "In debug builds log places where loop cloning optimizations are performed on the fast path.", CLRConfig::REGUTIL_default); | |||
CONFIG_DWORD_INFO_EX(INTERNAL_JitVNMapSelLimit, W("JitVNMapSelLimit"), 0, "If non-zero, assert if # of VNF_MapSelect applications considered reaches this", CLRConfig::REGUTIL_default) | |||
RETAIL_CONFIG_DWORD_INFO(INTERNAL_JitVNMapSelBudget, W("JitVNMapSelBudget"), 100, "Max # of MapSelect's considered for a particular top-level invocation.") | |||
#if defined(_TARGET_AMD64_) | |||
#if defined(_TARGET_AMD64_) || defined(_TARGET_X86_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why wouldn't we use TARGET_XARCH here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The JIT defines _TARGET_XARCH_
but the VM does not.
src/jit/decomposelongs.cpp
Outdated
case SIMDIntrinsicInit: | ||
case SIMDIntrinsicInitN: | ||
case SIMDIntrinsicInitArray: | ||
NYI("SIMDIntrinsicInit*"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intended to be a temporary NYI? Perhaps you could open an issue for it (if you haven't already).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it appears that Init is handled without decomposition. Shouldn't this just default to the noway_assert
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the NYI. At first I thought these would need decomp, but they don't return LONG values, so don't need it.
src/jit/decomposelongs.cpp
Outdated
// | ||
// This isn't optimal codegen, since SIMDIntrinsicGetItem sometimes requires | ||
// temps that could be shared, for example. | ||
// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another possible issue to open?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I could open a general, "evaluate x86 simd CQ" item, and/or one specific to long get[i].
src/jit/decomposelongs.cpp
Outdated
|
||
GenTree* simdTmpVar1 = m_compiler->gtNewLclLNode(simdTmpVarNum, simdTree->gtOp.gtOp1->gtType); | ||
GenTree* indexTmpVar1 = m_compiler->gtNewLclLNode(indexTmpVarNum, TYP_INT); | ||
GenTree* two1 = m_compiler->gtNewIconNode(2, TYP_INT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor suggestion: A comment here would be nice. It took me a moment to recall why we would need a constant 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -26,7 +26,7 @@ public static bool CheckValue<T>(T value, T expectedValue) | |||
} | |||
if (returnVal == false) | |||
{ | |||
Console.WriteLine("CheckValue failed for " + expectedValue + " of type " + typeof(T).ToString()); | |||
Console.WriteLine("CheckValue failed for type " + typeof(T).ToString() + ". Expected: {0} (0x{0:X}), Got: {1} (0x{1:X})", expectedValue, value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. thanks.
817ec7f
to
b38b5ec
Compare
This change implements support for Vector<long>, handling SIMDIntrinsicInit, which takes a LONG, and decomposition of SIMDIntrinsicGetItem, which produces a LONG. It also enables SIMD, including AVX, by default for RyuJIT/x86.
b38b5ec
to
cacb796
Compare
@dotnet-bot test Windows_NT x86 Checked jitstress1 |
Enable SIMD for RyuJIT/x86 Commit migrated from dotnet/coreclr@d05009a
This change implements support for
Vector<long>
, handlingSIMDIntrinsicInit, which takes a LONG, and decomposition of
SIMDIntrinsicGetItem, which produces a LONG.
It also enables SIMD, including AVX, by default for RyuJIT/x86.