
Commit

Change the ReciprocalEstimate and ReciprocalSqrtEstimate APIs to be mustExpand on RyuJIT (dotnet#102098)

* Change the ReciprocalEstimate and ReciprocalSqrtEstimate APIs to be mustExpand on RyuJIT

* Apply formatting patch

* Fix the RV64 and LA64 builds

* Mark the ReciprocalEstimate and ReciprocalSqrtEstimate methods as AggressiveOptimization to bypass R2R

* Mark other usages of ReciprocalEstimate and ReciprocalSqrtEstimate in Corelib with AggressiveOptimization

* Mark several non-deterministic APIs as BypassReadyToRun and skip intrinsic expansion in R2R

* Clean up based on PR recommendations to rely on the runtime rather than attribution of non-deterministic intrinsics

* Add a regression test ensuring that direct and indirect invocation of non-deterministic intrinsic APIs return the same result

* Add a note about non-deterministic intrinsic expansion to the botr

* Apply formatting patch

* Ensure vector tests are correctly validating against the scalar implementation

* Fix the JIT/SIMD/VectorConvert test and workaround a 32-bit test issue

* Skip a test on Mono due to a known/tracked issue

* Ensure that lowering on Arm64 doesn't make an assumption about cast shapes

* Ensure the tier0opts local is used

* Ensure impEstimateIntrinsic bails out for APIs that need to be implemented as user calls
tannergooding authored and Ruihan-Yin committed May 30, 2024
1 parent 6905408 commit ae11fbb
Showing 22 changed files with 639 additions and 284 deletions.
8 changes: 8 additions & 0 deletions docs/design/coreclr/botr/vectors-and-intrinsics.md
@@ -156,6 +156,14 @@ private void SomeVectorizationHelper()
}
```

#### Non-Deterministic Intrinsics in System.Private.Corelib

Some APIs exposed in System.Private.Corelib are intentionally non-deterministic across hardware and instead only ensure determinism within the scope of a single process. To facilitate support for such APIs, the JIT defines `Compiler::BlockNonDeterministicIntrinsics(bool mustExpand)`, which should be used to block such APIs from expanding in scenarios such as ReadyToRun. Additionally, such APIs should recursively call themselves so that indirect invocation (such as via a delegate, function pointer, or reflection) will compute the same result.
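
A minimal sketch of what that guarantee means for callers (hypothetical demo code, not part of this change; it assumes a runtime where `float.ReciprocalSqrtEstimate` is exposed and expanded as described above):

```csharp
using System;

class NonDeterministicIntrinsicConsistency
{
    static void Main()
    {
        float x = 2.0f;

        // Direct call: the JIT expands the intrinsic at this call site.
        float direct = float.ReciprocalSqrtEstimate(x);

        // Indirect call: the delegate invokes the compiled method body, which
        // recursively calls itself; that inner call is expanded to the same
        // instruction, so both paths agree within this process.
        Func<float, float> viaDelegate = float.ReciprocalSqrtEstimate;
        float indirect = viaDelegate(x);

        Console.WriteLine(direct == indirect); // expected: True
        // Only in-process consistency is guaranteed; the value itself may
        // differ on different hardware.
    }
}
```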

An example of such a non-deterministic API is the `ConvertToIntegerNative` family of APIs exposed on `System.Single` and `System.Double`. These APIs convert from the source value to the target integer type using the fastest mechanism available on the underlying hardware. They exist because the IEEE 754 specification leaves the conversion undefined when the input cannot fit into the output (for example, converting `float.MaxValue` to `int`), and different hardware has historically provided differing behaviors for these edge cases. They let developers who do not need to be concerned with edge-case handling, and for whom the performance overhead of normalizing results in the default cast operator is too great, opt into the faster hardware behavior.
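
As a hedged usage sketch (assuming .NET 9 or later, where `ConvertToIntegerNative<TInteger>` is exposed with this generic shape on `System.Single`):

```csharp
using System;

class NativeConversionExample
{
    static void Main()
    {
        float value = float.MaxValue;

        // The default cast pays the cost of normalizing the out-of-range input
        // to a well-defined result on every platform.
        int normalized = (int)value;

        // ConvertToIntegerNative uses whatever conversion the hardware does
        // fastest; for out-of-range inputs the result is consistent within a
        // process but may differ across architectures.
        int native = float.ConvertToIntegerNative<int>(value);

        Console.WriteLine($"cast: {normalized}, native: {native}");
    }
}
```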

Another example is the various `*Estimate` APIs, such as `float.ReciprocalSqrtEstimate`. These APIs allow a user to likewise opt into a faster result at the cost of some inaccuracy, where the exact inaccuracy encountered depends on the input and the underlying hardware the instruction is executed against.
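
For illustration, a small sketch of that accuracy trade-off (hypothetical demo code; the exact error observed depends on the input and on the hardware the code runs on):

```csharp
using System;

class EstimateAccuracyExample
{
    static void Main()
    {
        float x = 3.0f;

        float exact = 1.0f / MathF.Sqrt(x);
        float estimate = float.ReciprocalSqrtEstimate(x);

        // The estimate trades accuracy for speed. On hardware with a dedicated
        // estimate instruction the result is only approximate; on platforms
        // without one, a software fallback may return the exact value.
        float relativeError = MathF.Abs(estimate - exact) / exact;
        Console.WriteLine($"exact: {exact}, estimate: {estimate}, relative error: {relativeError}");
    }
}
```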

# Mechanisms in the JIT to generate correct code to handle varied instruction set support

The JIT receives flags which instruct it on what instruction sets are valid to use, and has access to a new jit interface api `notifyInstructionSetUsage(isa, bool supportBehaviorRequired)`.
30 changes: 25 additions & 5 deletions src/coreclr/jit/compiler.h
@@ -4530,7 +4530,7 @@ class Compiler
CORINFO_SIG_INFO* sig,
CorInfoType callJitType,
NamedIntrinsic intrinsicName,
bool tailCall);
bool mustExpand);
GenTree* impMathIntrinsic(CORINFO_METHOD_HANDLE method,
CORINFO_SIG_INFO* sig,
var_types callType,
@@ -4561,7 +4561,8 @@ class Compiler
GenTree* impPrimitiveNamedIntrinsic(NamedIntrinsic intrinsic,
CORINFO_CLASS_HANDLE clsHnd,
CORINFO_METHOD_HANDLE method,
CORINFO_SIG_INFO* sig);
CORINFO_SIG_INFO* sig,
bool mustExpand);

#ifdef FEATURE_HW_INTRINSICS
GenTree* impHWIntrinsic(NamedIntrinsic intrinsic,
@@ -4573,7 +4574,8 @@
CORINFO_CLASS_HANDLE clsHnd,
CORINFO_METHOD_HANDLE method,
CORINFO_SIG_INFO* sig,
GenTree* newobjThis);
GenTree* newobjThis,
bool mustExpand);

protected:
bool compSupportsHWIntrinsic(CORINFO_InstructionSet isa);
@@ -4584,15 +4586,17 @@ class Compiler
var_types retType,
CorInfoType simdBaseJitType,
unsigned simdSize,
GenTree* newobjThis);
GenTree* newobjThis,
bool mustExpand);

GenTree* impSpecialIntrinsic(NamedIntrinsic intrinsic,
CORINFO_CLASS_HANDLE clsHnd,
CORINFO_METHOD_HANDLE method,
CORINFO_SIG_INFO* sig,
CorInfoType simdBaseJitType,
var_types retType,
unsigned simdSize);
unsigned simdSize,
bool mustExpand);

GenTree* getArgForHWIntrinsic(var_types argType,
CORINFO_CLASS_HANDLE argClass,
@@ -8272,6 +8276,22 @@ class Compiler
return eeGetEEInfo()->targetAbi == abi;
}

bool BlockNonDeterministicIntrinsics(bool mustExpand)
{
    // We explicitly block these APIs from being expanded in R2R
    // since we know they are non-deterministic across hardware

    if (opts.IsReadyToRun() && !IsTargetAbi(CORINFO_NATIVEAOT_ABI))
    {
        if (mustExpand)
        {
            implLimitation();
        }
        return true;
    }
    return false;
}

#if defined(FEATURE_EH_WINDOWS_X86)
bool eeIsNativeAotAbi;
bool UsesFunclets() const
3 changes: 2 additions & 1 deletion src/coreclr/jit/hwintrinsic.cpp
@@ -1577,7 +1577,8 @@ GenTree* Compiler::impHWIntrinsic(NamedIntrinsic intrinsic,
}
else
{
retNode = impSpecialIntrinsic(intrinsic, clsHnd, method, sig, simdBaseJitType, nodeRetType, simdSize);
retNode =
impSpecialIntrinsic(intrinsic, clsHnd, method, sig, simdBaseJitType, nodeRetType, simdSize, mustExpand);
}

#if defined(TARGET_ARM64)
52 changes: 43 additions & 9 deletions src/coreclr/jit/hwintrinsicarm64.cpp
@@ -473,6 +473,7 @@ GenTree* Compiler::impNonConstFallback(NamedIntrinsic intrinsic, var_types simdT
// sig -- signature of the intrinsic call.
// simdBaseJitType -- generic argument of the intrinsic.
// retType -- return type of the intrinsic.
// mustExpand -- true if the intrinsic must return a GenTree*; otherwise, false
//
// Return Value:
// The GT_HWINTRINSIC node, or nullptr if not a supported intrinsic
@@ -483,7 +484,8 @@ GenTree* Compiler::impSpecialIntrinsic(NamedIntrinsic intrinsic,
CORINFO_SIG_INFO* sig,
CorInfoType simdBaseJitType,
var_types retType,
unsigned simdSize)
unsigned simdSize,
bool mustExpand)
{
const HWIntrinsicCategory category = HWIntrinsicInfo::lookupCategory(intrinsic);
const int numArgs = sig->numArgs;
@@ -762,10 +764,18 @@ GenTree* Compiler::impSpecialIntrinsic(NamedIntrinsic intrinsic,
break;
}

case NI_Vector64_ConvertToInt32:
case NI_Vector64_ConvertToInt32Native:
case NI_Vector128_ConvertToInt32:
case NI_Vector128_ConvertToInt32Native:
{
if (BlockNonDeterministicIntrinsics(mustExpand))
{
break;
}
FALLTHROUGH;
}

case NI_Vector64_ConvertToInt32:
case NI_Vector128_ConvertToInt32:
{
assert(sig->numArgs == 1);
assert(simdBaseType == TYP_FLOAT);
@@ -775,10 +785,18 @@ GenTree* Compiler::impSpecialIntrinsic(NamedIntrinsic intrinsic,
break;
}

case NI_Vector64_ConvertToInt64:
case NI_Vector64_ConvertToInt64Native:
case NI_Vector128_ConvertToInt64:
case NI_Vector128_ConvertToInt64Native:
{
if (BlockNonDeterministicIntrinsics(mustExpand))
{
break;
}
FALLTHROUGH;
}

case NI_Vector64_ConvertToInt64:
case NI_Vector128_ConvertToInt64:
{
assert(sig->numArgs == 1);
assert(simdBaseType == TYP_DOUBLE);
@@ -799,10 +817,18 @@ GenTree* Compiler::impSpecialIntrinsic(NamedIntrinsic intrinsic,
break;
}

case NI_Vector64_ConvertToUInt32:
case NI_Vector64_ConvertToUInt32Native:
case NI_Vector128_ConvertToUInt32:
case NI_Vector128_ConvertToUInt32Native:
{
if (BlockNonDeterministicIntrinsics(mustExpand))
{
break;
}
FALLTHROUGH;
}

case NI_Vector64_ConvertToUInt32:
case NI_Vector128_ConvertToUInt32:
{
assert(sig->numArgs == 1);
assert(simdBaseType == TYP_FLOAT);
@@ -812,10 +838,18 @@ GenTree* Compiler::impSpecialIntrinsic(NamedIntrinsic intrinsic,
break;
}

case NI_Vector64_ConvertToUInt64:
case NI_Vector64_ConvertToUInt64Native:
case NI_Vector128_ConvertToUInt64:
case NI_Vector128_ConvertToUInt64Native:
{
if (BlockNonDeterministicIntrinsics(mustExpand))
{
break;
}
FALLTHROUGH;
}

case NI_Vector64_ConvertToUInt64:
case NI_Vector128_ConvertToUInt64:
{
assert(sig->numArgs == 1);
assert(simdBaseType == TYP_DOUBLE);
42 changes: 33 additions & 9 deletions src/coreclr/jit/hwintrinsicxarch.cpp
@@ -957,6 +957,8 @@ GenTree* Compiler::impNonConstFallback(NamedIntrinsic intrinsic, var_types simdT
// sig -- signature of the intrinsic call.
// simdBaseJitType -- generic argument of the intrinsic.
// retType -- return type of the intrinsic.
// mustExpand -- true if the intrinsic must return a GenTree*; otherwise, false
//
// Return Value:
// the expanded intrinsic.
//
@@ -966,7 +968,8 @@ GenTree* Compiler::impSpecialIntrinsic(NamedIntrinsic intrinsic,
CORINFO_SIG_INFO* sig,
CorInfoType simdBaseJitType,
var_types retType,
unsigned simdSize)
unsigned simdSize,
bool mustExpand)
{
GenTree* retNode = nullptr;
GenTree* op1 = nullptr;
@@ -1098,7 +1101,7 @@ GenTree* Compiler::impSpecialIntrinsic(NamedIntrinsic intrinsic,
{
// Vector<T> is TYP_SIMD32, so we should treat this as a call to Vector128.ToVector256
return impSpecialIntrinsic(NI_Vector128_ToVector256, clsHnd, method, sig, simdBaseJitType, retType,
simdSize);
simdSize, mustExpand);
}
else if (vectorTByteLength == XMM_REGSIZE_BYTES)
{
@@ -1208,7 +1211,7 @@ GenTree* Compiler::impSpecialIntrinsic(NamedIntrinsic intrinsic,
{
// Vector<T> is TYP_SIMD32, so we should treat this as a call to Vector256.GetLower
return impSpecialIntrinsic(NI_Vector256_GetLower, clsHnd, method, sig, simdBaseJitType, retType,
simdSize);
simdSize, mustExpand);
}

default:
@@ -1248,13 +1251,13 @@ GenTree* Compiler::impSpecialIntrinsic(NamedIntrinsic intrinsic,
if (intrinsic == NI_Vector256_AsVector)
{
return impSpecialIntrinsic(NI_Vector256_GetLower, clsHnd, method, sig, simdBaseJitType, retType,
simdSize);
simdSize, mustExpand);
}
else
{
assert(intrinsic == NI_Vector256_AsVector256);
return impSpecialIntrinsic(NI_Vector128_ToVector256, clsHnd, method, sig, simdBaseJitType,
retType, 16);
retType, 16, mustExpand);
}
}
}
@@ -1281,13 +1284,13 @@ GenTree* Compiler::impSpecialIntrinsic(NamedIntrinsic intrinsic,
if (intrinsic == NI_Vector512_AsVector)
{
return impSpecialIntrinsic(NI_Vector512_GetLower, clsHnd, method, sig, simdBaseJitType, retType,
simdSize);
simdSize, mustExpand);
}
else
{
assert(intrinsic == NI_Vector512_AsVector512);
return impSpecialIntrinsic(NI_Vector256_ToVector512, clsHnd, method, sig, simdBaseJitType, retType,
32);
32, mustExpand);
}
break;
}
@@ -1301,13 +1304,13 @@ GenTree* Compiler::impSpecialIntrinsic(NamedIntrinsic intrinsic,
if (intrinsic == NI_Vector512_AsVector)
{
return impSpecialIntrinsic(NI_Vector512_GetLower128, clsHnd, method, sig, simdBaseJitType,
retType, simdSize);
retType, simdSize, mustExpand);
}
else
{
assert(intrinsic == NI_Vector512_AsVector512);
return impSpecialIntrinsic(NI_Vector128_ToVector512, clsHnd, method, sig, simdBaseJitType,
retType, 16);
retType, 16, mustExpand);
}
}
}
@@ -1436,6 +1439,11 @@ GenTree* Compiler::impSpecialIntrinsic(NamedIntrinsic intrinsic,
assert(sig->numArgs == 1);
assert(simdBaseType == TYP_FLOAT);

if (BlockNonDeterministicIntrinsics(mustExpand))
{
break;
}

op1 = impSIMDPopStack();
retNode = gtNewSimdCvtNativeNode(retType, op1, CORINFO_TYPE_INT, simdBaseJitType, simdSize);
break;
@@ -1463,6 +1471,11 @@ GenTree* Compiler::impSpecialIntrinsic(NamedIntrinsic intrinsic,
assert(sig->numArgs == 1);
assert(simdBaseType == TYP_DOUBLE);

if (BlockNonDeterministicIntrinsics(mustExpand))
{
break;
}

if (IsBaselineVector512IsaSupportedOpportunistically())
{
op1 = impSIMDPopStack();
@@ -1542,6 +1555,11 @@ GenTree* Compiler::impSpecialIntrinsic(NamedIntrinsic intrinsic,
assert(sig->numArgs == 1);
assert(simdBaseType == TYP_FLOAT);

if (BlockNonDeterministicIntrinsics(mustExpand))
{
break;
}

if (IsBaselineVector512IsaSupportedOpportunistically())
{
op1 = impSIMDPopStack();
@@ -1556,6 +1574,7 @@ GenTree* Compiler::impSpecialIntrinsic(NamedIntrinsic intrinsic,
{
assert(sig->numArgs == 1);
assert(simdBaseType == TYP_DOUBLE);

if (IsBaselineVector512IsaSupportedOpportunistically())
{
op1 = impSIMDPopStack();
@@ -1571,6 +1590,11 @@ GenTree* Compiler::impSpecialIntrinsic(NamedIntrinsic intrinsic,
assert(sig->numArgs == 1);
assert(simdBaseType == TYP_DOUBLE);

if (BlockNonDeterministicIntrinsics(mustExpand))
{
break;
}

if (IsBaselineVector512IsaSupportedOpportunistically())
{
op1 = impSIMDPopStack();