Skip to content

Adding public API for Pinned Object Heap allocations #33526

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
13 commits merged into from
Mar 18, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 0 additions & 2 deletions src/coreclr/src/System.Private.CoreLib/ILLinkTrim.xml
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,6 @@
<!-- Methods are used to register and unregister frozen segments. They are private and experimental. -->
<method name="_RegisterFrozenSegment" />
<method name="_UnregisterFrozenSegment" />
<!-- This is an internal API for now and is not yet used outside tests. -->
<method name="AllocateUninitializedArray" />
</type>
<!-- Properties and methods used by a debugger. -->
<type fullname="System.Threading.Tasks.Task">
Expand Down
81 changes: 66 additions & 15 deletions src/coreclr/src/System.Private.CoreLib/src/System/GC.cs
Original file line number Diff line number Diff line change
Expand Up @@ -84,8 +84,16 @@ public static GCMemoryInfo GetGCMemoryInfo()
[DllImport(RuntimeHelpers.QCall, CharSet = CharSet.Unicode)]
internal static extern int _EndNoGCRegion();

// keep in sync with GC_ALLOC_FLAGS in gcinterface.h
internal enum GC_ALLOC_FLAGS
{
GC_ALLOC_NO_FLAGS = 0,
GC_ALLOC_ZEROING_OPTIONAL = 16,
GC_ALLOC_PINNED_OBJECT_HEAP = 64,
};

[MethodImpl(MethodImplOptions.InternalCall)]
internal static extern Array AllocateNewArray(IntPtr typeHandle, int length, bool zeroingOptional);
internal static extern Array AllocateNewArray(IntPtr typeHandle, int length, GC_ALLOC_FLAGS flags);

[MethodImpl(MethodImplOptions.InternalCall)]
private static extern int GetGenerationWR(IntPtr handle);
Expand Down Expand Up @@ -651,31 +659,74 @@ internal static void UnregisterMemoryLoadChangeNotification(Action notification)
}

/// <summary>
/// Skips zero-initialization of the array if possible.
/// If T contains object references, the array is always zero-initialized.
/// Allocate an array while skipping zero-initialization if possible.
/// </summary>
/// <typeparam name="T">Specifies the type of the array element.</typeparam>
/// <param name="length">Specifies the length of the array.</param>
/// <param name="pinned">Specifies whether the allocated array must be pinned.</param>
/// <remarks>
/// If pinned is set to true, <typeparamref name="T"/> must not be a reference type or a type that contains object references.
/// </remarks>
[MethodImpl(MethodImplOptions.AggressiveInlining)] // forced to ensure no perf drop for small memory buffers (hot path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the additional checks around the pinned argument get removed when the caller uses a const value at the call site / the default argument value, or might the optional parameter end up impacting the perf of existing call sites?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if AggressiveInlining works, then the pinned should be const-propagated and related blocks omitted. I did not check in debugger, but this seems a very easy case.


In reply to: 392758813 [](ancestors = 392758813)

internal static T[] AllocateUninitializedArray<T>(int length)
public static T[] AllocateUninitializedArray<T>(int length, bool pinned = false)
{
if (RuntimeHelpers.IsReferenceOrContainsReferences<T>())
if (!pinned)
{
return new T[length];
}
if (RuntimeHelpers.IsReferenceOrContainsReferences<T>())
{
return new T[length];
}

// for debug builds we always want to call AllocateNewArray to detect AllocateNewArray bugs
// for debug builds we always want to call AllocateNewArray to detect AllocateNewArray bugs
#if !DEBUG
// small arrays are allocated using `new[]` as that is generally faster.
if (length < 2048 / Unsafe.SizeOf<T>())
// small arrays are allocated using `new[]` as that is generally faster.
if (length < 2048 / Unsafe.SizeOf<T>())
{
return new T[length];
}
#endif
}
else if (RuntimeHelpers.IsReferenceOrContainsReferences<T>())
{
return new T[length];
ThrowHelper.ThrowInvalidTypeWithPointersNotSupported(typeof(T));
}
#endif

// kept outside of the small arrays hot path to have inlining without big size growth
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the above else if block then also be moved into the AllocateNewUninitializedArray method, to keep it out of the inlinling path?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The shape of this method was tweaked to make the case when we fallback to T[] fast. That is only relevant if pinned == false. (example: look at use in StringBuilder).
I expect that this entire block will be omitted in such case. That is assuming the method does get inlined, which it should.


In reply to: 392757230 [](ancestors = 392757230)

return AllocateNewUninitializedArray(length);
return AllocateNewUninitializedArray(length, pinned);

// remove the local function when https://github.com/dotnet/coreclr/issues/5329 is implemented
static T[] AllocateNewUninitializedArray(int length)
=> Unsafe.As<T[]>(AllocateNewArray(typeof(T[]).TypeHandle.Value, length, zeroingOptional: true));
static T[] AllocateNewUninitializedArray(int length, bool pinned)
{
GC_ALLOC_FLAGS flags = GC_ALLOC_FLAGS.GC_ALLOC_ZEROING_OPTIONAL;
if (pinned)
flags |= GC_ALLOC_FLAGS.GC_ALLOC_PINNED_OBJECT_HEAP;

return Unsafe.As<T[]>(AllocateNewArray(typeof(T[]).TypeHandle.Value, length, flags));
}
}

/// <summary>
/// Allocate an array.
/// </summary>
/// <typeparam name="T">Specifies the type of the array element.</typeparam>
/// <param name="length">Specifies the length of the array.</param>
/// <param name="pinned">Specifies whether the allocated array must be pinned.</param>
/// <remarks>
/// If pinned is set to true, <typeparamref name="T"/> must not be a reference type or a type that contains object references.
/// </remarks>
public static T[] AllocateArray<T>(int length, bool pinned = false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't be better to have this as AllocatePinnedArray<T> only? What is the difference between new object[10] and AllocateArray<object>(10) ?

Copy link
Member Author

@VSadov VSadov Mar 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are plans to add more overloads to this API, possibly to control target generation or alignment.

Indeed, both times when this was reviewed, it was noticed that the trivial case is the same as 'new T[]`, but it seems to be ok and happens with other APIs that have optional parameters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems odd to me that every time someone uses this API one has to type the explicit parameter to make the code readable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect that this API is going to be used very rarely. The more useful API build on top this is going to be a pool for the pinned arrays. The API for that was not designed yet. @VSadov Is it in your plans - what do you expect the pool API to be?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguments for going it this way were convincing. When there are more arguments, a new overload will be introduced with 2 optional parameters, while this one becomes not optional and so on...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the pool - I think from the usability point, it might be preferable to just add GetPinned to the current pool API.
Alternatively we could have a separate pool for pinned arrays, but then it puts a burden on the user to remember where to return.

{
GC_ALLOC_FLAGS flags = GC_ALLOC_FLAGS.GC_ALLOC_NO_FLAGS;

if (pinned)
{
if (RuntimeHelpers.IsReferenceOrContainsReferences<T>())
ThrowHelper.ThrowInvalidTypeWithPointersNotSupported(typeof(T));

flags = GC_ALLOC_FLAGS.GC_ALLOC_PINNED_OBJECT_HEAP;
}

return Unsafe.As<T[]>(AllocateNewArray(typeof(T[]).TypeHandle.Value, length, flags));
}
}
}
9 changes: 6 additions & 3 deletions src/coreclr/src/gc/gc.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -12198,7 +12198,7 @@ void gc_heap::adjust_limit_clr (uint8_t* start, size_t limit_size, size_t size,
uint8_t* obj_start = acontext->alloc_ptr;
assert(start >= obj_start);
uint8_t* obj_end = obj_start + size - plug_skew;
assert(obj_end > clear_start);
assert(obj_end >= clear_start);

// if clearing at the object start, clear the syncblock.
if(obj_start == start)
Expand Down Expand Up @@ -37204,7 +37204,9 @@ GCHeap::Alloc(gc_alloc_context* context, size_t size, uint32_t flags REQD_ALIGN_
#endif //_PREFAST_
#endif //MULTIPLE_HEAPS

if (size >= loh_size_threshold || (flags & GC_ALLOC_LARGE_OBJECT_HEAP))
assert(size < loh_size_threshold || (flags & GC_ALLOC_LARGE_OBJECT_HEAP));

if (flags & GC_ALLOC_USER_OLD_HEAP)
{
// The LOH always guarantees at least 8-byte alignment, regardless of platform. Moreover it doesn't
// support mis-aligned object headers so we can't support biased headers. Luckily for us
Expand All @@ -37213,7 +37215,8 @@ GCHeap::Alloc(gc_alloc_context* context, size_t size, uint32_t flags REQD_ALIGN_
ASSERT((flags & GC_ALLOC_ALIGN8_BIAS) == 0);
ASSERT(65536 < loh_size_threshold);

newAlloc = (Object*) hp->allocate_uoh_object (size + ComputeMaxStructAlignPadLarge(requiredAlignment), flags, loh_generation, acontext->alloc_bytes_uoh);
int gen_num = (flags & GC_ALLOC_PINNED_OBJECT_HEAP) ? poh_generation : loh_generation;
newAlloc = (Object*) hp->allocate_uoh_object (size + ComputeMaxStructAlignPadLarge(requiredAlignment), flags, gen_num, acontext->alloc_bytes_uoh);
ASSERT(((size_t)newAlloc & 7) == 0);

#ifdef FEATURE_STRUCTALIGN
Expand Down
3 changes: 2 additions & 1 deletion src/coreclr/src/gc/gcinterface.h
Original file line number Diff line number Diff line change
Expand Up @@ -890,7 +890,7 @@ void updateGCShadow(Object** ptr, Object* val);
#define GC_CALL_INTERIOR 0x1
#define GC_CALL_PINNED 0x2

//flags for IGCHeapAlloc(...)
// keep in sync with GC_ALLOC_FLAGS in GC.cs
enum GC_ALLOC_FLAGS
{
GC_ALLOC_NO_FLAGS = 0,
Expand All @@ -901,6 +901,7 @@ enum GC_ALLOC_FLAGS
GC_ALLOC_ZEROING_OPTIONAL = 16,
GC_ALLOC_LARGE_OBJECT_HEAP = 32,
GC_ALLOC_PINNED_OBJECT_HEAP = 64,
GC_ALLOC_USER_OLD_HEAP = GC_ALLOC_LARGE_OBJECT_HEAP | GC_ALLOC_PINNED_OBJECT_HEAP,
};

inline GC_ALLOC_FLAGS operator|(GC_ALLOC_FLAGS a, GC_ALLOC_FLAGS b)
Expand Down
7 changes: 5 additions & 2 deletions src/coreclr/src/vm/comutilnative.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1047,7 +1047,7 @@ FCIMPLEND
** zeroingOptional -> whether caller prefers to skip clearing the content of the array, if possible.
**Exceptions: IDS_EE_ARRAY_DIMENSIONS_EXCEEDED when size is too large. OOM if can't allocate.
==============================================================================*/
FCIMPL3(Object*, GCInterface::AllocateNewArray, void* arrayTypeHandle, INT32 length, CLR_BOOL zeroingOptional)
FCIMPL3(Object*, GCInterface::AllocateNewArray, void* arrayTypeHandle, INT32 length, INT32 flags)
{
CONTRACTL {
FCALL_CHECK;
Expand All @@ -1058,7 +1058,10 @@ FCIMPL3(Object*, GCInterface::AllocateNewArray, void* arrayTypeHandle, INT32 len

HELPER_METHOD_FRAME_BEGIN_RET_0();

pRet = AllocateSzArray(arrayType, length, zeroingOptional ? GC_ALLOC_ZEROING_OPTIONAL : GC_ALLOC_NO_FLAGS);
//Only the following flags are used by GC.cs, so we'll just assert it here.
_ASSERTE((flags & ~(GC_ALLOC_ZEROING_OPTIONAL | GC_ALLOC_PINNED_OBJECT_HEAP)) == 0);

pRet = AllocateSzArray(arrayType, length, (GC_ALLOC_FLAGS)flags);

HELPER_METHOD_FRAME_END();

Expand Down
2 changes: 1 addition & 1 deletion src/coreclr/src/vm/comutilnative.h
Original file line number Diff line number Diff line change
Expand Up @@ -128,7 +128,7 @@ class GCInterface {
static FCDECL0(INT64, GetAllocatedBytesForCurrentThread);
static FCDECL1(INT64, GetTotalAllocatedBytes, CLR_BOOL precise);

static FCDECL3(Object*, AllocateNewArray, void* elementTypeHandle, INT32 length, CLR_BOOL zeroingOptional);
static FCDECL3(Object*, AllocateNewArray, void* elementTypeHandle, INT32 length, INT32 flags);

#ifdef FEATURE_BASICFREEZE
static
Expand Down
Loading