This repository was archived by the owner on Jan 23, 2023. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
[WIP] Further improve performance of Memory<T>.Span property getter #22829
Closed
Closed
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,6 +3,7 @@ | |
// See the LICENSE file in the project root for more information. | ||
|
||
using System.Runtime.CompilerServices; | ||
using System.Runtime.InteropServices; | ||
|
||
namespace System.Buffers | ||
{ | ||
|
@@ -11,6 +12,19 @@ namespace System.Buffers | |
/// </summary> | ||
public abstract class MemoryManager<T> : IMemoryOwner<T>, IPinnable | ||
{ | ||
#pragma warning disable CS0414 | ||
// It's imperative that this be the first field in the MemoryManager<T> instance | ||
// and that it have a negative value. Because it's the first field in the object, | ||
// it occupies the same position as String.Length and SzArray.Length; but because | ||
// this value is negative and those values are not, we can easily determine whether | ||
// any given Memory<T> is backed by a variable-length object (string, array) or | ||
// a fixed-length object (MemoryManager). | ||
private readonly int _dummy = -1; | ||
#if BIT64 | ||
private readonly int _dummyPadding = 0; | ||
#endif | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Rather than an ifdef with padding, could this just be a -1 IntPtr? |
||
#pragma warning restore CS0414 | ||
|
||
/// <summary> | ||
/// Returns a <see cref="System.Memory{T}"/>. | ||
/// </summary> | ||
|
@@ -21,6 +35,24 @@ public abstract class MemoryManager<T> : IMemoryOwner<T>, IPinnable | |
/// </summary> | ||
public abstract Span<T> GetSpan(); | ||
|
||
/// <summary> | ||
/// A special helper method which serves as a stub for the call to GetSpan(). It's | ||
/// non-virtual and forbids inlining to keep the call site as compact as possible. | ||
/// Additionally, since the span is deconstructed rather than returned by value, | ||
/// it means the caller only has to allocate space for a single Int32 on the stack. | ||
/// </summary> | ||
[MethodImpl(MethodImplOptions.NoInlining)] | ||
internal ByReference<T> GetSpanAndDeconstruct(out int spanLength) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not convinced it's ok to penalize the MemoryManager case in this way. |
||
{ | ||
// Returns ByReference<T> because the C# compiler sees the 'out' parameter and | ||
// reasons that returning 'ref T' might lead to a reference escaping a local | ||
// scope. The ByReference<T> return type prevents this false positive error. | ||
|
||
Span<T> span = GetSpan(); | ||
spanLength = span.Length; | ||
return new ByReference<T>(ref MemoryMarshal.GetReference(span)); | ||
} | ||
|
||
/// <summary> | ||
/// Returns a handle to the memory that has been pinned and hence its address can be taken. | ||
/// </summary> | ||
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't the runtime decide to rearrange the fields in this class?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we make it a single IntPtr field per your suggestion this should address the rearrangement concern within the base class. The runtime cannot rearrange fields between a base class and its derived types, so we're good there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still, this padding is penalizing
MemoryManager
path to make the other path a bit faster. And it is pretty hacky change. I do not think this is a good tradeoff.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had believed the most common usage of
Memory<T>
was to be backed byString
orSzArray
. (In fact, I had believed this to be overwhelmingly so.) If this assumption was invalid, then yeah, we shouldn't take the bulk of this PR.There is still one thing we can take advantage of if we wish: in 32-bit procs,
String
andSzArray
have the exact same layout. This means we could condition theif (typeof(T) == typeof(char))
check on 64-bit only, which should give a perf boost to 32-bit procs.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's true for workloads like those in ASP.NET. But I expect that won't be true for certain apps or domains, in particular those that interop with native memory as the lingua franca. The improvements cited for the strings/arrays cases appears to generally be < 10%, whereas the regression for the MemoryManager case looks to be closer to ~40%.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are three very specific improvements this PR provides when compared to baseline: (a) improving the performance for String/SzArray, (b) reducing the codegen size by around ~20%, and (c) reducing stack spillage in the calling method due to needing to have temporary Span instances on the stack.
The trade-off for gaining (b) and (c) was to introduce the indirection in the MemoryManager code path. (And remember: even with this regression we're still within 5% of 2.1 RTM since we already made substantial improvements in this code path between 2.1 and 3.0 preview earlier.)
It is possible to make this PR so that we give up (b) and (c). This will improve perf for all three cases: String / SzArray / and MemoryManager. The trade-off is that we reduce the codegen size by only an insignificant amount (~3 - 5%), and we'd still have the dummy
-1
field in MemoryManager.If this still isn't an appropriate trade-off, that's fine. We can just close the PR - no harm. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose I shouldn't have said "only an insignificant amount (~3 - 5%)" above. Since this property getter is marked with aggressive inlining it's not unreasonable to think it'd be expanded multiple times in the calling method, so even small improvements would add up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @GrabYourPitchforks. If I'm understanding you correctly, you're saying there's a version of this change that improves the throughout for strings and arrays equivalent to this one, improves rather than regresses the throughput for MemoryManager, and still shrinks code size some albeit not as much as in this change? That sounds better to me, personally.