-
Notifications
You must be signed in to change notification settings - Fork 5k
Changed TensorSpan and ReadOnlyTensorSpan layout for better performance. #103244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
aa617ec
c4acef8
c02fd92
fb64487
35dc2cf
5053a67
63bc01d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Large diffs are not rendered by default.
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
// Licensed to the .NET Foundation under one or more agreements. | ||
// The .NET Foundation licenses this file to you under the MIT license. | ||
|
||
using System.Diagnostics; | ||
using System.Diagnostics.CodeAnalysis; | ||
using System.Runtime.CompilerServices; | ||
using System.Runtime.InteropServices; | ||
|
||
namespace System.Numerics.Tensors | ||
{ | ||
internal readonly struct TensorShape | ||
{ | ||
// Used to determine when we need to allocate a metadata array | ||
public const int MaxInlineArraySize = 5; | ||
|
||
// Used to determine when we can stack alloc for indexing vs when we need to allocate | ||
public const int MaxInlineRank = 8; | ||
|
||
internal readonly nint[]? _metadata; // 8 bytes | ||
|
||
internal readonly nint _memoryLength; // 8 bytes | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Given the layout and size concerns of this change, does the new type require an explicit There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This struct contains a managed field ( There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Does Mono behave the same? I think it might respect Sequential for managed types. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That doesn't really matter. RyuJIT doesn't and so code cannot depend on the managed side being sequential, that is part of why we must use This is likewise an internal struct, so it can't be used by developers in interop scenarios. Due to it being a struct containing a managed field, it likewise couldn't be used in interop without marshalling and so there is zero benefit to changing the layout from its default. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yeah but I've meant that Auto could maybe still be beneficial for Mono. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Mono should likely consider adjusting their layout algorithm to match the RyuJIT algorithm in that scenario. There are thousands of structs across the BCL and we are not going to go and annotate every single one of them to be explicitly |
||
internal readonly int _rank; // 4 bytes | ||
|
||
private readonly NintBuffer _lengths; | ||
private readonly NintBuffer _strides; | ||
|
||
internal TensorShape(nint memoryLength, ReadOnlySpan<nint> lengths, ReadOnlySpan<nint> strides) | ||
{ | ||
_memoryLength = memoryLength; | ||
_rank = lengths.Length; | ||
if (lengths.Length > MaxInlineArraySize) | ||
{ | ||
_metadata = new nint[lengths.Length + strides.Length]; | ||
lengths.CopyTo(MemoryMarshal.CreateSpan(ref _metadata[0], lengths.Length)); | ||
strides.CopyTo(MemoryMarshal.CreateSpan(ref _metadata[lengths.Length], strides.Length)); | ||
} | ||
else | ||
{ | ||
lengths.CopyTo(_lengths); | ||
strides.CopyTo(_strides); | ||
} | ||
} | ||
|
||
[InlineArray(MaxInlineArraySize)] // 5x8 bytes (40) | ||
private struct NintBuffer | ||
{ | ||
public nint e0; | ||
} | ||
|
||
[UnscopedRef] | ||
public ReadOnlySpan<nint> Lengths => (_metadata is null) | ||
? ((ReadOnlySpan<nint>)_lengths).Slice(0, _rank) | ||
: MemoryMarshal.CreateReadOnlySpan(ref MemoryMarshal.GetArrayDataReference(_metadata), _rank); | ||
|
||
[UnscopedRef] | ||
public ReadOnlySpan<nint> Strides => (_metadata is null) | ||
? ((ReadOnlySpan<nint>)_strides).Slice(0, _rank) | ||
: MemoryMarshal.CreateReadOnlySpan(ref MemoryMarshal.GetArrayDataReference(_metadata), _rank * 2).Slice(_rank); | ||
|
||
public nint FlattenedLength => TensorSpanHelpers.CalculateTotalLength(Lengths); | ||
|
||
public bool IsEmpty => FlattenedLength == 0; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this also be a ref struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to be able to use the same struct as part of
Tensor<T>
so we can avoid additional allocations for common cases there as well.