Tensor.Slice No Longer Copies #113166
Conversation
Tagging subscribers to this area: @dotnet/area-system-numerics-tensors
PR Overview
This PR updates Tensor slicing behavior so that slices no longer copy the underlying data but instead reference the same backing memory.
- Updated test assertions in TensorTests.cs and TensorSpanTests.cs to reflect that slicing now modifies the original tensor data for the sliced region.
- Modified the Tensor implementation in Tensor.cs to introduce a memoryOffset field and adjust the AsTensorSpan and Slice methods accordingly.
Reviewed Changes
| File | Description |
| --- | --- |
| src/libraries/System.Numerics.Tensors/tests/TensorTests.cs | Updated assertions and comments to match new non-copy slicing behavior |
| src/libraries/System.Numerics.Tensors/tests/TensorSpanTests.cs | Simplified test assertions by removing redundant span creation |
| src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/Tensor.cs | Introduced memoryOffset and updated AsTensorSpan/Slice to avoid data copying |
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
public TensorSpan<T> AsTensorSpan() => new TensorSpan<T>(ref Unsafe.Add(ref MemoryMarshal.GetArrayDataReference(_values), _memoryOffset), _lengths, _strides, _values.Length - _memoryOffset);
Using '_values.Length - _memoryOffset' to determine the flattened length of the slice may not correctly reflect the actual number of elements in the sliced tensor. Consider calculating this value from the sliced tensor's dimensions (e.g., using TensorSpanHelpers.CalculateTotalLength(_lengths)) to ensure it accurately represents the slice's data range.
Suggested change:
-public TensorSpan<T> AsTensorSpan() => new TensorSpan<T>(ref Unsafe.Add(ref MemoryMarshal.GetArrayDataReference(_values), _memoryOffset), _lengths, _strides, _values.Length - _memoryOffset);
+public TensorSpan<T> AsTensorSpan() => new TensorSpan<T>(ref Unsafe.Add(ref MemoryMarshal.GetArrayDataReference(_values), _memoryOffset), _lengths, _strides, TensorSpanHelpers.CalculateTotalLength(_lengths));
What do you think of this suggestion, @michaelgsharp?
This suggestion is incorrect, sadly.
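One possible reading of why (my inference, not stated in the thread): the final constructor argument appears to bound how much backing memory the strided span may address, and for a strided slice the product of the dimension lengths can be smaller than the memory region the slice actually reaches, whereas `_values.Length - _memoryOffset` always covers it. A small arithmetic sketch with hypothetical numbers:

```csharp
using System;

// A 2x2 slice taken out of a 4x4 row-major tensor.
nint[] lengths = [2, 2];   // slice dimensions
nint[] strides = [4, 1];   // strides inherited from the 4x4 parent

nint flattened = 1;        // what CalculateTotalLength(lengths) would return
nint lastLinearIndex = 0;  // linear offset of the slice's final element
for (int i = 0; i < lengths.Length; i++)
{
    flattened *= lengths[i];
    lastLinearIndex += (lengths[i] - 1) * strides[i];
}

Console.WriteLine(flattened);           // 4
Console.WriteLine(lastLinearIndex + 1); // 6 -> the slice reaches 6 elements into the buffer
```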
Is this fixing a reported GitHub issue? Just asking since I don't see one in the description.
@@ -454,26 +459,44 @@ public Tensor<T> this[Tensor<bool> filter]
 /// Forms a slice out of the given tensor
 /// </summary>
 /// <param name="start">The ranges for the slice</param>
-/// <returns><see cref="Tensor{T}"/> as a copy of the provided ranges.</returns>
-// REVIEW: CURRENTLY DOES A COPY.
+/// <returns><see cref="Tensor{T}"/> without copying the provided ranges.</returns>
Do we consider this a breaking change?
Also, does the Tensors library use triple-slash comments as the source of truth? Asking in case it doesn't, in which case you will need to update dotnet-api-docs manually.
I believe we are using triple-slash comments as the source of truth.
I'm not sure if it's a breaking change, but since the API here is still experimental we are good to make this change.
-s.CopyTo(outTensor);
-return outTensor;
+scoped Span<nint> lengths = new nint[Rank];
+scoped Span<nint> offsets = new nint[Rank];
You are allocating two arrays here which will be GC'ed right after the sliced tensor is created. I suggest doing one of two things:
- Create a new internal Tensor constructor that accepts the lengths as an array. In that constructor you would not need to call ToArray() again, which saves the allocation.
- Or use ArrayPool<nint>.Shared.Rent(): you can rent one array instead of allocating two and slice it for both lengths and offsets.

Either way, I think offsets can come from the ArrayPool, or the allocation can be avoided entirely by calculating each entry when needed.
We intentionally decided to always do a ToArray in the constructor. But I have fixed it now by either using stackalloc or renting the array.
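A minimal sketch of that stackalloc-or-rent pattern. The method name, threshold, and shape are assumptions for illustration, not the PR's actual code:

```csharp
using System;
using System.Buffers;

BuildSliceScratch(rank: 3);

static void BuildSliceScratch(int rank)
{
    const int StackAllocThreshold = 16; // total nints we are willing to keep on the stack

    nint[]? rented = null;
    Span<nint> scratch = rank * 2 <= StackAllocThreshold
        ? stackalloc nint[StackAllocThreshold]               // small ranks: no heap allocation
        : (rented = ArrayPool<nint>.Shared.Rent(rank * 2));  // large ranks: one pooled array

    try
    {
        // One buffer serves both purposes: slice it for lengths and offsets.
        Span<nint> lengths = scratch.Slice(0, rank);
        Span<nint> offsets = scratch.Slice(rank, rank);

        // ... fill lengths/offsets from the requested ranges and build the sliced view ...
        _ = lengths;
        _ = offsets;
    }
    finally
    {
        if (rented is not null)
            ArrayPool<nint>.Shared.Return(rented);
    }
}
```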
Previously our Tensor.Slice would copy the underlying data. This was undesirable for many reasons. With these changes we no longer perform a copy of the underlying data for a slice.
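A minimal sketch of the observable difference. The Tensor<T> surface is experimental, so the exact Create/Slice overloads, the Range-to-NRange conversion, and the indexer shape used here are assumptions on my part:

```csharp
using System;
using System.Numerics.Tensors;

// The experimental-API diagnostic may need to be suppressed to compile this.
int[] backing = [1, 2, 3, 4];
Tensor<int> tensor = Tensor.Create(backing, [4]);

// With this PR, the slice is a view over the same backing memory rather than a copy.
Tensor<int> slice = tensor.Slice([1..3]);

slice[0] = 42;

// Before this change the original tensor would still hold 2 at index 1;
// now the write is visible through both views.
Console.WriteLine(tensor[1]); // 42
```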