Tensor.Slice No Longer Copies #113166

Merged
merged 2 commits into from
Mar 14, 2025

Conversation

michaelgsharp
Member

Previously our Tensor.Slice would copy the underlying data. This was undesirable: every slice incurred an extra allocation and copy, and writes through a slice were not reflected in the original tensor. With these changes we no longer copy the underlying data for a slice; the slice is a view over the same backing memory.
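For illustration (not code from this PR): the new behavior matches the long-standing `Span<T>.Slice` semantics, where a slice is a view over the same memory rather than a copy.

```csharp
// Not the Tensor API itself: plain Span<T> shows the aliasing behavior
// Tensor.Slice now shares, since Span<T>.Slice has always been a view.
int[] data = { 0, 1, 2, 3, 4, 5 };
Span<int> slice = data.AsSpan().Slice(1, 3); // view over data[1..4], no copy
slice[0] = 99;
// data[1] is now 99: writes through the slice land in the original storage,
// which is the behavior Tensor.Slice adopts with this change.
```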

@michaelgsharp michaelgsharp self-assigned this Mar 5, 2025
@Copilot Copilot AI review requested due to automatic review settings March 5, 2025 17:34
Contributor

Tagging subscribers to this area: @dotnet/area-system-numerics-tensors
See info in area-owners.md if you want to be subscribed.

Contributor

@Copilot Copilot AI left a comment


PR Overview

This PR updates Tensor slicing behavior so that slices no longer copy the underlying data but instead reference the same backing memory.

  • Updated test assertions in TensorTests.cs and TensorSpanTests.cs to reflect that slicing now modifies the original tensor data for the sliced region.
  • Modified the Tensor implementation in Tensor.cs to introduce a memoryOffset field and adjust the AsTensorSpan and Slice methods accordingly.

Reviewed Changes

  • src/libraries/System.Numerics.Tensors/tests/TensorTests.cs: Updated assertions and comments to match the new non-copy slicing behavior
  • src/libraries/System.Numerics.Tensors/tests/TensorSpanTests.cs: Simplified test assertions by removing redundant span creation
  • src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/Tensor.cs: Introduced memoryOffset and updated AsTensorSpan/Slice to avoid data copying

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment on lines +394 to 395
public TensorSpan<T> AsTensorSpan() => new TensorSpan<T>(ref Unsafe.Add(ref MemoryMarshal.GetArrayDataReference(_values), _memoryOffset), _lengths, _strides, _values.Length - _memoryOffset);


Copilot AI Mar 5, 2025


Using '_values.Length - _memoryOffset' to determine the flattened length of the slice may not correctly reflect the actual number of elements in the sliced tensor. Consider calculating this value from the sliced tensor's dimensions (e.g., using TensorSpanHelpers.CalculateTotalLength(_lengths)) to ensure it accurately represents the slice's data range.

Suggested change
public TensorSpan<T> AsTensorSpan() => new TensorSpan<T>(ref Unsafe.Add(ref MemoryMarshal.GetArrayDataReference(_values), _memoryOffset), _lengths, _strides, _values.Length - _memoryOffset);
public TensorSpan<T> AsTensorSpan() => new TensorSpan<T>(ref Unsafe.Add(ref MemoryMarshal.GetArrayDataReference(_values), _memoryOffset), _lengths, _strides, TensorSpanHelpers.CalculateTotalLength(_lengths));


Member


What do you think of this suggestion, @michaelgsharp?

Member Author


This suggestion is incorrect, sadly.
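The thread doesn't spell out why, but a plausible issue (my own sketch, with hypothetical numbers, not from the PR): for a non-contiguous slice, the span must extend to the last element reachable through the strides, and the product of the lengths underestimates that extent.

```csharp
// Hypothetical numbers: a 2x2 slice taken out of a 4x4 tensor
// (16 backing elements), keeping the parent's strides {4, 1}.
nint[] lengths = { 2, 2 };
nint[] strides = { 4, 1 };

long product = 1;
foreach (nint l in lengths)
    product *= l;                                   // 4: the slice's element count

long lastIndex = 0;
for (int i = 0; i < lengths.Length; i++)
    lastIndex += (long)(lengths[i] - 1) * strides[i];
long requiredExtent = lastIndex + 1;                // 6: backing memory the span must cover

// product (4) < requiredExtent (6): sizing the span by element count alone
// would make part of the strided slice unreachable, while
// _values.Length - _memoryOffset is always a safe upper bound.
```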

Member

@carlossanlop carlossanlop left a comment


Is this fixing a reported GitHub issue? Just asking since I don't see one in the description.


@@ -454,26 +459,44 @@ public Tensor<T> this[Tensor<bool> filter]
/// Forms a slice out of the given tensor
/// </summary>
/// <param name="start">The ranges for the slice</param>
/// <returns><see cref="Tensor{T}"/> as a copy of the provided ranges.</returns>
// REVIEW: CURRENTLY DOES A COPY.
/// <returns><see cref="Tensor{T}"/> without copying the provided ranges.</returns>
Member


Do we consider this a breaking change?

Also, does Tensors use triple-slash comments as the source of truth? Asking in case it doesn't, in which case you will need to update dotnet-api-docs manually.

Member Author


I believe we are using triple-slash comments as the source of truth.

I'm not sure if it's a breaking change, but since the API here is still experimental we are good to make this change.

s.CopyTo(outTensor);
return outTensor;
scoped Span<nint> lengths = new nint[Rank];
scoped Span<nint> offsets = new nint[Rank];
Member

@tarekgh tarekgh Mar 12, 2025


You are allocating two arrays here which will be GC'ed after creating the sliced tensor. I suggest doing one of two things:

  • Create a new internal Tensor constructor that accepts lengths as an array. In that constructor you will not need to call ToArray() again, which saves the allocation.

  • Or use ArrayPool<nint>.Shared.Rent(): you can rent one array instead of allocating two, and slice it into lengths and offsets.

Either way, I think offsets can come from the ArrayPool, or the allocation can be avoided entirely by calculating each entry when needed.
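A minimal sketch of the single-rent variant suggested above, assuming a small rank (`rank` and the fill step are placeholders, not code from this PR):

```csharp
using System.Buffers;

// Sketch only: rent one pooled nint[] and slice it into the two
// scratch spans, instead of allocating two short-lived arrays.
int rank = 3; // placeholder rank
nint[] rented = ArrayPool<nint>.Shared.Rent(rank * 2);
try
{
    Span<nint> lengths = rented.AsSpan(0, rank);
    Span<nint> offsets = rented.AsSpan(rank, rank);
    // ... fill lengths/offsets and construct the sliced tensor here ...
    lengths[0] = 1; offsets[0] = 0; // placeholder writes
}
finally
{
    ArrayPool<nint>.Shared.Return(rented); // hand the buffer back when done
}
```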

Member Author


We intentionally decided to always do a ToArray in the constructor. But I have fixed it now by using either stackalloc or a rented array.
