
Add support for >2GB tensors via byte stream #1234


Merged · 5 commits into dotnet:main · Feb 14, 2024

Conversation

shaltielshmid
Contributor

@shaltielshmid shaltielshmid commented Feb 11, 2024

No description provided.


if (!is_contiguous()) throw new InvalidOperationException("SetBytes() called on non-contiguous tensor.");

unsafe {
Contributor Author

Should we check for CPU here?
We check in the bytes getter, but not the setter.

Contributor

If it's going to blow up if it's not CPU, then it's better to check and throw an exception with good, specific information about the problem rather than some generic I/O exception later.

Contributor Author

Sure, will add
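A device guard along the lines below would mirror the getter's check. This is only a sketch; the member names (`device_type`, `DeviceType.CPU`) are assumed from the surrounding TorchSharp code, not verified against it:

```csharp
// Sketch: device guard next to the contiguity check in the bytes setter.
// device_type / DeviceType.CPU are assumed names, mirroring the getter's check.
if (device_type != DeviceType.CPU)
    throw new InvalidOperationException("SetBytes() is only supported on CPU tensors.");
if (!is_contiguous())
    throw new InvalidOperationException("SetBytes() called on non-contiguous tensor.");
```

Failing fast here gives a specific error instead of a generic I/O failure later, which is the point of the review comment above.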

@@ -58,6 +59,7 @@ public static void write_file(string filename, Tensor data)
/// <param name="data">One dimensional <c>uint8</c> <cref>Tensor</cref>.</param>
public static async void write_file_async(string filename, Tensor data)
{
// Currently limited to 2GB - should we duplicate the code for WriteBytesToStream using async ops?
Contributor Author

Comment should be deleted before merge - wrote it for @NiklasGustafsson

Contributor

OK

Contributor Author

Should we write an async function?

@@ -8269,5 +8269,111 @@ Tensor nbins_ratio(int seed, int size)
}
}
}


[Fact(Skip = "Very heavy on the compute")]
Contributor Author

The tests are very heavy on compute, so I marked them with `Skip` (but I tested them myself).

/// </summary>
/// <param name="stream">Stream to write the bytes to</param>
/// <param name="bufferSize">The buffer size to use when writing to the stream</param>
public void WriteBytesToStream(Stream stream, int bufferSize = 65_536)
Contributor Author

Unfortunately, the `Stream.Write` overload that accepts a `Span` was added in a later version of .NET, so we can't use it here and instead fall back to a `byte[]` buffer.
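The buffered approach can be sketched as follows. This is a simplified illustration, not the exact implementation; `NumberOfBytes` and `DataPointer` are stand-ins for however the tensor exposes its payload size and native data pointer:

```csharp
// Sketch: write tensor bytes in fixed-size chunks so no single byte[]
// (and no single Write call) ever approaches the 2GB limit.
public void WriteBytesToStream(Stream stream, int bufferSize = 65_536)
{
    long totalBytes = NumberOfBytes;        // illustrative: total payload size
    byte[] buffer = new byte[bufferSize];
    long written = 0;
    while (written < totalBytes)
    {
        int chunk = (int)Math.Min(bufferSize, totalBytes - written);
        // Copy the next chunk out of native memory into the managed buffer.
        Marshal.Copy(new IntPtr(DataPointer.ToInt64() + written), buffer, 0, chunk);
        stream.Write(buffer, 0, chunk);     // byte[] overload exists on all TFMs
        written += chunk;
    }
}
```

The `byte[]` overload of `Stream.Write` is available on every targeted framework, which is the point of the buffer-based fallback.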

@shaltielshmid
Contributor Author

Tagging the discussion #1219

@shaltielshmid
Contributor Author

shaltielshmid commented Feb 11, 2024

@NiklasGustafsson Please don't merge this in just yet - I want to confirm that it works well with PyBridge, and to benchmark how long the load and save takes before and after.

@shaltielshmid
Contributor Author

Benchmarking:

Comparing loading from and saving to pre-allocated memory (to mitigate the disk speed factor). The model is around 11MB when saved to disk. The comparison is between the old (current) method and the new proposed method with various buffer sizes (4096, 2048, and 1024 bytes):

| Method  | Mean     | Error     | StdDev    | Allocated   |
|---------|----------|-----------|-----------|-------------|
| SaveOld | 7.760 ms | 0.0828 ms | 0.0734 ms | 22.04 KB    |
| LoadOld | 9.635 ms | 0.1364 ms | 0.1276 ms | 10929.37 KB |
| Save4K  | 8.016 ms | 0.0770 ms | 0.0720 ms | 62.27 KB    |
| Load4K  | 8.099 ms | 0.0855 ms | 0.0758 ms | 62.75 KB    |
| Save2K  | 7.882 ms | 0.1496 ms | 0.1470 ms | 42.26 KB    |
| Load2K  | 7.935 ms | 0.1165 ms | 0.1090 ms | 42.75 KB    |
| Save1K  | 7.867 ms | 0.1156 ms | 0.1081 ms | 32.27 KB    |
| Load1K  | 7.934 ms | 0.1237 ms | 0.1157 ms | 32.75 KB    |

@NiklasGustafsson
Contributor

> Benchmarking: Comparing loading from and saving to pre-allocated memory (to mitigate the disk speed factor). […]

So, noticeably better load speed, roughly the same save speed, and far less allocation on load.

@NiklasGustafsson
Contributor

@shaltielshmid -- you asked me to hold off on merging, so I haven't...

@shaltielshmid
Contributor Author

shaltielshmid commented Feb 13, 2024

> @shaltielshmid -- you asked me to hold off on merging, so I haven't...

Update: good to go on the merge as soon as we confirm the question below and I update the comment.

Regarding the write_file_async method in TorchVision. Do we want to duplicate the WriteBytesToStream function with async operations?

@shaltielshmid
Contributor Author

@NiklasGustafsson Just confirming you saw the previous message. I'm good to go with the merge as soon as we decide whether to add an async version of the function.

@NiklasGustafsson
Contributor

I saw it, but failed to recognize that it was holding up the PR. Yes, async would be good for reading and writing.

@shaltielshmid
Contributor Author

Okay, so it's not super trivial: Spans aren't allowed in async methods, so this would require some reworking to get right. I think we should add it to the backlog, unless you think it's important, in which case I'll work on it over the next few days.
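For the backlog item, one way to sidestep the Span restriction is to keep the same chunking but stay on `byte[]` buffers, which async methods do allow. A sketch only, with `NumberOfBytes` and `DataPointer` as assumed stand-ins for the tensor's payload size and native data pointer:

```csharp
// Sketch: async chunked write. No Span appears in the method body,
// so the "Spans aren't allowed in async methods" restriction doesn't bite.
public async Task WriteBytesToStreamAsync(Stream stream, int bufferSize = 65_536,
                                          CancellationToken ct = default)
{
    long totalBytes = NumberOfBytes;        // illustrative
    byte[] buffer = new byte[bufferSize];
    long written = 0;
    while (written < totalBytes)
    {
        int chunk = (int)Math.Min(bufferSize, totalBytes - written);
        // Marshal.Copy keeps the native-memory access out of the async body.
        Marshal.Copy(new IntPtr(DataPointer.ToInt64() + written), buffer, 0, chunk);
        await stream.WriteAsync(buffer, 0, chunk, ct);
        written += chunk;
    }
}
```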

@NiklasGustafsson
Contributor

Add it to the backlog.

@shaltielshmid
Contributor Author

> Add it to the backlog.

Sounds good! The PR is then ready to merge from my perspective. I removed the comment already.

@NiklasGustafsson NiklasGustafsson merged commit 53d74a6 into dotnet:main Feb 14, 2024