Description
PyTorch supports CUDA streams, which let independent GPU operations run concurrently.
However, it appears that TorchSharp does not support CUDA streams. I searched the codebase and can't find anything like PyTorch's torch.cuda.Stream class, or C# wrappers for APIs such as Stream.wait_stream(), torch.cuda.default_stream() and Tensor.record_stream().
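
For reference, here is a minimal sketch of the kind of PyTorch (Python) usage that currently seems to have no TorchSharp equivalent. The function name scaled_sum_on_side_stream is hypothetical and only for illustration; the stream calls themselves (torch.cuda.Stream, torch.cuda.default_stream, Stream.wait_stream, Tensor.record_stream) are the existing PyTorch APIs. It falls back to the CPU when no CUDA device is present.

```python
import torch


def scaled_sum_on_side_stream(x: torch.Tensor) -> torch.Tensor:
    """Compute (x * 2).sum() on a non-default CUDA stream (hypothetical helper)."""
    if not torch.cuda.is_available():
        # No GPU available: do the same computation on the CPU.
        return (x * 2).sum()

    main = torch.cuda.default_stream()
    side = torch.cuda.Stream()

    # The host-to-device copy is enqueued on the current (default) stream.
    x = x.cuda()

    # Make the side stream wait until the copy queued on the default stream is done.
    side.wait_stream(main)

    with torch.cuda.stream(side):
        y = (x * 2).sum()
        # x was allocated on the default stream but is used on the side stream;
        # tell the caching allocator so its memory is not reused too early.
        x.record_stream(side)

    # Make the default stream wait for the side stream before y is consumed.
    main.wait_stream(side)
    # y was allocated on the side stream and is now used on the default stream.
    y.record_stream(main)
    return y.cpu()
```

Work queued on separate streams like this may overlap with other GPU work, which is the capability being requested for TorchSharp.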