Description
I discovered this performance issue while using the MNIST dataset.
It only happens with the combination of .NET Framework and the CPU device. It does not happen when targeting .NET, or when using CUDA.
The following is a minimal reproduction.
using System;
using System.Collections.Generic;
using TorchSharp;

// Match the MNIST dataset size
var size = 70000;
var tensors = new List<torch.Tensor>(size);
var dev = new torch.Device("cpu");

// Create one small CPU tensor per sample
for (int i = 0; i < size; ++i)
{
    tensors.Add(torch.tensor(new[] { 1.0f }, device: dev));
}
Console.WriteLine(tensors.Count);

// Disposing the tensors is where the slowdown shows up
foreach (var tensor in tensors)
{
    tensor.Dispose();
}
tensors.Clear();
Console.WriteLine(tensors.Count);
The profiler attributes the time to ConcurrentDictionary.TryAdd() and ConcurrentDictionary.TryRemove(), but the underlying cost looks like it comes from MulticastDelegate.Equals().
With the .NET Framework and CUDA combination, the tensor seems to be removed directly in _tensor_generic.
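To check whether the dictionary pattern alone can reproduce the slowdown, independent of the tensor library, a standalone sketch along these lines might help. It assumes (this is a guess, not something confirmed above) that the dictionary involved is keyed by delegate instances, which would explain why MulticastDelegate.Equals() shows up under TryAdd() and TryRemove(). The names DelegateKeyDemo, Deleter, and AddAndRemove are hypothetical and are not taken from the library.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;

static class DelegateKeyDemo
{
    // Hypothetical callback type standing in for whatever the library stores.
    delegate void Deleter(IntPtr ptr);

    // Adds `size` distinct delegate instances as dictionary keys, removes them
    // again, prints the elapsed time of each phase, and returns how many
    // entries remain afterwards.
    public static int AddAndRemove(int size)
    {
        var map = new ConcurrentDictionary<Deleter, Deleter>();
        var keys = new Deleter[size];

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < size; ++i)
        {
            // Declared inside the loop so each lambda gets its own closure
            // object, making every delegate a distinct key.
            int captured = i;
            Deleter d = ptr => GC.KeepAlive(captured);
            keys[i] = d;
            map.TryAdd(d, d);
        }
        Console.WriteLine($"TryAdd x{size}: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        foreach (var key in keys)
        {
            map.TryRemove(key, out _);
        }
        Console.WriteLine($"TryRemove x{size}: {sw.ElapsedMilliseconds} ms");

        return map.Count;
    }

    static void Main()
    {
        // Same element count as the reproduction above.
        Console.WriteLine(AddAndRemove(70000));
    }
}
```

Building this once for .NET Framework and once for .NET, then comparing the two timings, should show whether delegate keys by themselves account for the difference. If both builds are fast, the assumption about delegate keys is probably wrong and the cause lies elsewhere.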