-
Notifications
You must be signed in to change notification settings - Fork 212
concat dataset #1354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
concat dataset #1354
Conversation
| /// </summary> | ||
| public abstract class Dataset : Dataset<Dictionary<string, torch.Tensor>> | ||
| public abstract class Dataset : Dataset<Dictionary<string, Tensor>>, | ||
| IDataset<IReadOnlyDictionary<string, Tensor>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As commented in the codes below, due to covariation, Dataset should naturally be IDataset<IReadOnlyDictionary<string, Tensor>> (because it is IDataset<Dictionary<string, Tensor>>). However FSharp.Examples cannot be complied without this. I don't know why this happens.
I would suggest to remove this line, because it influences the detection of torch.utils.data.ConcatDataset (IReadOnlyDictionary or Dictionary are both ok, so it can't automatically choose one), but I failed to make the fsharp program work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A more radical idea is to remove Dataset and IterableDataset at all. Actually we don't need them. DataLoaders could work on any IDataset<IReadOnlyDictionary<string, torch.Tensor>> and IDataset<IEnumerable<Tensor>>.
By the way, our IterableDataset is occupying the position of PyTorch IterableDataset. #1353 That's also part of the reason why I suggest to remove them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F# samples have to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... I suppose so... However it doesn't https://dev.azure.com/dotnet/TorchSharp/_build/results?buildId=107828&view=results
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So. just to check -- with the current commits, the F# examples build, correct? I don't see any errors in the latest build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I could be built by adding : IDataset<IReadOnlyDictionary<string, Tensor>>.
| /// <param name="index">Index for tensor</param> | ||
| /// <returns>Tensors of index. DataLoader will catenate these tensors into batches.</returns> | ||
| [IndexerName("DatasetItems")] | ||
| T this[long index] { get; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's using indexers instead of GetTensor in the new interface. I'm not really sure about its IndexerName.
And... shall we remove GetTensor at the same time? Currently Dataset<T> is keeping an abstract GetTensor to ensure compatibility. (And its indexer is calling GetTensor.)
|
Ready to merge? |
|
yes |
|
@LittleLittleCloud, @yueyinqiu -- After merge, builds are failing: https://dev.azure.com/dotnet/TorchSharp/_build/results?buildId=107925&view=results |
|
This was reverted due to breaking changes preventing a build. We're eager to make a new release with support for Ubuntu 20.04. |
#1348 (comment)
torch.utils.data.ConcatDatasetDataLoaders has been relaxed toIDataset<out T>(a new covariant interface), thus accepting the concated datasetDataset<T>implementsIDataset<T>Datasetimplements bothIDataset<Dictionary<string, Tensor>>andIDataset<IReadOnlyDictionary<string, Tensor>>IterableDatasetimplementsIDataset<IList<string, Tensor>>andIDataset<IEnumerable<string, Tensor>>IReadOnlyList