New datasets #96
Thanks for submitting this PR! It looks great. I think having all these datasets as part of the library is a great addition, and from here it should not be too hard to add more of them. Great work!
…torchhd into New_datasets
This is looking really good, exactly what I had in mind. I just added some minor code organization and refactoring comments, mostly about isolating the common behavior.
Also, can you remove the .DS_Store file from the PR? And make sure to add Adult to the documentation as well.
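One way to picture the "isolate the common behavior" suggestion is a shared base class that owns the loading logic, while each dataset subclass only declares what differs. The sketch below is a hypothetical illustration in plain Python, not the code from this PR: `CollectionDataset`, `name`, `classes`, and the in-memory `rows` argument are assumed names, and the "last column is the label" layout is an assumption as well. Only `_load_data` and `Adult` are names that appear in this thread.

```python
from typing import List, Sequence, Tuple


class CollectionDataset:
    """Hypothetical base class holding the behavior shared by every dataset."""

    name: str = ""           # overridden by each subclass
    classes: List[str] = []  # overridden by each subclass

    def __init__(self, rows: Sequence[Sequence[float]]):
        # Assumption: rows are passed in directly; real code would read files.
        self.data, self.targets = self._load_data(rows)

    def _load_data(
        self, rows: Sequence[Sequence[float]]
    ) -> Tuple[List[List[float]], List[int]]:
        # Shared parsing (assumed layout): last column is the integer label,
        # everything before it is a feature.
        data = [list(row[:-1]) for row in rows]
        targets = [int(row[-1]) for row in rows]
        return data, targets

    def __len__(self) -> int:
        return len(self.targets)

    def __getitem__(self, index: int) -> Tuple[List[float], int]:
        return self.data[index], self.targets[index]


class Adult(CollectionDataset):
    # A subclass only declares what differs from the common behavior.
    name = "adult"
    classes = ["<=50K", ">50K"]
```

With this layout, adding another dataset from the collection would mostly mean adding another small subclass rather than repeating the loading code.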
This file should not have been deleted - I did that accidentally while trying to revert its inclusion into the PR.
Not sure at the moment how to revert this deletion.
I am resolving some minor outstanding issues and will push my changes soon. Small question: is the number of folds always 4, or is it dataset dependent?
Yes, for datasets in the collection the number of folds is always 4.
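Since the fold count is fixed, it could live in a single shared constant instead of being repeated per dataset. A minimal sketch, assuming a zero-based `fold` index and a hypothetical helper named `validate_fold` (neither is taken from the PR):

```python
NUM_FOLDS = 4  # from this thread: always 4 for datasets in the collection


def validate_fold(fold: int) -> int:
    """Return `fold` unchanged if it is a valid fold index.

    Assumption: folds are indexed from 0, so valid values are 0..3.
    """
    if not 0 <= fold < NUM_FOLDS:
        raise ValueError(
            f"fold must be between 0 and {NUM_FOLDS - 1}, got {fold}"
        )
    return fold
```

A shared dataset constructor could call this once, so every dataset rejects an out-of-range fold in the same way.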
@denkle could you review my refactoring of the |
@mikeheddes, great revision of the code! The logic is more streamlined in multiple places. I do not see any problems with the _load_data methods, so I assume it is good to go.
Good to go, I believe
This is the first attempt at adding datasets from the collection used in “Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?”
The file for the first dataset is one of the most important ones, because the other files from the collection will largely follow what is specified in it.