load_mnist_dataset must include train_test_split #101

Open
SaashaJoshi opened this issue May 10, 2024 · 7 comments
Labels
enhancement New feature or request hacktoberfest Included in hacktoberfest 2024 help wanted Extra attention is needed research This requires additional research

Comments

@SaashaJoshi
Owner

The train_test_split feature is not currently supported within load_mnist_dataset. The user has to perform the split separately, before calling the wrapper function.

It would be a better design to include the split within the function so that both the train and test datasets or Dataloaders are retrieved, split, and normalized with a single function call. Another option is to let load_mnist_dataset take an argument that specifies whether the user is loading the train or the test dataloader. To achieve this, the logic inside the wrapper function must be split into two sub-functions or conditionals.
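To make the first option concrete, here is a minimal sketch of a wrapper that splits, normalizes, and batches in one call. It is not the project's code: `load_dataset_with_split`, `normalize`, `batch`, the `test_size` and `seed` parameters, and the toy list-based dataset are all hypothetical stand-ins, chosen so the example runs without any ML library.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (flattened pixel values, label)


def normalize(image: Sequence[float], max_value: float = 255.0) -> List[float]:
    """Scale raw pixel intensities into the range [0, 1]."""
    return [pixel / max_value for pixel in image]


def batch(samples: Sequence[Sample], batch_size: int) -> List[List[Sample]]:
    """Group samples into consecutive batches; the last batch may be smaller."""
    return [list(samples[i:i + batch_size]) for i in range(0, len(samples), batch_size)]


def load_dataset_with_split(
    samples: Sequence[Sample],
    test_size: float = 0.2,
    batch_size: int = 4,
    seed: int = 0,
) -> Tuple[List[List[Sample]], List[List[Sample]]]:
    """Hypothetical wrapper: shuffle, split, normalize, and batch in one call."""
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be strictly between 0 and 1")

    # Shuffle a copy with a fixed seed so the split is reproducible.
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)

    # Reserve the first n_test shuffled samples for testing.
    n_test = max(1, round(len(shuffled) * test_size))
    test, train = shuffled[:n_test], shuffled[n_test:]

    def prepare(part: Sequence[Sample]) -> List[List[Sample]]:
        # Apply the same normalization and batching to both partitions.
        return batch([(normalize(x), y) for x, y in part], batch_size)

    return prepare(train), prepare(test)


# Toy stand-in for MNIST: 10 "images" of 4 pixels each.
toy = [([float(i)] * 4, i % 2) for i in range(10)]
train_batches, test_batches = load_dataset_with_split(toy, test_size=0.2, batch_size=4)
```

With this shape, the caller receives both partitions already prepared, so no separate split step is needed before calling the wrapper.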

Ideas for better design are welcome.

@SaashaJoshi SaashaJoshi added enhancement New feature or request help wanted Extra attention is needed research This requires additional research labels May 10, 2024
@Rajesh-1234567

Hey @SaashaJoshi!

You can solve the issue by either:

1. Integrating train_test_split within the load_mnist_dataset function, returning both train and test datasets after splitting and normalizing.
2. Adding an argument to specify whether to load the train or test dataset, with the split and normalization handled accordingly.
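The second option could look roughly like the sketch below, where one argument selects which partition to load. The function name `load_partition`, the `split` argument, and the in-memory `_PARTITIONS` table are assumptions made for illustration only; a real implementation would read the actual MNIST partitions instead.

```python
from typing import Dict, List, Tuple

Sample = Tuple[List[float], int]  # (pixel values, label)

# Hypothetical in-memory stand-in for the two dataset partitions.
_PARTITIONS: Dict[str, List[Sample]] = {
    "train": [([0.0, 255.0], 0), ([128.0, 64.0], 1), ([255.0, 255.0], 1)],
    "test": [([51.0, 0.0], 0)],
}


def load_partition(split: str = "train", batch_size: int = 2) -> List[List[Sample]]:
    """Return normalized, batched samples for the requested partition."""
    if split not in _PARTITIONS:
        raise ValueError(f"split must be one of {sorted(_PARTITIONS)}, got {split!r}")

    # Normalization is identical for both partitions; only the source differs.
    normalized = [
        ([pixel / 255.0 for pixel in image], label)
        for image, label in _PARTITIONS[split]
    ]
    return [normalized[i:i + batch_size] for i in range(0, len(normalized), batch_size)]


train_batches = load_partition("train")
test_batches = load_partition("test")
```

The trade-off against the first option is that the caller makes two calls, but each call has a single, predictable return type.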

@SaashaJoshi
Owner Author

Hi @Rajesh-1234567,
This is amazing! Feel free to open a PR whenever you are ready or ask any questions by tagging me.

@SaashaJoshi SaashaJoshi added the hacktoberfest Included in hacktoberfest 2024 label Oct 1, 2024
@SaashaJoshi
Owner Author

Hi @Rajesh-1234567, any questions or updates?

@sohamyedgaonkar

Hey @SaashaJoshi, I just want to know whether this issue is still open and which files are involved (how can I recreate the bug?).

@SaashaJoshi
Owner Author

Hi @sohamyedgaonkar, sure, go ahead. You can work on this since I haven't received any update on it yet.
This is not a bug per se. The wrapper load_mnist_dataset is a function that handles most of the data cleaning, including normalization and batching. I was hoping we could also have a train_test_split in the wrapper so that a user doesn't have to split the dataset separately. Let me know if you have any questions.

@sohamyedgaonkar

Hey @SaashaJoshi, which files do you want to integrate it in? (So that I can directly change some lines and open a pull request.)

@SaashaJoshi
Owner Author

Hi @sohamyedgaonkar, sorry for not seeing this message earlier. I would like you to integrate test_train_split as an argument in the load_mnist_dataset function. I also mentioned this in the PR. Let me know if you need any more help!
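One way to read "as an argument" is an optional parameter that leaves existing callers unaffected. The sketch below assumes the argument is a test fraction that defaults to `None`; the thread does not say whether it should be a flag or a fraction, and `load_toy_dataset`, `seed`, and the list-based data are hypothetical.

```python
import random
from typing import List, Optional, Tuple, Union

Sample = Tuple[List[float], int]  # (pixel values, label)
Split = Tuple[List[Sample], List[Sample]]  # (train, test)


def load_toy_dataset(
    samples: List[Sample],
    test_train_split: Optional[float] = None,
    seed: int = 0,
) -> Union[List[Sample], Split]:
    """Normalize samples; split into (train, test) only when a fraction is given."""
    normalized = [([pixel / 255.0 for pixel in image], label) for image, label in samples]

    # Default behaviour is unchanged: no argument, no split.
    if test_train_split is None:
        return normalized

    if not 0.0 < test_train_split < 1.0:
        raise ValueError("test_train_split must be strictly between 0 and 1")

    # Seeded shuffle keeps the split reproducible across calls.
    random.Random(seed).shuffle(normalized)
    n_test = max(1, round(len(normalized) * test_train_split))
    return normalized[n_test:], normalized[:n_test]


toy = [([float(i), 255.0], i % 2) for i in range(8)]
whole = load_toy_dataset(toy)
train, test = load_toy_dataset(toy, test_train_split=0.25)
```

Note that the return type changes with the argument, which keeps old call sites working but makes the function harder to type-check; always returning a tuple is the stricter alternative.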

3 participants