Calling out for Data science people!💻📈 #162

Open
pratik-choudhari opened this issue Oct 5, 2020 · 21 comments
Labels
data-science enhancement New feature or request hacktoberfest Contribution for hacktoberfest

Comments

@pratik-choudhari
Owner

Current situation

So far, this repository has been accepting contributions that solve coding problems, and that is going pretty well.

So now what?

Now we are also accepting contributions of beautiful and insightful EDAs (exploratory data analyses) on a dataset of your choice. Yes, a dataset of your choice!
The rules for EDA contribution:

  • Jupyter notebook must go inside EDA/<your-dataset-name>
  • Two people may open PRs for the same dataset at the same time. To avoid that clutter, first check whether the dataset directory exists; if it does not, post a comment with the dataset name on this issue.
  • A README is mandatory: add a new one or append to the existing README.
  • A PR that does not follow these rules will be marked as spam.
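
The directory check in the rules above can be sketched in a few lines of Python. This is only an illustration, assuming a local checkout of the repository; the helper names `dataset_dir` and `is_claimed` are hypothetical and not part of the repository.

```python
from pathlib import Path


def dataset_dir(repo_root: str, dataset_name: str) -> Path:
    """Return the directory where a dataset's notebook should live (EDA/<dataset_name>)."""
    return Path(repo_root) / "EDA" / dataset_name


def is_claimed(repo_root: str, dataset_name: str) -> bool:
    """Return True if EDA/<dataset_name> already exists in the local checkout."""
    return dataset_dir(repo_root, dataset_name).is_dir()
```

If `is_claimed(".", "my-dataset")` returns False, the next step under the rules is to comment on this issue with the dataset name before opening a PR. A local check cannot see PRs that are still open, which is why the comment is still needed.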

You may leave any queries below.

pratik-choudhari added the enhancement, hacktoberfest, and data-science labels on Oct 5, 2020
@Mrsterius
Contributor

Hi Pratik,
I would like to add an EDA for the Boston Housing dataset. Can I work on this?

@pratik-choudhari
Owner Author

@Mrsterius Yes

@Mrsterius
Contributor

Mrsterius commented Oct 7, 2020

Hey Pratik,
I have created an EDA for the AMES Advanced house prices (regression) dataset, since column names were not provided for the Boston one. What would you like me to add to the README? Do I need to upload the dataset as well, or can I just put its link in the README?

@Mrsterius
Contributor

Submitted. Please let me know if there are any changes to be made.

@PrattJena
Contributor

Hey Pratik,
I would like to add an EDA of dogs vs. cats. Can I work on it?

@pratik-choudhari
Owner Author

@PrattJena Which cats and dogs dataset? The image dataset, right?

@PrattJena
Contributor

@pratik-choudhari Yes, the image classification dataset.

@pratik-choudhari
Owner Author

pratik-choudhari commented Oct 10, 2020

@PrattJena Could you elaborate on what will be in the EDA?

@PrattJena
Contributor

Actually, now that I think about it, it's much better suited to the projects folder, since the neural network determines whether the image is of a cat or a dog.

@pratik-choudhari
Owner Author

Yep, so that doesn't count as EDA.

@rajpratyush
Contributor

EDA for the Fashion MNIST and CIFAR-10 datasets.

@pratik-choudhari
Owner Author

@rajpratyush What will your EDA contain?

@rajpratyush
Contributor

I have done several assignments on these two datasets while learning through a MOOC on Coursera. I would like to share those.

@pratik-choudhari
Owner Author

@rajpratyush Won't an EDA on an image dataset be vague? On a dataset like CIFAR-10, the most one could get from an EDA is the number of images in each category. Correct me if there is anything to add to this.
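
For reference, the per-category count mentioned above is a one-liner in Python. This is a minimal sketch: the function name `images_per_category` is hypothetical and the label list is made up for illustration, not taken from CIFAR-10.

```python
from collections import Counter


def images_per_category(labels):
    """Count how many images carry each class label."""
    return Counter(labels)


# Made-up labels for illustration only; a real dataset would supply its own label list.
sample_labels = ["cat", "dog", "cat", "ship", "dog", "cat"]
counts = images_per_category(sample_labels)

# Print categories from most to least frequent.
for category, n in counts.most_common():
    print(category, n)
```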

@rajpratyush
Contributor

I agree. Then how about an EDA on a dataset of toxic words? I remember there was a Kaggle competition on this, and I had a successful contribution in it.

@pratik-choudhari
Owner Author

@rajpratyush Yes, that will suffice.

@rajpratyush
Contributor

@pratik-choudhari btw you have assigned me this issue yet #94 (comment)

@pratik-choudhari
Owner Author

@drashtipatel2503 Sure! Put the dataset's name and link on issue #161.

@sanskritip
Contributor

Hey Pratik! Can I add an EDA for the StackOverflow Developer 2019 dataset? I will add the CSV files, a data cleaning Jupyter notebook, and a visualisation notebook.

@pratik-choudhari
Owner Author

@sanskritip Sure, but don't include the CSV files; they will increase the repo size. Put a link to the dataset instead.

@sanskritip
Contributor

> @sanskritip Sure, but don't include the CSV files; they will increase the repo size. Put a link to the dataset instead.

Awesome, sounds great!


5 participants