A fact is a simple statement that everyone believes. It’s innocent, unless found guilty. A Hypothesis is a novel suggestion that no one wants to believe. It’s guilty, until found effective. – Edward Teller
We are grappling with a pandemic that’s operating at a never-before-seen scale. Researchers all over the globe are frantically trying to develop a vaccine or a cure for COVID-19 while doctors are just about keeping the pandemic from overwhelming the entire world.
I recently had an idea to apply my statistical knowledge to the trove of COVID-19 data from the Ministry of Home Affairs, India.
In this project, I’ll introduce you to the ANOVA test and its different types that are being used to make better decisions. The icing on the cake? I’ll demonstrate each type of ANOVA test in Python to visualize how they work on COVID-19 Data. So let’s get going!
- Data: This directory holds the input files you'll need to import the original data. You can also get this data from the Kaggle contest. However, I'd encourage you to explore the Ministry of Home Affairs, India website, since the data originates there.
- Processed Data: This directory holds the data produced during the experiment. Run the attached Python notebooks for a better understanding.
- One Way ANOVA Test.ipynb: This demonstrates the One-Way ANOVA test on real-time COVID-19 data.
- Two Way ANOVA Test.ipynb: This demonstrates the Two-Way ANOVA test on real-time COVID-19 data.
Download the dataset from the above link and store the files in the data folder. Then run whichever Python notebook matches the task you want.
Dependencies:
- pandas: 1.0.1
- numpy: 1.18.1
- scipy: 1.4.1
- statsmodels: 0.11.0
- matplotlib: 3.1.3
- seaborn: 0.10.0
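To compare your environment against the versions listed above, a small check like the following may help. It is only a sketch: the `PINNED` dictionary and the `check_versions` helper are names made up for this example and are not part of the repository, and it assumes Python 3.8 or newer, where `importlib.metadata` is in the standard library.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the Dependencies list in this README.
PINNED = {
    "pandas": "1.0.1",
    "numpy": "1.18.1",
    "scipy": "1.4.1",
    "statsmodels": "0.11.0",
    "matplotlib": "3.1.3",
    "seaborn": "0.10.0",
}


def check_versions(pinned=PINNED):
    """Return {package: (listed_version, installed_version_or_None)}."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in check_versions().items():
        status = "ok" if installed == wanted else "differs"
        print(f"{name}: listed {wanted}, installed {installed} ({status})")
```

A "differs" result does not necessarily mean the notebooks will fail; the versions above are simply the ones the code was tested with.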
- I've published a comprehensive case study on the ANOVA test using real-time COVID-19 data. You can refer to this link for more details.
- What is ANOVA-Test
- Assumptions of ANOVA-Test
- Types of ANOVA-Test
- What are Post-hoc tests; Tukey HSD test
- Implementation of One-Way and Two-Way ANOVA tests using statsmodels and OLS models.
This project is open-source and distributed under the MIT License. Feel free to use and modify the code as needed.
If you encounter any issues or have suggestions for improvement, please open an issue in the Issues section of this repository.
The code has been tested on Windows. It should work on other operating systems but has not yet been tested there. If you run into any issue with installation or otherwise, please contact me on LinkedIn.
I’m a seasoned Data Scientist and founder of TowardsMachineLearning.Org. I've worked on various Machine Learning, NLP, and cutting-edge deep learning frameworks to solve numerous business problems.