Exploring World Development Indicators: Identifying Relationships Between Health Indicators Using Linear Regression and Classifying Income Groups from Health Indicators Using Logistic Regression
This project explores the World Development Indicators (WDI) dataset sourced from the World Bank, which contains over 1,600 indicators for 217 countries and spans more than 50 years. The project applies machine learning techniques to analyze relationships between health indicators, such as infant nutrition and infant mortality rates.
Project Overview:
- Data Loading and Preprocessing: The dataset was loaded from Kaggle, unpacked, and preprocessed to handle null values.
- Model Selection: Linear Regression was chosen to model the relationship between infant nutrition indicators and infant mortality rates.
- Hyperparameter Tuning: Grid Search and Random Search were used to tune the model's regularization parameters and improve performance.
- Ethical and Professional Considerations: The project considers data reliability, interpretational challenges, fairness, the ethical sensitivity of analyzing infant mortality, and privacy. These were addressed through transparent reporting, suggestions to use non-linear models, and adherence to data protection laws.
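
The preprocessing, modeling, and tuning steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It assumes scikit-learn, uses synthetic data in place of the Kaggle WDI files, and invents the column names (`breastfeeding_pct`, `low_birthweight_pct`, `infant_mortality`). Median imputation is an assumed way of handling null values, and Ridge is an assumed choice of regularized linear regression, since the summary does not name either.

```python
import numpy as np
import pandas as pd
from scipy.stats import loguniform
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the WDI data; column names are hypothetical.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "breastfeeding_pct": rng.uniform(10, 90, n),
    "low_birthweight_pct": rng.uniform(3, 30, n),
})
df["infant_mortality"] = (
    60
    - 0.4 * df["breastfeeding_pct"]
    + 1.5 * df["low_birthweight_pct"]
    + rng.normal(0, 3, n)
)

# Blank out about 10% of one feature to mimic missing indicator values.
missing = rng.random(n) < 0.1
df.loc[missing, "breastfeeding_pct"] = np.nan

X = df[["breastfeeding_pct", "low_birthweight_pct"]]
y = df["infant_mortality"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Assumed preprocessing: median imputation, then standardization.
pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("model", Ridge()),
])

# Grid search over a fixed set of regularization strengths.
grid = GridSearchCV(
    pipe,
    {"model__alpha": [0.01, 0.1, 1.0, 10.0, 100.0]},
    cv=5,
    scoring="r2",
)
grid.fit(X_train, y_train)

# Random search samples alpha from a log-uniform distribution.
rand = RandomizedSearchCV(
    pipe,
    {"model__alpha": loguniform(1e-3, 1e3)},
    n_iter=20,
    cv=5,
    scoring="r2",
    random_state=0,
)
rand.fit(X_train, y_train)

# Held-out R^2 for the best model found by each search.
grid_r2 = grid.score(X_test, y_test)
rand_r2 = rand.score(X_test, y_test)
print("grid search:", grid.best_params_, round(grid_r2, 3))
print("random search:", rand.best_params_, round(rand_r2, 3))
```

Putting the imputer and scaler inside the pipeline means they are refit on each cross-validation fold, so statistics from the validation fold do not leak into training.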
Conclusion: The project emphasizes the importance of ethical guidelines, transparency, and rigorous validation in utilizing WDI datasets for meaningful insights.