Life Expectancy Analysis On Different Countries

Description

Using the "Life Expectancy (WHO)" dataset, we explore the correlations among its 22 variables.

We also trained linear regression models to predict life expectancy (in years).
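As a rough illustration of the two steps described above (a correlation matrix followed by a linear regression), here is a minimal sketch. It is not the notebook's actual code: the data is randomly generated rather than taken from the WHO dataset, and the choice of "Schooling" and "Adult Mortality" as predictors is an assumption made only for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the WHO data: these values are generated, not real.
rng = np.random.default_rng(0)
n = 200
schooling = rng.uniform(4, 18, n)
adult_mortality = rng.uniform(50, 400, n)
life_expectancy = (
    55 + 1.5 * schooling - 0.03 * adult_mortality + rng.normal(0, 1.5, n)
)

df = pd.DataFrame({
    "Schooling": schooling,
    "Adult Mortality": adult_mortality,
    "Life Expectancy": life_expectancy,
})

# Pairwise Pearson correlations between all numeric columns.
corr = df.corr()

# Hold out part of the data so the score reflects unseen rows.
X = df[["Schooling", "Adult Mortality"]]
y = df["Life Expectancy"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
r2 = model.score(X_test, y_test)  # coefficient of determination on the test split

print(corr.round(2))
print(f"Test R^2: {r2:.3f}")
```

With the real dataset, `df` would instead be loaded from the CSV file, and the predictor columns would be chosen based on the correlation analysis.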

Dataset Metadata

Country
Year
Status: Developed or Developing status (for each country)
Life Expectancy (in years)
Adult Mortality: rates of both sexes (probability of dying between 15 and 60 years per 1000 population)(%)
infant deaths: Number of Infant Deaths per 1000 population
Alcohol: recorded per capita (15+) consumption (in litres of pure alcohol)
percentage expenditure: Expenditure on health as a percentage of Gross Domestic Product per capita(%)
Hepatitis B: HepB immunization coverage among 1-year-olds (%)
Measles: number of reported cases per 1000 population
BMI: Average Body Mass Index of entire population
under-five deaths: Number of under-five deaths per 1000 population
Polio: Pol3 immunization coverage among 1-year-olds (%)
Total expenditure: General government expenditure on health as a percentage of total government expenditure (%)
Diphtheria: diphtheria tetanus toxoid and pertussis (DTP3) immunization coverage among 1-year-olds (%)
HIV/AIDS: Deaths per 1000 live births HIV/AIDS (0-4 years)
GDP: Gross Domestic Product per capita (in USD)
Population
thinness 1-19 years: Prevalence of thinness among children and adolescents for Age 10 to 19 (%)
thinness 5-9 years: Prevalence of thinness among children for Age 5 to 9 (%)
income composition of resources: Human Development Index in terms of income composition of resources (index ranging from 0 to 1)
Schooling (in years)

Getting Started

You can try this code yourself by opening Google Colab, choosing "File" > "Open notebook" > "GitHub", and entering the URL for this project. Then select the notebook file that is listed there.

When the code runs in a notebook environment, two files are generated: "fill_missing_gdp.csv" and "fill_missing_population.csv". You may fill them in with real data so that the code uses it instead of removing these records.
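The fill-in step could work roughly as sketched below. This is an assumption about the mechanism, not the notebook's actual implementation: the column layout of "fill_missing_gdp.csv" (here `Country`, `Year`, `GDP`) is hypothetical, and the tiny in-memory tables stand in for the real files.

```python
import numpy as np
import pandas as pd

# Tiny made-up sample standing in for the main dataset.
data = pd.DataFrame({
    "Country": ["A", "A", "B", "B"],
    "Year": [2014, 2015, 2014, 2015],
    "GDP": [1200.0, np.nan, np.nan, 830.0],
})

# Hypothetical contents of a hand-completed "fill_missing_gdp.csv".
# Assumed layout: one row per (Country, Year), value filled in or left blank.
filled = pd.DataFrame({
    "Country": ["A", "B"],
    "Year": [2015, 2014],
    "GDP": [1250.0, np.nan],
})

# Bring the user-supplied values alongside the original ones.
merged = data.merge(
    filled, on=["Country", "Year"], how="left", suffixes=("", "_filled")
)

# Use the supplied value only where the original is missing.
merged["GDP"] = merged["GDP"].fillna(merged["GDP_filled"])
merged = merged.drop(columns="GDP_filled")

# Records that are still missing after the fill are removed.
cleaned = merged.dropna(subset=["GDP"]).reset_index(drop=True)
print(cleaned)
```

In this sketch, the record for country "A" in 2015 is kept because a value was supplied, while the record for country "B" in 2014 is dropped because it was left blank.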

Tools

Python, Pandas, Data Visualization (Matplotlib, Seaborn), scikit-learn (for training and evaluating models)

Improvements

Understand why the model performs so well on the test data (there may be data leakage)
Impute missing data
Search for better ways to treat the data
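One way to approach the first two items is sketched below. It rests on two assumptions that the project has not confirmed: that leakage could come from the same country appearing in both the training and test sets, and that median imputation is an acceptable way to handle missing values. The data is synthetic, and the country labels `C0` to `C19` are placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GroupShuffleSplit
from sklearn.pipeline import make_pipeline

# Synthetic panel: 20 placeholder countries with 10 yearly rows each.
rng = np.random.default_rng(0)
countries = np.repeat([f"C{i}" for i in range(20)], 10)
schooling = rng.uniform(4, 18, countries.size)
life_expectancy = 55 + 1.5 * schooling + rng.normal(0, 1.5, countries.size)

# Knock out roughly 10% of the predictor values to simulate missing data.
schooling_obs = schooling.copy()
schooling_obs[rng.random(countries.size) < 0.1] = np.nan

X = pd.DataFrame({"Schooling": schooling_obs})
y = pd.Series(life_expectancy, name="Life Expectancy")

# Split by country so that no country contributes rows to both sides.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=countries))

# Inside a pipeline, the imputer learns its median from the training rows only.
model = make_pipeline(SimpleImputer(strategy="median"), LinearRegression())
model.fit(X.iloc[train_idx], y.iloc[train_idx])
r2 = model.score(X.iloc[test_idx], y.iloc[test_idx])

shared = set(countries[train_idx]) & set(countries[test_idx])
print(f"Countries in both splits: {len(shared)}")
print(f"Test R^2 with grouped split: {r2:.3f}")
```

If the score on the real data drops noticeably under a grouped split compared with a random row-wise split, that would suggest the original evaluation benefited from seeing the same countries during training.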

Notes

This project was initially developed during ADA's Data Science Path course - Statistics II module, along with collaborators.
