---
title: "Effect of Population Density on Life Expectancy"
author: "Gustavo Arruda Franco"
date: "`r lubridate::today()`"
output: github_document
---
## Settings
```{r global_options}
# The knitr chunk option is `warning` (singular); `warnings` is silently ignored
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
library(gapminder)
library(dplyr)
library(geonames)
library(countrycode)
library(ggplot2)
library(ggthemes)
theme_set(theme_fivethirtyeight())
```
## Code
The following code retrieves the relevant data frames from an R package (`gapminder`) and the GeoNames API, then tidies and joins them.
```{r data}
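# gapminder ships as a tibble; convert it to a plain data frame
gapminder <- data.frame(gapminder)
# GNcountryInfo() needs a GeoNames username set via options(geonamesUsername = ...)
stopifnot(!is.null(getOption("geonamesUsername")))
countryInfo <- data.frame(GNcountryInfo())
# Column 13 is expected to hold the ISO 3166-1 alpha-2 code; convert it to an English country name
countryJoin <- countrycode(countryInfo[, 13], origin = "iso2c", destination = "country.name")
countryInfo <- mutate(countryInfo, country = countryJoin)
# Join on country name, then compute density after coercing areaInSqKm to numeric
gapminder_density <- left_join(gapminder, countryInfo, by = "country") %>%
  mutate(popDensity = pop / as.numeric(areaInSqKm))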
```
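Because the join above matches on country names, any spelling difference between the two sources leaves `areaInSqKm` missing and `popDensity` as `NA`. The sketch below shows one way to check for that with `dplyr::anti_join()`. It uses small toy data frames (`gap_toy` and `info_toy` are hypothetical names, and the values are made up) so that it runs without a GeoNames account; it is not evaluated as part of this document.

```r
library(dplyr)

# Toy stand-ins for gapminder and countryInfo (illustrative values only)
gap_toy <- data.frame(
  country = c("Brazil", "Korea, Rep.", "Japan"),
  pop = c(1000, 2000, 3000),
  stringsAsFactors = FALSE
)
info_toy <- data.frame(
  country = c("Brazil", "South Korea", "Japan"),
  areaInSqKm = c("10", "20", "30"),
  stringsAsFactors = FALSE
)

# Rows of gap_toy that find no partner in info_toy
unmatched <- anti_join(gap_toy, info_toy, by = "country")
print(unmatched$country)

# The same join and density calculation as in the data chunk
joined <- left_join(gap_toy, info_toy, by = "country") %>%
  mutate(popDensity = pop / as.numeric(areaInSqKm))

# Unmatched rows end up with a missing density
n_missing <- sum(is.na(joined$popDensity))
print(n_missing)
```

Running the same `anti_join()` on `gapminder` and `countryInfo` would list the countries, if any, that drop out of the density analysis.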
## Analysis
```{r analysis}
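# Life expectancy against population density, with a smoothed trend line
ggplot(gapminder_density, aes(x = popDensity, y = lifeExp)) +
  geom_jitter(alpha = 0.1) +
  geom_smooth() +
  labs(x = "Population density (inhabitants per sq km)", y = "Average life expectancy",
       title = "Life Expectancy by Population Density")

# Same plot faceted by continent; continent.x is gapminder's column,
# suffixed by the join because countryInfo also has a continent column
ggplot(gapminder_density, aes(x = popDensity, y = lifeExp)) +
  geom_jitter(alpha = 0.1) +
  facet_wrap(~ continent.x) +
  geom_smooth()

# Simple linear regression of life expectancy on population density
gapminder_density_mod <- lm(lifeExp ~ popDensity, data = gapminder_density)
summary(gapminder_density_mod)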
```
From the "Life Expectancy by Population Density" chart and the coefficients of the linear regression model, population density and life expectancy appear to be positively related: the estimated slope is small, but the association appears consistent. However, most observations are clustered at the low end of the density scale, so the fit at higher densities rests on comparatively few points. When the observations are faceted by continent, the smoothed curves for Europe, the Americas, and Africa show a sharp increase, a valley, and then another sharp increase. Asia looks different, with no valley and a much longer tail toward high population densities. Oceania shows no visible correlation.
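Since most points sit at the low end of the density scale, one option is to compare the raw-scale fit with a fit on a log scale and to look at the confidence interval of the slope rather than only its point estimate. The sketch below does this on simulated data (the `toy` data frame and its parameters are assumptions chosen for illustration, not the gapminder results) and is not evaluated as part of this document.

```r
# Simulated data for illustration only; not the gapminder results
set.seed(1)
toy <- data.frame(popDensity = rexp(200, rate = 1 / 100))
toy$lifeExp <- 55 + 4 * log10(toy$popDensity + 1) + rnorm(200, sd = 5)

# Linear fit on the raw density scale, as in the analysis chunk
mod_raw <- lm(lifeExp ~ popDensity, data = toy)

# Same model on a log10 density scale, which spreads out the low-density cluster
mod_log <- lm(lifeExp ~ log10(popDensity), data = toy)

# Slope estimates with 95% confidence intervals
ci_raw <- confint(mod_raw)["popDensity", ]
ci_log <- confint(mod_log)["log10(popDensity)", ]
print(ci_raw)
print(ci_log)

# Share of variance explained by each fit
r2_raw <- summary(mod_raw)$r.squared
r2_log <- summary(mod_log)$r.squared
print(c(raw = r2_raw, log = r2_log))
```

Applied to `gapminder_density`, rows with missing or zero density would need to be dropped before taking the logarithm.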
## Session info
```{r session_info}
devtools::session_info()
```