-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
179 lines (141 loc) · 6.68 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "90%"
)
```
# bcpss
<!-- badges: start -->
[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![Project Status: Active -- The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
<!-- badges: end -->
The goal of bcpss is to make data from the Baltimore City Public School system more consistent and accessible to R users. This package may pair well with the [mapbaltimore package](https://github.com/elipousson/mapbaltimore) that offers a broader range of Baltimore-specific datasets and functions for working with that data.
## Installation
You can install the development version from [GitHub](https://github.com/) with:
``` r
# install.packages("remotes")
remotes::install_github("elipousson/bcpss")
```
## Example
```{r setup}
library(bcpss)
library(tidyverse)
theme_set(theme_minimal())
```
Currently, this package includes datasets that include school and grade-level enrollment and demographic data, the published results from a parent survey, and the published results from a combined student and educator survey completed in 2019. This data can be used to answer questions, such as, what are the elementary schools with the greatest total student enrollment?
```{r enrollment_demographics}
top_5_es <- enrollment_demographics_SY1920 |>
filter(
grade_range == "All Grades",
grade_band == "E"
) |>
select(school_number, school_name, total_enrollment) |>
top_n(5, total_enrollment) |>
arrange(desc(total_enrollment))
top_5_es_caption <- "Five largest BCPSS elementary schools by total enrollment"
knitr::kable(top_5_es, caption = top_5_es_caption)
```
Both the enrollment/demographic data and the parent survey are available in both a wide and long format.
The package also includes spatial data for elementary school attendance zones and program locations for the 2020-2021 school year.
```{r bcps_es_zones}
bcps_es_zones_SY2021 |>
ggplot() +
geom_sf(aes(fill = zone_name)) +
scale_fill_viridis_d() +
guides(fill = "none") +
labs(title = "BCPSS Elementary School Attendance Zones")
```
These two sources can be used in combinations by joining the `program_number` in the spatial data with the equivalent `school_number` used in the survey and demographic data.
```{r top_5_es_map}
top_5_es_map <- bcps_programs_SY2021 |>
left_join(top_5_es, by = c("program_number" = "school_number")) |>
filter(!is.na(total_enrollment)) |>
ggplot() +
geom_sf(data = bcps_es_zones_SY2021, fill = NA, color = "darkblue") +
geom_sf(aes(color = school_name)) +
geom_sf_label(aes(label = program_name_short, fill = school_name), color = "white") +
scale_fill_viridis_d(end = 0.85) +
guides(fill = "none", color = "none") +
labs(title = top_5_es_caption)
top_5_es_map
```
The `bcpss_enrollment` data is a subset of the statewide data available through the `{marylandedu}` package (a tidied version of data downloads available from the Maryland State Department of Education).
Using the `marylandedu::md_nces_directory` data, you can summarise enrollment by year and grade span:
```{r}
baltimore_nces_directory <- marylandedu::md_nces_directory |>
select(year, lss_name, school_number, grade_span)
bcpss_enrollment_summary <- bcpss_enrollment |>
dplyr::filter(
school_number != 0,
race == "All",
grade_range == "All Grades"
) |>
left_join(
baltimore_nces_directory,
by = join_by(lss_name, year, school_number)
) |>
summarise(
n_schools = n_distinct(school_number),
enrolled_count_mean = mean(enrolled_count, na.rm = TRUE),
enrolled_count_total = sum(enrolled_count, na.rm = TRUE),
.by = c(lss_name, year, grade_span)
) |>
filter(
# Exclude missing and uncommon grade span values
!is.na(grade_span),
!(grade_span %in% c("EMH", "MH"))
)
```
The summary data can be plottted:
```{r}
# Create a convenience plotting function
bcpss_enrollment_summary_plot <- function(data = NULL, mapping = aes(), ...) {
ggplot(data = data, mapping = mapping) +
geom_point() +
geom_line() +
labs(
x = "Year",
...,
color = "Grade span",
caption = paste0(
"Note that schools with ",
knitr::combine_words(c("missing", "EMH", "MH")),
" grade spans are excluded.\nData: Maryland State Department of Education."
)
) +
scale_y_continuous(labels = scales::label_number()) +
scale_color_viridis_d(end = 0.85)
}
bcpss_enrollment_summary |>
bcpss_enrollment_summary_plot(
aes(x = year, y = enrolled_count_mean, color = grade_span),
y = "Average enrollment",
title = "Average Baltimore City public school enrollment by grade span, 2003-2023"
)
```
Note that this summary is incomplete without the accompanying total enrollment count showing the shift from elementary to elementary middle schools
```{r}
bcpss_enrollment_summary |>
bcpss_enrollment_summary_plot(
aes(x = year, y = enrolled_count_total, color = grade_span),
y = "Total enrollment",
title = "Total Baltimore City public school enrollment by grade span, 2003-2023"
)
```
## Related projects
### U.S. Education data
- [educationdata](https://github.com/UrbanInstitute/education-data-package-r): Retrieve data from the Urban Institute\'s [Education Data API](https://educationdata.urban.org/) as a `data.frame` for easy analysis.
- [EdSurvey](https://www.air.org/project/nces-data-r-project-edsurvey): EdSurvey is an R statistical package designed for the analysis of national and international education data from the National Center for Education Statistics (NCES).
- [edbuildr](https://github.com/EdBuild/edbuildr): Import EdBuild's master dataset of school district finance, student demographics, and community economic indicators for every school district in the United States.
- [Elementary School Operating Status + NCES 2019-2020 School District Boundaries](https://github.com/hrbrmstr/2021-esos-nces)
### Baltimore City data
- [mapbaltimore](https://elipousson.github.io/mapbaltimore/)
- [baltimoredata](https://elipousson.github.io/baltimoredata/)
- [mapmaryland](https://elipousson.github.io/mapmaryland/)
### Other local area education data
- [CPSenrollpack](https://github.com/cymack/CPSenrollpack): "R package of enrollment data for Chicago Public High Schools, 2006-07 to 2018-19"