-
Notifications
You must be signed in to change notification settings - Fork 3
/
README.Rmd
207 lines (161 loc) · 7.31 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
---
output: github_document
---
```{r, echo = FALSE, message = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "figures/"
)
devtools::load_all()
```
# General Bikeshare Feed Specification <img src="https://raw.githubusercontent.com/simonpcouch/gbfs/main/figures/hex.png" alt="A hexagonal logo with a black border and white background showing a green bicycle icon." align="right" width=280 />
[![CRAN status](https://www.r-pkg.org/badges/version/gbfs)](https://cran.r-project.org/package=gbfs)
[![R build status](https://github.com/simonpcouch/gbfs/workflows/R-CMD-check/badge.svg)](https://github.com/simonpcouch/gbfs/actions)
[![Testing Coverage](https://codecov.io/gh/simonpcouch/gbfs/branch/main/graph/badge.svg)](https://codecov.io/gh/simonpcouch/gbfs?branch=main)
[![CRAN Downloads](https://cranlogs.r-pkg.org/badges/grand-total/gbfs)](https://cran.r-project.org/package=gbfs)
The `gbfs` package supplies a set of functions to interface with General
Bikeshare Feed Specification .json feeds in R, allowing users to save
and accumulate tidy .rds datasets for specified cities/bikeshare programs.
The North American Bikeshare Association's [gbfs](https://github.com/MobilityData/gbfs)
is a standardized data release format for live information on the status
of bikeshare programs, as well as metadata, including counts of bikes at
stations, ridership costs, and geographic locations of stations and
parked bikes.
__Features__
* Get bikeshare data by specifying city name or supplying url of feed
* All feeds for a city can be saved with a single function
* New information from dynamic feeds can be appended to existing datasets
## Installation
We're on CRAN! Install the latest release with:
```{r cran-installation, eval = FALSE}
install.packages("gbfs")
library(gbfs)
```
You can install the developmental version of `gbfs` from GitHub with:
```{r gh-installation, eval = FALSE}
# install.packages("devtools")
devtools::install_github("simonpcouch/gbfs")
```
## Background
The `gbfs` is a standardized data feed describing the current status of a
bikeshare program.
Although all of the data is live, only a few of the datasets change often:
* `station_status`: Supplies the number of available bikes and
docks at each station as well as station availability
* `free_bike_status`: Gives the coordinates and metadata on available
bikes that are parked, but not at a station.
In this package, these two datasets are considered "dynamic", and can be
specified as desired datasets by setting `feeds = "dynamic"` in the
main wrapper function in the package, `get_gbfs`.
Much of the data supplied in this specification can be considered static. If you
want to grab all of these for a given city, set `feeds = "static"` when calling
`get_gbfs`. Static feeds include:
* `system_information`: Basic metadata about the bikeshare program
* `station_information`: Information on the capacity and coordinates of stations
* Several optional feeds: `system_hours`, `system_calendar`, `system_regions`,
`system_pricing_plans`, and `system_alerts`
Each of the above feeds can be queried with the `get_suffix` function, where
`suffix` is replaced with the name of the relevant feed.
For more details on the official `gbfs` spec, see
[this document](https://github.com/MobilityData/gbfs/blob/master/gbfs.md).
## Example
In this example, we'll grab data from Portland, Oregon's Biketown bikeshare
program and visualize some of the different datasets.
```{r, warning = FALSE, message = FALSE}
# load necessary packages
library(tidyverse)
```
First, we'll grab some information on the stations.
```{r}
# grab portland station information and return it as a dataframe
pdx_station_info <-
get_station_information("https://gbfs.lyft.com/gbfs/1.1/pdx/gbfs.json")
# check it out!
glimpse(pdx_station_info)
```
...as well as the number of bikes at each station.
```{r}
# grab current capacity at each station and return it as a dataframe
pdx_station_status <-
get_station_status("https://gbfs.lyft.com/gbfs/1.1/pdx/gbfs.json")
# check it out!
glimpse(pdx_station_status)
```
Just like that, we have two tidy datasets containing information about
Portland's bikeshare program.
Joining these datasets, we can get the capacity at each station, along with each
station's metadata.
```{r}
# full join these two datasets on station_id and select a few columns
pdx_stations <- full_join(pdx_station_info,
pdx_station_status,
by = "station_id") %>%
# just select columns we're interested in visualizing
select(id = station_id,
lon,
lat,
num_bikes_available,
num_docks_available) %>%
mutate(type = "docked")
```
Finally, before we plot, lets grab the locations of the bikes parked in
Portland that are not docked at a station,
```{r}
# grab data on free bike status and save it as a dataframe
pdx_free_bikes <-
get_free_bike_status("https://gbfs.lyft.com/gbfs/1.1/pdx/gbfs.json",
output = "return") %>%
# just select columns we're interested in visualizing
select(id = bike_id, lon, lat) %>%
# make columns analogous to station_status for row binding
mutate(num_bikes_available = 1,
num_docks_available = NA,
type = "free")
```
...and bind these dataframes together!
```{r}
# row bind stationed and free bike info
pdx_full <- bind_rows(pdx_stations, pdx_free_bikes)
```
Now, plotting,
```{r plot}
# filter out stations with 0 available bikes
pdx_plot <- pdx_full %>%
filter(num_bikes_available > 0) %>%
# plot the geospatial distribution of bike counts
ggplot() +
aes(x = lon,
y = lat,
size = num_bikes_available,
col = type) +
geom_point() +
# make aesthetics slightly more cozy
theme_minimal() +
scale_color_brewer(type = "qual")
```
```{r print-plot, eval = FALSE}
pdx_plot
```
<a href='https://github.com/simonpcouch/gbfs'><img src='https://raw.githubusercontent.com/simonpcouch/gbfs/main/figures/plot-1.png' height="500" /></a>
Folks who have spent a significant amount of time in Portland might be able to
pick out the Willamette River running Northwest/Southeast through the city.
With a few lines of `gbfs`, `dplyr`, and `ggplot2`, we can put together a
meaningful visualization to help us better understand how bikeshare bikes
are distributed throughout Portland.
Some other features worth playing around with in `gbfs` that weren't touched
on in this example:
* The main wrapper function in the package, `get_gbfs`, will grab every
dataset for a given city. (We call the functions to grab individual datasets
above for clarity.)
* In the above lines, we output the datasets as returned dataframes. If you'd
rather save the output to your local files, check out the `directory` and
`return` arguments.
* When the `output` argument is left as default in `get_free_bike_status` and
`get_station_status` (the functions for the `dynamic` dataframes,)
and a dataframe already exists at the given path, `gbfs` will row bind the
dataframes, allowing for the capability to accumulate large datasets over time.
* If you're not sure if your city supplies `gbfs` feeds, you might find the
`get_gbfs_cities` and `get_which_gbfs_feeds` functions useful.
## Contributing
Please note that the `gbfs` R package is released with a [Contributor Code of Conduct](CONTRIBUTING.md). By contributing to this project, you agree to abide by its terms.