Antonio Paez (School of Earth, Environment and Society, McMaster
University, Canada)
email: paezha@mcmaster.ca
Anastasia Soukhov (School of Earth, Environment and Society, McMaster
University, Canada)
email: soukhoa@mcmaster.ca
Accessibility measures are widely used to summarize the ease of reaching potential destinations. As such, they combine, into a single summary measure, properties of the land use system, on the one hand, and the transportation system and travel behavior on the other. Defined as the weighted sum of the opportunities that can be reached given the cost of movement, accessibility is used in transportation planning, health planning, economic analysis, etc.
This workshop introduces spatial availability. Much like accessibility, spatial availability measures the ease of reaching potential destinations. However, unlike accessibility, it makes opportunities available uniquely to members of the population. For example, a job, once it is available to someone, it is no longer available to somebody else. In effect, spatial availability is a singly-constrained accessibility measure that preserves the number of opportunities.
In this workshop, we explain the intuitions behind spatial availability and describe the mechanisms to implement it. A key to this is the idea of proportional allocation, and the use of proportional allocation factors.
The use of proportional allocation factors as a mechanism for constraining the spatial availability means that the results are easier to interpret than those obtained from accessibility analysis, and they are more intuitive as well.
One exercise is provided, meant to be solved by hand. The workshop
finishes with a practical example of implementation in R
. Data from a
real survey in the Greater Toronto and Hamilton Area and use of package
{accessibility} give hands-on practice that can serve as a launching pad
for your own experiments and applications.
-
Packages
-
{accessibility} (Install from CRAN).
-
{dplyr} (Install from CRAN).
-
{ggplot} (Install from CRAN).
-
{leaflet} (Install from CRAN).
-
{patchwork} (Install from CRAN).
-
{sf} (Install from CRAN).
-
Data (sourced from the Transportation Tomorrow Survey of the Greater Toronto and Hamilton Area)
-
{TTS2016R} (Install from GitHub):
remotes::install_github("soukhova/TTS2016R")
A reproducible environment was created using {renv}. The repository provides all the infrastructure to replicate the environment used to create the workshop.
The workshop was developed using R
4.3.2 (“Eye Holes”).
How to use this repository. Install the appropriate version of R
(make
sure to pick the correct operating system) - this is the programming
language, it comes as a ‘core package’ here: https://cran.rstudio.com/
Install RStudio (make sure to pick the correct operating system) - this is the an IDE here: https://www.rstudio.com/products/rstudio/download/.
Download the Rtools43 installer (Windows) and run it to install: https://cran.r-project.org/bin/windows/Rtools/rtools43/rtools.html. If using a Mac this may be achieved through installing Xcode (we don’t have Macs so not sure if this works, but it should!): https://mac.r-project.org/tools/
Download the code as a .zip file from this repository:
https://github.com/paezha/Workshop-Spatial-Availability.
Alternatively, if you work with GitHub, you can fork the repository. The
repository contains a renv.lock
file that specifies all the versions
of the packages used in the workshop.
If you downloaded the repository as a .zip file, unzip the file and store it in an apprioriate directory (i.e., NOT the Downloads folder). Use this folder to work on the workshop.
Double click the file called Workshop-Spatial-Availability.Rproj
. This
is be the R
project for the workshop. This will launch RStudio.
You will see a message in your console saying that your library is out
of synch with the lock file. To ‘restore’ the state of the project to
match the versions specified in the renv.lock
file run the following:
renv::restore()
Run renv::status()
in the R
Console. The output should show a table
where the second and third columns are populated with “y” characters.
This means the versions of packages on your system matches that
specified by the renv.lock file. Congrats!
- Accessibility.
- Impedance function.
- Gravity model.
- Spatial availability.
- Proportional allocation.
- Congestion.
- Transportation Tomorrow Survey.
The main role of land use systems is to organize activities in space. Transportation provides the means to connect spatially disparate activities. Often, the output of these systems is examined in isolation. For example, the operation of transportation systems is measured in units of length, units of mobility (the potential for movement), or in terms of realized movement (such as VKT, PKT, or t-km).
A more holistic approach to understand the efficiency of transportation systems is the concept of accessibility (Handy and Niemeier 1997). Accessibility is a summary measure of the joint performance of a land use-transportation system, and it quantifies the potential to reach spatially disperse opportunities.
Consider the following situation, with a single population center and a single center where opportunities (e.g., jobs) are located.
How accessible are the opportunities? It depends on the travel behavior under a given transportation system.
People may be unwilling to travel too far for some opportunities, even if those opportunities seem essential (i.e., no one travels on a daily basis from Toronto for a job in Vancouver.) And for some people, the cost of reaching opportunities may exceed their willingness to pay.
The travel behavior of a population is modelled using an impedance function (also sometimes called a distance-decay function, or a deterrence function.)
An impedance function captures the effect of the friction of movement: it is a well-known regularity that most people prefer to spend less time/effort/money travelling than more. The impedance function models the propensity to travel by the cost/length/duration of trips. At some point, that propensity becomes zero, when the cost of reaching the opportunity outweighs the benefits that the opportunity brings.
A number of different impedance functions have been used in the literature. The simplest is a binary function as follows:
In the above, is the cost of reaching destination from origin , and is a threshold, that is, the tolerance for long trips. What kind of behavior does this function represent?
This function has been used in numerous studies, including:
- Kawabata and Shen (2006) chose these values because they approximate the mean travel time in the region.
- Korsu and Wenglenski (2010)
- Lin, Chen, and Hsieh (2016)
- Boisjoly, Moreno-Monroy, and El-Geneidy (2017)
An alternative is the inverse power function:
This function was used, for example, by Matas, Raymond, and Roig (2010) for Barcelona and Madrid, with the travel time in minutes and :
The negative exponential function is as follows:
Cheng and Bertolini (2013) used this measure for the case of Amsterdam, with and :
Accessibility is defined as a weighted sum of opportunities:
Given the spatial distribution of opportunities , the propensity to travel , and the cost of travel in a given transportation system , accessibility summarizes the opportunity landscape.
Back to our toy system:
What is the accessibility in these cases?
Modify the set up as follows. How does accessibility change?
What happens if we add a population center? How does accessibility change?
Back to our initial system. Let us ask a slightly different question. How many jobs are taken?
The impedance function models the propensity to travel, and at some point it drops to zero. But that is a summary for the population, and depends on what opportunities are available.
How many jobs are taken in each of these cases?
Could it be that all those jobs remain vacant? What if instead those available jobs were allocated to people who are willing to reach them?
Accessibility is low because it is expensive to reach the opportunities. But, the jobs are there and people are willing to reach them.
The number of jobs available equals the total number of jobs in the region. That is because jobs are allocated proportionally.
Proportional allocationo of opportunities means that, unlike accessibility, the number of opportunities available can change when there are new population centers.
Let us see how this works.
- O2 has opportunities that are available to both P1 and P2. But P1 can reach those opportunities with less “friction” (it is closer).
The total “friction” (impedance) of reaching O2 is:
Jobs are allocated preferentially to those who can reach them with lower friction. Then the jobs available to P1 from O2 are:
And, the jobs available to P2 from O2 are:
The total “friction” (impedance) of reaching O1 is:
Again, opportunities are allocated preferentially to those who can reach them with lower friction. Then the jobs available to P1 from O1 are:
And, the jobs available to P2 from O1 are:
The opportunities available to P1 are as follows:
The opportunities available to P2 are as follows:
And so, the total jobs available are:
The proportional allocation mechanism constrains the available opportunities to match the total number of opportunities in the system. Compare to the accessibility:
Formalizing this, we see that spatial availability, call it , is the weighted and constrained sum of opportunities:
where:
Compare to accessibility:
Let us change the set up to illustrate a different point.
Since the impedance is the same, the previous proportional allocation mechanism would have us give 100 opportunities to P1 and 100 opportunities to P2, which seems a bit unfair, not to mention wasteful.
To improve on this situation, opportunities will be allocated proportionally based on demand.
The proportion of the total population at P1 is:
The proportion of the total population at P2 is:
In this way, the jobs available to P1 from O1 is:
And the jobs available to P2 from O1 is:
This mechanism for allocating opportunities is impartial with respect to the population: the ratio of opportunities to population is identical, as expected:
It also satisfies the following:
As we can verify, the sum of available opportunities is preserved:
When we formalize this we see again that spatial availability is a weighted sum of the opportunities, constrained to match the total:
We have described two proportional allocation mechanisms:
-
By the cost of interaction (through the impedance function)
-
By the size of the population
These are consistent with the principles of spatial interaction (e.g., the gravity model):
-
The friction of space
-
The importance of size (mass)
We call these quantities proportional allocation factors:
and:
The two factors are combined as follows:
to give the following expression for spatial availability:
Compare again to accessibility:
Spatial availability is a measure of the number of opportunities that are available to a location from all reachable destinations. It is a singly constrained measure of accessibility.
- It constrains the sum of available opportunities to match the total number of opportunities in the region
- It is easier to interpret than accessibility
- It provides intuitive results
- It gives a natural baseline for equity analysis
This exercise is meant to be solved by hand. This is a great way to develop a good feeling for the mechanics of the method. See the figure below. It represents a tiny system with only two population centers ( and ) and two employment centers ( and ).
For the exercise use the following impedance function for , where is trip duration in minutes:
Assume that the number of people and jobs at each location is as shown below:
Further, the travel times in minutes between these locations are:
As a reminder, accessibility is:
- What is the global jobs/population ratio in this system?
- What does min represent?
- Calculate the impedance for each population center-employment center pair.
- Calculate the accessibility.
- Tabulate for each population center-employment center pair. What do these values represent?
- Tabulate for each population center. What do these values represent?
- Tabulate for each population center-employment center pair. What do these values represent?
- Calculate the jobs available from each employment center to each population center (i.e., ).
- Calculate the jobs available at each population center (i.e., ).
- Verify that
- Calculate the jobs per capita at each population center (i.e., )
- Compare to the global jobs/population ratio.
- In which of the two population centers is the risk of unemployment higher? Discuss.
- What do you think would happen if was less than one or greater than one? Explain.
The spatial availability is as follows:
Next, suppose that there is an upgrade in the transportation system that shortens the duration of trips between , so the new travel times are:
- Does the accessibility of change? Discuss.
- Explain the changes to spatial availability and spatial availability per capita compared to those calculated using the earlier travel times.
The spatial availability is now as follows:
For this practical example we will use data from the 2016 edition of the Transportation Tomorrow Survey (TTS). This is a comprehensive travel survey conducted in the Greater Toronto and Hamilton Area every five years, and it includes rich information about travel patterns and the socio-demographic composition of the population in the region. For convenience, parts of the survey have been sourced in {TTS2016R}, an open data product. The data are augmented with other information, such as travel time tables, and distributed as analysis-ready data objects.
We will begin by loading the packages used in this example.
library(accessibility) # Transport Accessibility Measures # Transport Accessibility Measures
library(dplyr) # A Grammar of Data Manipulation # A Grammar of Data Manipulation # A Grammar of Data Manipulation
library(ggplot2) # Create Elegant Data Visualisations Using the Grammar of Graphics # Create Elegant Data Visualisations Using the Grammar of Graphics # Create Elegant Data Visualisations Using the Grammar of Graphics
library(leaflet) # Create Interactive Web Maps with the JavaScript 'Leaflet' Library # Create Interactive Web Maps with the JavaScript 'Leaflet' Library
library(patchwork) # The Composer of Plots # The Composer of Plots # The Composer of Plots
library(sf) # Simple Features for R # Simple Features for R
library(tidyr) # Tidy Messy Data # Tidy Messy Data
library(TTS2016R) # An augmented 2016 Transportation Tomorrow Survey (TTS) data package: worker and place of employment counts, trips and estimated travel time to work in the Greater Golden Horsehoe area, Canada # An augmented 2016 Transportation Tomorrow Survey (TTS) data package: worker and place of employment counts, trips and estimated travel time to work in the Greater Golden Horsehoe area, Canada
We will use two objects from the data package, information aggregated at the level of traffic analysis zones (TAZs) and an origin-destination table:
data("ggh_taz") # Traffic analysis zones.
data("od") # Origin-destination table.
The data sets are documented, and you can check the help files like so:
?ggh_ttz
?od
For the exercise we will use a slice of the data, so we will extract the parts of the data corresponding to the City of Toronto. According to the TTS Data Guide of TTS (p. 29), the identifiers of the TAZs in Toronto are 1-625. We will create a vector with those identifiers:
TO_taz <- c(1:625) |>
as.character()
Using the vector we just created we will filter the zones corresponding to the City of Toronto, and then select the three variables that will be used in the example, namely the zone identifier (using the GTA06 zoning system), the number of workers in the zone (work age population), and the number of jobs in the zone:
# Filter the traffic analysis zones in the City of Toronto.
lu <- ggh_taz |>
filter(GTA06 %in% TO_taz) |>
# Rename and select only the three variables needed for the example.
transmute(id = GTA06,
P = workers,
O = jobs)
This selection somewhat simplifies a characteristic of the data set, because not all the workers living in Toronto work there, and not all jobs in Toronto are taken by people who live in Toronto. Given this caveat, we can calculate the jobs/workers ratio in the region:
sum(lu$O)/sum(lu$P)
#> [1] 1.11989
A summary of the table shows that workers in Toronto tend to be more dispersed than jobs. Some TAZs have zero workers (there is no housing in the zone) and some have zero jobs (they are purely residential):
lu |>
# Drop the geometry before reporting the summary.
st_drop_geometry() |>
summary()
#> id P O
#> Length:625 Min. : 0 Min. : 0
#> Class :character 1st Qu.: 697 1st Qu.: 305
#> Mode :character Median :1471 Median : 643
#> Mean :1719 Mean : 1925
#> 3rd Qu.:2442 3rd Qu.: 1704
#> Max. :8491 Max. :41821
The overall jobs/workers ratio is probably a little bit optimistic
because Toronto tends to attract many commuting trips from beyond the
city boundaries. Since table lu
is a simple features object with
geometry, we can map the variables in the form of choroplet maps. The
next two chunks of code are for workers and jobs:
Pi_plot <- ggplot() +
geom_sf(data = lu,
aes(fill = P)) +
geom_sf(data = lu,
fill = NA) +
scale_fill_viridis_c(direction = 1) +
theme_void() +
ggtitle("Workers per TAZ")
Oi_plot <- ggplot() +
geom_sf(data = lu,
aes(fill = O)) +
geom_sf(data = lu,
fill = NA) +
scale_fill_viridis_c(direction = 1) +
theme_void() +
ggtitle("Jobs per TAZ")
The two maps can be plotted side-by-side using the syntax of package {patchwork}:
Pi_plot + Oi_plot
The maps show that jobs are highly concentrated in downtown Toronto. The number of inhabitants downtown is not as high, and there are places in the periphery of the core of the city and towards the edges with relatively high populations.
Next, we will prepare the travel time matrix by filtering TAZ that are in Toronto.
# Rename the table.
od_ggh <- od
# Filter the origin and destinations in the City of Toronto.
od <- od |>
filter(Origin %in% lu$id,
Destination %in% lu$id) |>
# Rename and select only the three variables needed for the example.
transmute(from_id = Origin,
to_id = Destination,
travel_time)
Travel times in the table are in minutes. The longest travel time between any origin and any destination is:
max(od$travel_time)
#> [1] 146
We will use an inverse power impedance function of the following form for the example:
We present the example with (try this value first; you can experiment with other values later if you wish). This is the shape of the curve with the initial value of :
# Parameter for inverse power impedance function.
beta <- 1
# Plot impedance function with {ggplot}
ggplot() +
geom_function(fun = function(x) 1/(abs(x)^beta),
xlim = c(1, max(od$travel_time))) +
xlab("t (travel time in min)") +
ylab("f(t)") +
theme_minimal()
Package {accessibility} includes functions to calculate both accessibility metrics and spatial availability. The documentation can be consulted like so:
?accessibility
The inputs are a travel matrix (which in this example is our od
table)
and a land use table (which in our example is the lu
table containing
the opportunities and demand).
We calculate the accessibility first. The function to do this is
accessibiliy::gravity()
. We need to specify the name of the columns
with the opportunities, the cost, as well as the impedance function
(here accessibility::decay_power()
):
# Use table `od` with the travel time information for origin-destination pairs.
Si <- gravity(travel_matrix = od,
# Use table `lu` with the land use information.
land_use_data = lu,
# Use column `O` for the opportunities; this column contains the number of jobs per zone.
opportunity = "O",
# Use column `travel_time` as the cost variable.
travel_cost = "travel_time",
# Use the power impedance function with parameter beta.
decay_function = decay_power(beta)) |>
# Rename the opportunities $S_i$ for accessibility.
rename(Si = O)
Next, we calculate the spatial availability. The inputs are very
similar, with the addition of a variable for the demand (in our case the
population). You will recall that the population is needed to calculate
the proportional allocation factor. The spatial availability can be
reported in a detailed table that gives
,
that is, the number of jobs available to each origin from each
destination. The alternative (detailed_results = FALSE
) returns the
spatial availability aggregated by zone.
# Use table `od` with the travel time information for origin-destination pairs.
Vij <- spatial_availability(travel_matrix = od,
# Use table `lu` with the land use information.
land_use_data = lu,
# Use column `O` for the opportunities; this column contains the number of jobs per zone.
opportunity = "O",
# Use column `travel_time` as the cost variable.
travel_cost = "travel_time",
# Use column `P` for the demand; this column contains the number of workers per zone.
demand = "P",
# Use the power impedance function with parameter beta.
decay_function = decay_power(beta),
# Return the detailed results.
detailed_results = TRUE) |>
# Rename the opportunities $V_i$ for spatial availability.
rename(Vij = O)
If the detailed results are requested, the spatial availability by origin can be computed as follows:
Vi <- Vij |>
# Group by origin; this means that the operations in summarize() will be over the destinations.
group_by(from_id) |>
# Sum the opportunities available from every destination
summarize(Vi = sum(Vij)) |>
# Rename the column to `id`
rename(id = from_id)
Join the results to the zoning system:
results <- lu |>
left_join(Si,
by = "id") |>
left_join(Vi,
by = "id")
Verify the total of jobs in the region, the jobs accessible, and the jobs available:
# Sum of jobs in land use table,
sum(lu$O)
#> [1] 1202986
# Sum of jobs accessible.
sum(results$Si, na.rm = TRUE)
#> [1] 15421360
# Sum of jobs available.
sum(results$Vi, na.rm = TRUE)
#> [1] 1202843
We see that the jobs in the City of Toronto is preserved (with a small
difference caused by zones that cannot be reached). The sum of
accessibility in contrast is two orders of magnitude larger than the
number of jobs in the table. This complicates the interpretation of the
results. When we obtain a summary of these results we find that the
maximum accessibility for any one zone is several hundred thousands of
jobs, compared to less than sixteen thousand jobs according to spatial
availability (notice the 75 NA values; these are zones with a population
of zero). Recall that the maximum number of jobs in any one zone (check
the summary of the lu
table above) is 41,821. Why are the values of
the accessibility so high?
results |>
st_drop_geometry() |>
select(Si, Vi) |>
summary()
#> Si Vi
#> Min. : 42.88 Min. : 0.037
#> 1st Qu.: 12475.44 1st Qu.: 579.082
#> Median : 21153.41 Median : 1370.369
#> Mean : 28038.84 Mean : 2186.987
#> 3rd Qu.: 34883.20 3rd Qu.: 2575.955
#> Max. :134653.00 Max. :21152.968
#> NA's :75 NA's :75
Accessibility, in addition, has much greater range of variation, which indicates that the high sum of accessibility is not the result of a few extraordinary values. The interquartile range (a measure of dispersion) of is 22,407.76 but only 1,996.87 for . The latter value is more in line with the interquartile ranges of the population (1,996.87) and jobs (1,399).
Maps of accessibility and spatial availability can be plotted using package {ggplot2}. This code is for the map of accessibility:
Si_plot <- ggplot() +
geom_sf(data = results,
aes(fill = Si)) +
geom_sf(data = lu,
fill = NA) +
scale_fill_viridis_c(direction = 1) +
theme_void() +
ggtitle("Accessibility")
This code is for the map of spatial availability:
Vi_plot <- ggplot() +
geom_sf(data = results,
aes(fill = Vi)) +
geom_sf(data = lu,
fill = NA) +
scale_fill_viridis_c(direction = 1) +
theme_void() +
ggtitle("Spatial Availability")
The two maps can be plotted side-by-side using the syntax of package {patchwork}:
Si_plot / Vi_plot
As expected, the map of spatial availability in noticeably flatter with its smaller interquartile range. High levels of accessibility do not necessarily translate into high spatial availability in the same places. Why is that? High accessibility creates a somewhat paradoxical effect: the ease to reach destinations also means that there is more competition for the same opportunities.
But, could it be that accessibility is just some sort of scaled-up version of spatial availability? That somehow the two give similar results but for some multiplicative constant? To answer this, we can plot the ratio of accessibility to spatial availability. If the two measures were similar but for the scale, we would expect that ratio to be more or less constant. However, what we see is that the ratio of to indicates that in some cases accessibility is tens, hundreds, or even thousands of times greater than spatial availability. This suggests that is not simply a scaled-up version of but something different.
ggplot() +
geom_sf(data = results,
aes(fill = Si/Vi)) +
geom_sf(data = lu,
fill = NA) +
scale_fill_viridis_c(direction = 1,
trans = "log10") +
theme_void()
We can further explore the results if we calculate the number of accessible jobs per capita () and spatially available jobs per capita (:
results <- results |>
mutate(si = Si/P,
vi = Vi/P)
This is the summary of these two indicators.
results |>
st_drop_geometry() |>
select(si, vi) |>
summary()
#> si vi
#> Min. : 0.1992 Min. :0.00283
#> 1st Qu.: 8.5324 1st Qu.:0.51484
#> Median : 12.4421 Median :0.79628
#> Mean : 19.2023 Mean :0.85798
#> 3rd Qu.: 18.2990 3rd Qu.:1.11207
#> Max. :735.8087 Max. :2.66881
#> NA's :75 NA's :75
Accessibility per capita is not particularly meaningful: as seen from the summary above, accessibility per capita does not particularly resemble the ratio of jobs-to-workers in the city. This is a result of accessibility being a sum of unconstrained opportunities. The summary reveals that 75/% of zones have more than 8.53 jobs accessible per person, which clearly is not a realistic estimate of the ease of reaching jobs while keeping in mind that many others will be doing the same. Spatially available jobs per capita do show some variability but they are more in line with the global jobs-to-workers ratio. It is not implausible to think that some zones, being very well served by the transportation system and/or facing little competition, can have around two jobs available per worker.
Returning to the map of spatial availability, notice its resemblance to the distribution of the population. The correlation between these two variables is quite high:
cor(results$P, results$Vi, use = "pairwise.complete.obs")
#> [1] 0.933404
This should not be surprising since the spatial availability depends directly on the population: part of the proportional allocation mechanism works to ensure that opportunities are proportionally available by population. That said, the global jobs-to-workers ratio provides a useful reference; in the next figure, we plot the population vs the spatial availability. The black line passes through the origin and has a slope of 1.1198902, that is, equal to the global jobs-to-workers ratio. If a zone had as many jobs available to its population as the global ratio the point would be on the line; points below the line are zones with fewer jobs available than what they would have, given their population, if the jobs were equally distributed. In contrast, points above the line are zones with populations with more jobs available to them than their equal share.
ggplot(data = results |>
drop_na(Vi),
# Plot P in the x axis and Vi in the y axis.
aes(x = P, y = Vi)) +
# Plot the data as points, use alpha < 1 to control the transparency of the points.
geom_point(aes(color = Vi/P - sum(lu$O)/sum(lu$P),
size = Vi/P)) +
geom_point(aes(size = Vi/P),
shape = 1,
alpha = 0.2) +
# Plot a line with a given intercept and slope.
geom_abline(intercept = 0,
slope = sum(lu$O)/sum(lu$P),
#color = "orange"
) +
scale_color_gradient2("Dev. from global jobs-to-workers ratio") +
scale_size("Spatial availability per capita (v_i)") +
theme_minimal()
Many zones with small populations have lower spatial availability to employment. Not always, but frequently. Zones with large populations tend to enjoy higher employment availability. Not a single zone with population greater than 5,000 has fewer jobs available than workers. The map below shows zones with populations greater than 5,000.
ggplot() +
geom_sf(data = lu,
aes(fill = P>5000)) +
theme_minimal()
What could be the reason for this? Zones with large populations are not necessarily centrally located, and some are quite peripheral. But how well connected are they? To answer this question we need calculate the travel time to the opportunities available. To do this we join the travel times to the detailed spatial availability results:
results_2 <- Vij |>
# Join o-d travel times to detailed spatial availability results.
left_join(od |>
select(from_id, to_id, travel_time),
by = c("from_id", "to_id"))
Next, we calculate the total travel time to available opportunities for every origin-destination pair:
results_2 <- results_2 |>
# Calculate the total travel time to opportunities available.
mutate(total_time_opp_av = Vij * travel_time)
We can now calculate the average time to available opportunity by origin. This quantity represents the time typically needed to reach an available opportunity:
results_2 <- results_2 |>
# Group by origin.
group_by(from_id) |>
# Compute the spatial availability by origin. The mean time is the total travel time by origin to spatially available opportunities, divided by the opportunities available to that origin.
summarize(Vi = sum(Vij),
mean_time = sum(total_time_opp_av)/Vi) |>
# Rename the id.
rename(id = from_id)
The scatterplot below is of zonal population vs mean travel time to available opportunity by origin. The horizontal line is the system-wide mean travel time to available opportunity:
results_2 |>
# Join the population values to table `results_2`
left_join(results |>
select(id, P),
by = "id") |>
mutate(vi = Vi/P) |>
# Pass to ggplot, where the population will be plotted on the x axis and the mean travel time to available opportunity in the y axis.
ggplot(aes(x = P,
y = mean_time)) +
# The color of the points will depend on the deviation from the global mean of the travel time to available opportunity.
geom_point(aes(color = mean_time - mean(results_2$mean_time),
size = vi)) +
# For ease of visualization add silhouettes for the points
geom_point(aes(size = vi),
shape = 1,
alpha = 0.3) +
# Plot a horizontal line to reres
geom_hline(yintercept = mean(results_2$mean_time)) +
# Change the title of the legends to something informative
scale_color_gradient2("Dev. from global mean travel time") +
scale_size("Spatial availability per capita (v_i)") +
theme_minimal()
The plot shows that zones with larger populations tend to be consistently are not necessarily the best connected, but nonetheless do not deviate much from the mean travel time to available opportunities: notice how zones with populations greater than 5,000 are much closer to the horizontal line (system-wise mean travel time to available opportunity). Of the smaller zones, quite a few tend to be have, on average, short travel times to available opportunties (which is reflected in the large values of ), but many are poorly connected to available opportunities, requiring on average longer travel times, which is associated with lower values of .
The same is true of : the ratio of to indicates it is not simply a scaled-up version of but rather something different:
ggplot() +
geom_sf(data = results,
aes(fill = si/vi)) +
geom_sf(data = lu,
fill = NA) +
scale_fill_viridis_c(direction = 1, trans = "log10") +
theme_void()
The values of spatial availability per capita can be compared to the overall jobs/workers ratio. In this plot we use a diverging gradient with the mid-point set at that overall value. Shades of red indicate that the spatial availability per capita is below the regional level, and shades of blue indicate a value greater than the overall jobs/workers ratio. This way we can explore deviations from equality:
ggplot() +
geom_sf(data = results,
aes(fill = vi)) +
geom_sf(data = lu,
fill = NA) +
scale_fill_gradient2(midpoint = sum(lu$O)/sum(lu$P)) +
theme_void() +
ggtitle("Spatial Availability per capita")
The code in the next two chunks uses package {leaflet} to create interactive maps to further explore the results. Some zones had high accessibility, and yet relatively low availability. This could be caused by the competition for opportunities in well-connected central parts of Toronto.
Map to explore accessibility:
bins <- quantile(results$si, na.rm = TRUE) |> as.numeric()
pal <- colorBin("RdBu", domain = results$si, bins = bins)
labels <- sprintf(
"<strong>%s</strong><br/>%g workers<br/>%g jobs accessible<br/>%g jobs / person",
results$id,
results$P,
results$Si,
results$si
) %>% lapply(htmltools::HTML)
leaflet(data = results |> st_transform(crs = 4326 )) |>
addPolygons(
fillColor = ~pal(si),
weight = 2,
opacity = 1,
color = "white",
dashArray = "3",
fillOpacity = 0.7,
highlightOptions = highlightOptions(
weight = 5,
color = "#666",
dashArray = "",
fillOpacity = 0.7,
bringToFront = TRUE),
label = labels,
labelOptions = labelOptions(
style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "15px",
direction = "auto"))