This is a repository to reproduce the analyses and figures in https://arxiv.org/abs/2407.00404, under its parent project.
Python (version 3.10) and R (version 4.0.2) were used to analyse and visualise the data. Stays were detected with infostop (version 0.1.11) and pyspark (version 3.5.1).
The other Python modules are listed in requirements.txt.
The data supporting the findings of this study were purchased from PickWell and are subject to restrictions due to licensing and privacy considerations under the European General Data Protection Regulation. Consequently, these data are not publicly available. Venue locations and categories were obtained from OpenStreetMap. Anonymized aggregated data and code to reproduce our results are provided in this repository.
The repo contains the scripts (src/) and libraries (lib/) for conducting the data processing, analysis, and visualisation.
The original input data are stored locally under dbs/, and intermediate results are stored in a local database.
The aggregated data directly used for visualisation and statistical tests in the manuscript are stored under data/.
The produced figures are stored under figures/.
Under src/, the scripts are grouped by functionality. The leading number in each script name indicates the order in which to run the scripts, because some later analyses depend on the output of earlier steps.
- src/data_etl/ does data extraction and preprocessing.
- src/feature_eng/ computes metrics of mobility and segregation.
- src/data_exp/ explores the data, produces descriptive statistics, and conducts statistical analysis.
- src/simulations/ includes the two counterfactual simulations, random mixing scenario, and group exposure analysis.
- src/visualization/ produces the figures inserted in the manuscript.
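
Since the leading number in each script name gives the running order, the scripts in a folder can be listed in that order before being run. The snippet below is a minimal sketch, not part of the repository: the helper names (`leading_number`, `in_run_order`, `list_folder_in_run_order`) and the example file names are hypothetical, and it assumes each script name starts with an integer prefix such as `1_` or `10_`.

```python
import re
from pathlib import Path
from typing import Iterable, List, Optional


def leading_number(name: str) -> Optional[int]:
    """Return the integer prefix of a file name, or None if there is none."""
    match = re.match(r"(\d+)", name)
    return int(match.group(1)) if match else None


def in_run_order(names: Iterable[str]) -> List[str]:
    """Sort file names by their numeric prefix; names without one are skipped."""
    numbered = [(leading_number(n), n) for n in names]
    return [n for _, n in sorted(p for p in numbered if p[0] is not None)]


def list_folder_in_run_order(folder: str) -> List[str]:
    """List the files of one folder (e.g. src/data_etl/) in running order."""
    return in_run_order(p.name for p in Path(folder).iterdir() if p.is_file())


# Hypothetical file names, for illustration only.
example = ["10_plot.py", "2_clean.py", "1_load.py", "utils.py"]
print(in_run_order(example))
```

Sorting on the parsed integer rather than on the raw string keeps a script numbered 10 after one numbered 2, which a plain alphabetical listing would not.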