Dumps all datasets found in the dataset catalog at https://data.pr.gov to disk. There were 148 datasets as of the initial commit on 2017-07-25. Keep an eye on your disk space!
PRs are welcome!
All created files are saved to the data_files directory. The script works in the following steps:
- Fetches the catalog of datasets from https://data.pr.gov/data.json
- Saves the dataset catalog to disk with a timestamp.
- Reads the dataset catalog and downloads all distributions for each dataset.
- All downloaded files will be named data.{file_type}
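The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in data_pr_downloader.py. It assumes the catalog follows the usual data.json layout (a top-level "dataset" list whose entries carry "identifier" and "distribution", with "downloadURL" and "format" or "mediaType" on each distribution). The function names, the catalog_{timestamp}.json file name, and the one-folder-per-dataset layout are all hypothetical.

```python
import json
import urllib.request
from datetime import datetime, timezone
from pathlib import Path

CATALOG_URL = "https://data.pr.gov/data.json"
OUTPUT_DIR = Path("data_files")


def catalog_path(out_dir, now):
    """Timestamped path for the saved catalog (file name pattern is an assumption)."""
    return Path(out_dir) / f"catalog_{now:%Y%m%dT%H%M%SZ}.json"


def plan_downloads(catalog, out_dir):
    """Return (url, target_path) pairs, one per distribution that has a download URL."""
    plan = []
    for dataset in catalog.get("dataset", []):
        # Assumption: the last path segment of the identifier names the dataset folder.
        identifier = str(dataset.get("identifier", "unknown"))
        dataset_id = identifier.rstrip("/").split("/")[-1]
        for dist in dataset.get("distribution", []):
            url = dist.get("downloadURL")
            if not url:
                continue  # nothing to download for this distribution
            # Prefer "format"; otherwise fall back to the media type's subtype.
            file_type = dist.get("format") or dist.get("mediaType", "").split("/")[-1]
            file_type = (file_type or "bin").split("+")[0].lower()
            plan.append((url, Path(out_dir) / dataset_id / f"data.{file_type}"))
    return plan


def download_all():
    """Fetch the catalog, save a timestamped copy, then download every distribution."""
    OUTPUT_DIR.mkdir(exist_ok=True)
    with urllib.request.urlopen(CATALOG_URL) as response:
        raw = response.read()
    catalog_path(OUTPUT_DIR, datetime.now(timezone.utc)).write_bytes(raw)
    for url, target in plan_downloads(json.loads(raw), OUTPUT_DIR):
        target.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(target))
```

Calling download_all() would perform the full run. Keeping plan_downloads separate from the network calls makes it possible to inspect what would be downloaded, and how much, before committing disk space.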
- Install pipenv 'cause we fancy.
- Initialize a Python 3 virtual environment
pipenv --three
- Install dependencies
pipenv install
- Activate the virtual environment
pipenv shell
- Execute
python data_pr_downloader.py
- Run
./build.sh
to build the Docker image.
- Run
./run.sh
to fetch data. Files will be downloaded to the data_files directory.