This project calculates the network traveling distance between a source and destination point on a map. It uses Open Source Routing Machine (OSRM) software to generate the map and perform the calculations inside a Docker container.
Project was created by David James for the Institute of Transportation Studies(ITS).
- Program was implemented with Python 3.7 alongside Docker and the OSRM
- It maintains the following dependencies
Packages pandas numpy requests multiprocessing (included) logging (included) - The program needs the Docker container running before the Python script executes
- networkDistance.py
- This file calculates the source and destination pairs of data from a csv and outputs a csv of their distance in meters.
- the source and destination pairs should be in a csv and given in FIPS code values
- results/commutes20191121-0938-ForkPoolWorker-20.EXAMPLE.csv
- This is an example output from the networkDistance.py. It follows a year,month,day format followed by the timestamp and the Worker ID number.
- the output from networkDistance will output four files. These processes are separated to ensure RAM doesn’t crash when running a larger data set. Process may be updated in the future.
Before beginning, you will need to run any bash commands from your terminal.
If you’re running a Mac, search for the application called terminal
.
- Installing Docker
- There are several ways to install Docker depending on your machine follow this link and read the instructions for your machine to get the platform running
- After installing Docker, sometimes priviledges are needed for the container to run on the system
# This command is to make sure that docker has the right priviledges to run sudo usermod -a -G docker $USER $ groups
- NOTE
- I mainly note this if someone is newly installing Docker. The installation instructions covered everything else, but this final command is needed someontimes depending on the environment you’re running in. Once this is executed and Docker is running, then you don’t need to run this command again.
- Collect files
- You will need the map file, so OSRM knows what area to generate
- the map file is downloaded from here go through the subregions until you find the map resolution that you need and download the
.osm.pbf
file# this is the command to get the CALIFORNIA map # if a different region is wanted change the URL, so it pulls the same file wget http://download.geofabrik.de/north-america/us/california-latest.osm.pbf # this command is just the bash way of downloading the file # going to the website and downloading it through the browser will yield the same results
- the map file is downloaded from here go through the subregions until you find the map resolution that you need and download the
- You will need the map file, so OSRM knows what area to generate
- Generate server files
- Create a directory called
map
and insert yourpbf
file into it - From inside the directory run the following command
# These first commands are needed when first running OSRM on your machine # commands take about 5-10 mins to execute # these only need to be run once per machine, unless you need a different type of map docker run -t -v "${PWD}:/data" osrm/osrm-backend osrm-extract -p /opt/car.lua /data/california-latest.osm.pbf
- Create a directory called
- Starting the OSRM server
- These commands need to be executed within the
map
directory# These commands are for setting up the OSRM server on your machine # command take about 5-10 mins to execute # these commands need to be run every time you want to boot up the server docker run -t -v "${PWD}:/data" osrm/osrm-backend osrm-partition /data/california-latest.osrm docker run -t -v "${PWD}:/data" osrm/osrm-backend osrm-customize /data/california-latest.osrm docker run -t -i -p 5000:5000 -v "${PWD}:/data" osrm/osrm-backend osrm-routed --algorithm mld /data/california-latest.osrm # At this point the server is running and you can run the python script now # This command is meant for calling the server # NOTE: it is already implemented in python script, so using this is uncessary # unless you are just trying to make your own specific call curl "http://127.0.0.1:5000/route/v1/driving/slon,slat;dlon,dlat?steps=true" # @param: source - {slon, slat} # @param: destination - {dlon,dlat}
- These commands need to be executed within the
- Access the OSRM server
- The Python script,
networkDistance.py
, has a function that will call the server and collect the responsesimport pandas as pd import simpledbf as dbf import networkDistance as nd # the data will be imported as a DataFrame and only needs two columns # both columns must be strings to be parsed correctly by the program name = 'name-of-file' cols = ['source col', 'dest col'] data = pd.read_csv('path/to/' + name + '.csv',usecols=cols).astype(str) # the dbf file is usually the one collected from the CENSUS.GOV site # reference to the site given in the NOTE below cols = ['GEOID', 'LAT', 'LON'] locs = dbf.Dbf5('path/to/file.dbf') # change the file to a DataFrame df = locs.to_dataframe() # use only the columns desired df = df[cols] # change the Lat, Lon columns to floats df[cols[1:]] = df[cols[1:]].apply(pd.to_numeric) # In case of multiple state files needed place each DateFrame in one dictionary # Have the key values be the two digit State FIPS Code d = {'00':df} # the file name is provided, so when saving results they are correlated with # the file name given nd.mp_networkDriver(data, d, name)
- NOTE
-
- The GEOIDS are meant to be FIPS Codes
- State files found here.
- The Python script,