This is the primary repository for code and documentation for most of the data sources the Data & Analytics Unit uses.
Each folder is for a different data source (or category of related data sources). They contain:
- an explanation of what the data source is,
- how it can be used, and
- the Python and SQL necessary for our Extract, Load, Transform, and Validate processes into our PostgreSQL database.
To find out which of the data we manage is released on Open Data, see the Open Data Releases.
- Airflow DAGS
- Bluetooth Detectors
- Collisions
- Cycling App (inactive)
- Events
- GIS - Geographic Data
- HERE Travel Time Data
- Incidents (inactive)
- INRIX (inactive)
- Parking (inactive)
- TTC (inactive)
- Volume Data
- Watch Your Speed signs
- Weather
- Open Data Releases
This folder contains the DAG Python files for our Airflow orchestration that dictate the logic and schedule for data pipeline tasks.
The City collects traffic data from strategically placed sensors at intersections and along highways. These detect Bluetooth MAC addresses of vehicles as they drive by, which are immediately anonymized. When a MAC address is detected at two sensors, the travel time between the two sensors is calculated.
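The matching step described above can be sketched in Python. This is a minimal illustration, not the production pipeline: the `Detection` record, the sensor names `"A"` and `"B"`, and the `anonymize` and `travel_times` functions are all hypothetical, and the SHA-256 hash is only an assumed stand-in for whatever anonymization the sensors actually apply.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
import hashlib


@dataclass(frozen=True)
class Detection:
    device_hash: str   # anonymized identifier, never the raw MAC address
    sensor: str        # hypothetical sensor name
    seen_at: datetime  # time the device was detected


def anonymize(mac: str) -> str:
    """Assumed anonymization step: a one-way hash of the MAC address."""
    return hashlib.sha256(mac.encode("utf-8")).hexdigest()


def travel_times(detections, start_sensor, end_sensor):
    """Pair each device's detection at start_sensor with its next detection at end_sensor."""
    last_start = {}  # device_hash -> most recent detection time at start_sensor
    results = []
    for d in sorted(detections, key=lambda d: d.seen_at):
        if d.sensor == start_sensor:
            last_start[d.device_hash] = d.seen_at
        elif d.sensor == end_sensor and d.device_hash in last_start:
            # Device seen at both sensors: travel time is the difference in timestamps.
            results.append(d.seen_at - last_start.pop(d.device_hash))
    return results


t0 = datetime(2024, 1, 1, 8, 0, 0)
sample = [
    Detection(anonymize("AA:BB:CC:DD:EE:01"), "A", t0),
    Detection(anonymize("AA:BB:CC:DD:EE:02"), "A", t0 + timedelta(seconds=30)),
    Detection(anonymize("AA:BB:CC:DD:EE:01"), "B", t0 + timedelta(minutes=4)),
]
print(travel_times(sample, "A", "B"))
```

Only the first device is seen at both sensors, so only it yields a travel time; the second device is ignored because it never reaches sensor `"B"`.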
The collisions dataset consists of data on individuals involved in traffic collisions from approximately 1985 to the present day (though some earlier historical collisions are also included).
The Cycling App collected origin-destination (OD) and trip data until 2016.
How do construction and special events impact traffic in the city?
- City road permitting data (RoDARs)
- (outdated) Special events from the City's Open Data and TicketMaster
The assets directory stores Airflow processes related to various assets that we help manage, such as datasets related to Vision Zero. Below are the assets that we have automated so far.
Red Light Camera data are obtained from Open Data and are displayed as indicators on the Vision Zero Map and Dashboard. We have developed a process using Airflow to automatically connect to Open Data and store the data in our RDS Postgres database. See the README file in assets/rlc for details about this process.
A number of different features of traffic signals (Leading Pedestrian Intervals, Audible Pedestrian Signals, Pedestrian Crossovers, Traffic Signals) are periodically pulled from Open Data. These indicators are used to populate the Vision Zero Map and Dashboard. See the README file in assets/traffic_signals for details about the source datasets and how they are combined into a final table made up of the following data elements.
This dataset comes from Vision Zero which uses Google Sheets to track progress on the implementation of safety improvements in school zones.
Contains SQL used to transform text descriptions of streets (in bylaws) into centreline geometries.
Travel time data provided by HERE Technologies from a mix of vehicle probes. Daily extracts of 5-min aggregated speed data for each link in the city (where data are available).
See CityofToronto/bdit_incidents
Data collected from a variety of traffic probes from 2007 to 2016 for major streets and arterials.
This contains R and SQL files for pulling parking lots and parking tickets from Open Data. They might be useful but haven't been documented or automated.
This contains some valiant attempts at transforming CIS streetcar location data provided to us by the TTC, as well as an automated process for pulling in GTFS schedule data.
Miovision currently provides volume counts gathered by cameras installed at specific intersections. There are 32 intersections in total. Miovision processes the video footage and provides volume counts aggregated into 1-minute bins. The 1-minute turning movement count (TMC) data are available in miovision_api.volumes, the 15-minute TMC data in miovision_api.volumes_15min_tmc, and the 15-minute ATR data in miovision_api.volumes_15min.
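To illustrate how 1-minute bins relate to 15-minute bins, here is a minimal Python sketch that sums 1-minute volumes into the 15-minute bin they fall in. It is not the repository's aggregation logic: the row layout `(intersection_id, datetime_bin, volume)` and the function names are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def floor_to_15min(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 15-minute bin."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def aggregate_15min(rows):
    """Sum 1-minute volumes per (intersection, 15-minute bin).

    rows: iterable of (intersection_id, datetime_bin, volume) tuples,
    a hypothetical simplification of a 1-minute volume record.
    """
    totals = defaultdict(int)
    for intersection_id, datetime_bin, volume in rows:
        totals[(intersection_id, floor_to_15min(datetime_bin))] += volume
    return dict(totals)


# Twenty consecutive 1-minute records of 2 vehicles each at one intersection.
start = datetime(2024, 1, 1, 8, 0)
one_minute_rows = [(1, start + timedelta(minutes=m), 2) for m in range(20)]
fifteen_minute = aggregate_15min(one_minute_rows)
for key, volume in sorted(fifteen_minute.items()):
    print(key, volume)
```

The first fifteen records land in the 08:00 bin and the remaining five in the 08:15 bin. A plain sum like this ignores missing minutes, which a real pipeline would need to handle.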
Deprecated. See Vehicle Detector Station (VDS).
volumes/short_term_counting_program/
Short-term traffic counts are conducted on an ad-hoc basis as the need arises, and may be done throughout the year both at intersections and mid-block. Much of this dataset is also available through the internal application MOVE, and data go as far back as 1994. As of January 2025, the bulk of this data is available to the public on the Open Data pages:
- Traffic Volumes - Midblock Vehicle Speed, Volume and Classification Counts
- Traffic Volumes - Multimodal Intersection Turning Movement Counts
The city operates various permanent Vehicle Detector Stations (VDS), employing different technologies, including RESCU, intersection detectors, Blue City and Smartmicro. The one most frequently used in the D&A context is the RESCU network, which tracks traffic volumes on Toronto expressways; more information about it can be found on the city's website or here.
The city has installed Watch Your Speed signs that display the speed a vehicle is travelling at and flash if the vehicle is travelling over the speed limit. The signs were installed as part of two programs: the mobile Watch Your Speed program, whose signs are mounted on existing poles and moved every few weeks, and the school Watch Your Speed program, whose signs are installed at high-priority schools. The signs also collect continuous speed data.
Daily historical weather conditions and predictions from Environment Canada.
- Travel Times - Bluetooth contains data for all the Bluetooth segments collected by the city. The travel times are 5-minute average travel times. The real-time feed is currently not operational. See the Bluetooth README for more info.
- Watch Your Speed Signs give feedback to drivers to encourage them to slow down; they also record the speed of vehicles passing by the sign. Semi-aggregated and monthly summary data are available for the two programs (Stationary School Safety Zone signs and Mobile Signs) and are updated monthly. See the WYS README for links to these datasets.
- Traffic Volumes - Midblock Vehicle Speed, Volume and Classification Counts: ad-hoc observations of volumes, speeds, and vehicle classification of motor vehicles travelling along a section of road. See the Short Term Counting Program documentation for more info.
- Traffic Volumes - Multimodal Intersection Turning Movement Counts: ad-hoc counts of motor vehicle, bicycle, and pedestrian volumes at intersections. See the Short Term Counting Program documentation for more info.
For the King St. Transit Pilot, the team has released the following datasets, which are typically a subset of larger datasets specific to the pilot:
- King St. Transit Pilot - Detailed Bluetooth Travel Time contains travel times collected during the King Street Pilot in the same format as the above data set. Data is collected on segments found in the King St. Transit Pilot – Bluetooth Travel Time Segments map layer. See the Bluetooth README for more info.
- King St. Transit Pilot – Bluetooth Travel Time Summary contains monthly averages of corridor-level travel times by time periods. See the Bluetooth README for more info.
- King St. Transit Pilot - 2015 King Street Traffic Counts contains 15-minute aggregated ATR data collected at various locations on King Street during 2015. See the Volumes Open Data King Street Pilot section for more info.
- King St. Transit Pilot – Detailed Traffic & Pedestrian Volumes contains 15 minute aggregated TMC data collected from Miovision cameras during the King Street Pilot. The counts occurred at 31-32 locations at or around the King Street Pilot Area. See the Miovision Open Data section for more info.
- King St. Transit Pilot - Traffic & Pedestrian Volumes Summary is a monthly summary of the above data, only including peak period and east-west data. The data in this dataset goes into the King Street Pilot Dashboard. See the Miovision Open Data section for more info.