-
In this milestone, we are expected to choose a dataset appropriate for the COSC 301 project.
Group members: Adnan Ahmed, Luca deVerteuil, Ken Woon
Group number: 11
-
Describe your dataset
This data collection lists Vancouver's historical crime data. Each record gives the type of crime; the year, month, day, hour, and minute at which it occurred; the location and neighbourhood of the crime scene; and the x and y coordinates projected in UTM zone 10.
-
What is the source of your dataset?
The dataset was created by Sumaia P. and is posted on Kaggle. The data is based on information contained in the Vancouver Police Department Records Management System. The crime classification and file status may change at any time because of the dynamic nature of police investigations.
-
License: what is the license of your dataset? CC0, MIT, etc...
CC0: Public Domain
-
Rows
793916 rows
-
Columns
10 columns
-
Interests
We are interested in exploring historical crime trends in Vancouver, as it is a common place to visit and some of us are planning to move there for work. Knowing these trends will help us be better prepared to avoid such crimes. The crime data is intended to enhance community awareness of policing activity in Vancouver, and it will likely also be a valuable reference for the Vancouver Police Department when predicting future crimes.
-
-
In this milestone, we are expected to set up the repository, clone it to each of our local machines, and load the approved dataset.
This entire repository is our milestone 2 product.
- In this milestone, we are expected to process and clean our dataset, do exploratory data analysis (EDA), create some data visualizations, and work with method chaining in pandas.
- In this milestone, we will finalize our submission and present all our hard work to our fellow students as a dashboard!
- In this milestone we will continue working on our class project, process the data for our dashboard, and get the repo ready for final submission.
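The cleaning and method-chaining work mentioned in the milestones above could look roughly like the sketch below. It is only an illustration: the rows are made up, and the column names (`TYPE`, `HOUR`, `NEIGHBOURHOOD`) and the helper `crimes_per_hour` are assumptions, not taken from the project code.

```python
import pandas as pd

# Tiny made-up sample. The column names are assumptions about the CSV
# header and the values are invented for illustration only.
raw = pd.DataFrame(
    {
        "TYPE": ["Theft of Bicycle", "Mischief", "Mischief", None, "Mischief"],
        "HOUR": [14, 22, 22, 3, 9],
        "NEIGHBOURHOOD": ["Kitsilano", "West End", "West End", "Sunset", None],
    }
)


def crimes_per_hour(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: clean the frame and count incidents per hour."""
    return (
        df.dropna(subset=["TYPE", "NEIGHBOURHOOD"])  # drop incomplete rows
        .rename(columns=str.lower)  # lowercase the column names
        .groupby("hour", as_index=False)
        .size()  # one row per hour, with the count in a "size" column
        .rename(columns={"size": "incidents"})
        .sort_values("incidents", ascending=False)
        .reset_index(drop=True)
    )


hourly = crimes_per_hour(raw)
print(hourly)
```

Each step returns a new DataFrame, so the whole pipeline reads top to bottom without intermediate variables.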
The dataset selected for this project was created by Sumaia P. and is posted on Kaggle. It lists Vancouver's historical crime data from 2003 to 2021. Each record gives the type of crime; the year, month, day, hour, and minute at which it occurred; the location (hundred block) and neighbourhood of the crime scene; and the x and y coordinates projected in UTM zone 10. The collection was released to the public by the Vancouver Police Department (VPD) with the intention of enhancing community awareness of policing activities in the city. The data is based on information contained in the VPD Records Management System. Unfortunately, there is no information on how the data was collected. We speculate that the values come from digitized historical records, since crime data from past years is likely archived in a dataset such as this one.
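Because the time of each incident is spread across separate year, month, day, hour, and minute fields, it can be convenient to combine them into a single timestamp before looking at trends. The following is a minimal sketch under assumptions: the column names (`YEAR`, `MONTH`, `DAY`, `HOUR`, `MINUTE`) and the helper `add_timestamp` are hypothetical, and the two rows are made up.

```python
import pandas as pd

# Made-up rows; the column names are assumed, not confirmed by the dataset.
sample = pd.DataFrame(
    {
        "YEAR": [2003, 2021],
        "MONTH": [12, 1],
        "DAY": [25, 1],
        "HOUR": [23, 0],
        "MINUTE": [30, 5],
    }
)


def add_timestamp(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: build one datetime column from the date parts."""
    parts = df[["YEAR", "MONTH", "DAY", "HOUR", "MINUTE"]].rename(columns=str.lower)
    # pd.to_datetime can assemble datetimes from year/month/day/... columns.
    return df.assign(timestamp=pd.to_datetime(parts))


stamped = add_timestamp(sample)
print(stamped["timestamp"])
```

With a single `timestamp` column, accessors such as `stamped["timestamp"].dt.month` or `.dt.hour` can be used for grouping by month or time of day.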
We are interested in exploring historical crime trends in Vancouver, as it is a city we visit often and some of us are even planning to move there in the future. With greater awareness, we will be better prepared to avoid such crimes. The crime data will likely also be a valuable reference for the Vancouver Police Department when predicting future crimes and preparing to handle them.
Some questions that we would like to explore are:
- Does the total number of crimes committed increase or decrease during the holiday season? (Luca)
- At what times of day and in which years do crimes occur most often? (Ken)
- Which areas are the most targeted (e.g., residential areas, business districts, parking lots)? (Adnan)
- Luca deVerteuil: I am a computer science major and I love soccer. Brazil is going to win the next World Cup!
- Ken Woon: I am a senior mechatronics and computer science student with many hobbies such as playing sports and learning the guitar.
- Adnan Ahmed: I am a senior civil engineering student who loves learning about new things.
The dataset that we will be using is Vancouver, BC Historical Crime Data. More information at this link.