Leonardo Cavalcante Araújo, Vinamrata Yadav, Natalia Calderón
Data Analytics Full-Time FEB2021, Paris & March 12th, 2021
Group project developed by a team of three over a weekend and 2 weekdays (4 days in total).
The project had 2 distinct objectives:
- Derive statistically significant insights from a database.
- Model a regression analysis for a variable (in this project, we chose linear regression to predict the probability of a crime happening on a given date under given circumstances).
- Database search and download: we finally decided on an open-source database from the Chicago Data Portal, Crimes from 2001 to Present. The resulting database had 20 years of observations, totalling 7.5 million rows.
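The first objective above, checking whether an observed difference is statistically significant, can be sketched with a permutation test. This is a minimal illustration using only the Python standard library, not the test used in the project notebooks: the function name `permutation_test` and the daily counts are hypothetical and invented for the example.

```python
import random
from statistics import mean


def permutation_test(a, b, n_resamples=5000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        # Reassign the group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical daily crime counts, invented for illustration only.
weekday_counts = [120, 115, 130, 125, 118, 122, 127, 119]
weekend_counts = [140, 150, 138, 145, 152, 141, 148, 139]

p_value = permutation_test(weekday_counts, weekend_counts)
print(f"p-value: {p_value:.4f}")
```

A small p-value suggests the gap between the two groups is unlikely to come from random label assignment alone. With real data, the choice of test depends on the question and on how the observations were collected.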
- Data Cleaning and filtering for the past 5 years (2015-2020), resulting in a database of around 1.5 million observations.
- Data Analysis & Visualisations: using Python, Matplotlib and Seaborn.
- Hypothesis Testing: to test statistically significant events.
- Linear Regression using OLS (Ordinary Least Squares): to predict crimes happening on a given date under known circumstances.
- Assumptions testing: verification of the assumptions for the OLS model.
- Presentation: slide preparation and oral presentation to our Ironhack cohort.
- Repository "https://github.com/leo-cavalcante/crimes-in-chicago": contains the main Python Notebooks produced by the team members to carry out the analysis, visualisations and predictive models.
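The filtering and regression steps listed above can be sketched end to end on a toy sample. This is a minimal illustration using only the Python standard library, not the code from the notebooks. Several things here are assumptions: the timestamp format in `DATE_FORMAT`, the sample timestamps, the explanatory variable `x`, and the helper `fit_ols`, which implements the closed-form OLS solution for a single predictor.

```python
from collections import Counter
from datetime import datetime
from statistics import mean

# Assumed timestamp format; check it against the actual export.
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"

# Hypothetical raw timestamps, invented for illustration only.
raw_dates = [
    "12/31/2014 11:50:00 PM",
    "01/01/2015 01:15:00 AM", "01/01/2015 09:30:00 AM", "01/01/2015 08:45:00 PM",
    "01/02/2015 02:10:00 PM",
    "01/03/2015 07:05:00 AM", "01/03/2015 11:40:00 PM",
    "01/04/2015 12:20:00 AM", "01/04/2015 03:00:00 PM",
    "01/04/2015 06:30:00 PM", "01/04/2015 10:10:00 PM",
    "01/01/2021 12:05:00 AM",
]

# Step 1: keep only observations from 2015 through 2020.
parsed = [datetime.strptime(s, DATE_FORMAT) for s in raw_dates]
recent = [d for d in parsed if 2015 <= d.year <= 2020]

# Step 2: aggregate to one crime count per calendar day.
daily_counts = Counter(d.date() for d in recent)
days = sorted(daily_counts)
y = [daily_counts[day] for day in days]

# Step 3: a hypothetical explanatory variable, one value per day.
x = [5.0, 1.0, 3.0, 8.0]


def fit_ols(x, y):
    """Closed-form OLS for y = intercept + slope * x (hypothetical helper)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_ols(x, y)

# Step 4: residuals and R squared, the inputs for checking the model fit.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
ss_res = sum(r ** 2 for r in residuals)
ss_tot = sum((yi - mean(y)) ** 2 for yi in y)
r_squared = 1 - ss_res / ss_tot

print(f"kept {len(recent)} of {len(raw_dates)} records")
print(f"intercept={intercept:.3f}, slope={slope:.3f}, R^2={r_squared:.3f}")
```

The residuals are the starting material for checking OLS assumptions such as constant variance and normality of errors. On the full dataset, a dataframe library and a statistics package would be the more practical choice than hand-rolled loops.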
- Group project.
- Leonardo: full Data Cleaning, some data visualisations, 1 Hypothesis Test, the whole Linear Regression (using OLS), plus a big part of the Google Slides presentation.
- Vina: some data analysis, some data visualisations, 2 hypothesis tests and some slides in the Google Slides presentation.
- Natalia: research of database and some interesting insights, some data analysis and a few slides of the final presentation.
Here you may find the relevant links for the main documents produced during this project:
Chicago Crimes - Google Slides Final Presentation
GitHub Repository: crimes-in-chicago
Crimes in Chicago - Geographical Analysis
Crimes in Chicago - Typology of Crimes and Arrests
Crimes in Chicago - Crimes per Communities
Crimes in Chicago - Time Analysis
PS: only the main files are listed in this section; the repository also contains other auxiliary files.