This is the final project in the 100 Days of Code course. Its goal is to analyze fatalities in the US where police used force. It builds on the concepts covered in the data science portion of the course, as well as day 99.
Although NumPy was listed as one of the starting libraries, I did not use it in my take on this project.
Pandas is used to read the CSV files and analyze their data. This includes data cleaning and exploration operations.
Plotly is used to draw the bar, box, and pie charts in the notebook.
Matplotlib is used to plot various charts in the notebook.
Seaborn is used to create several of the graphs in the notebook, including joint plots, linear regression plots, and KDE plots.
The data folder contains the CSV files used in this project. The files stored here are:
- Deaths_by_Police_US
- Median_Household_income_2015
- Pct_Over_25_completed_HighSchool
- Pct_People_Below_Poverty_Level
- Share_of_Race_By_City
This file is the Python notebook used to complete this project. It was written and run in Google Colab.
The notebook begins by importing the necessary modules and loading the CSV data into variables to be used in the project.
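As a rough illustration of the loading step, here is a minimal sketch using `pd.read_csv`. The inline CSV text, its column names, and the variable name `df_fatalities` are invented for the example and are not taken from the notebook; the file path in the comment (including the `.csv` extension) is likewise an assumption.

```python
import io

import pandas as pd

# Stand-in for a file on disk. The columns and rows are made up for illustration.
csv_text = """id,name,city,state
1,Person A,Springfield,IL
2,Person B,Portland,OR
"""

# With a real file this would look something like
#   pd.read_csv("data/Deaths_by_Police_US.csv")
# where the path and extension are assumptions about the repo layout.
df_fatalities = pd.read_csv(io.StringIO(csv_text))

print(df_fatalities.shape)             # (rows, columns)
print(df_fatalities.columns.tolist())  # column names
```

Each of the five files in the data folder would be loaded into its own DataFrame in the same way.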
With the data loaded, the first part of this project focuses on preliminary data exploration. This includes gathering information about the shape of the DataFrames, number of rows and columns, column names, and checking for NaN values and duplicates. Cleaning operations are then performed to address the NaN and any duplicate values.
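The exploration and cleaning steps described above might look roughly like the sketch below. It runs on a small made-up DataFrame with hypothetical column names (`city`, `age`, `armed`). Filling or dropping NaN values is shown only as one possible approach; the notebook may handle them differently.

```python
import pandas as pd

# Small invented dataset: one duplicate row and two missing values.
df = pd.DataFrame({
    "city": ["A", "B", "B", "C", "D"],
    "age": [34.0, 27.0, 27.0, None, 51.0],
    "armed": ["gun", "knife", "knife", "unarmed", None],
})

# Preliminary exploration
print(df.shape)               # number of rows and columns
print(df.columns.tolist())    # column names
print(df.isna().sum())        # NaN count per column
print(df.duplicated().sum())  # number of fully duplicated rows

# Cleaning (one possible approach, not necessarily the notebook's)
df_clean = df.drop_duplicates().copy()        # remove repeated rows
df_clean["age"] = df_clean["age"].fillna(0)   # replace missing ages with 0
df_clean = df_clean.dropna(subset=["armed"])  # drop rows with no 'armed' value
```

Whether to fill or drop a missing value depends on the column: filling keeps the row available for other analyses, while dropping avoids introducing placeholder values into the statistics.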
With the DataFrames analyzed and cleaned, the following sections in the notebook focus on further analysis and visualization of the data. The subsections are listed below and can be explored further in the notebook!
- Poverty rate in each US state
- High school graduation rate by US state
- Relationship between poverty rate and high school graduation rates
- Racial makeup of each US state
- Donut chart of people killed by race
- Number of deaths of men vs women
- Chart showing age and manner of death
- Analyzing whether people were armed when they were killed
- Age, race, and whether people had mental illnesses
- Cities with the most police killings
- Rate of death by race
- Choropleth map of police killings by state
- Number of killings over time
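Two of the subsections above, the cities with the most killings and the number of killings over time, come down to counting rows by a category or by a time period. The sketch below shows one way to do that in pandas. The data is invented and the column names `date` and `city` are assumptions for illustration, so it is not the notebook's actual code.

```python
import pandas as pd

# Invented records; real column names and values may differ.
df = pd.DataFrame({
    "date": ["2015-01-02", "2015-01-15", "2015-02-03", "2015-02-20", "2015-02-28"],
    "city": ["Phoenix", "Houston", "Phoenix", "Chicago", "Phoenix"],
})

# Cities with the most records: count occurrences and keep the top entries.
top_cities = df["city"].value_counts().head(2)
print(top_cities)

# Records over time: parse the dates, then count rows per calendar month.
df["date"] = pd.to_datetime(df["date"])
monthly = df.groupby(df["date"].dt.to_period("M")).size()
print(monthly)
```

The resulting Series can be passed to a plotting library, for example as a bar chart for `top_cities` and a line chart for `monthly`.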
Since this project is completed in a Python notebook, there are no screenshots. However, the notebook has various graphs depicting the data analyzed!