In the real world, a dataset with no missing values hardly ever exists. In this notebook, we explore different ways of dealing with missing values.
The dataset contains more than 71,000 observations and 17 columns. The variables are: INCIDENT_NUMBER, OFFENSE_CODE, OFFENSE_CODE_GROUP, OFFENSE_DESCRIPTION, DISTRICT, REPORTING_AREA, SHOOTING, OCCURRED_ON_DATE, YEAR, MONTH, DAY_OF_WEEK, HOUR, UCR_PART, STREET, Lat, Long, Location.
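As a quick sanity check, the column list above can be compared against whatever frame has been loaded. This is a minimal sketch: the helper `check_columns` and the toy frame are hypothetical illustrations, not part of the original notebook, and the real data would come from the source file instead.

```python
import pandas as pd

# The 17 column names listed in the text above.
EXPECTED_COLUMNS = [
    "INCIDENT_NUMBER", "OFFENSE_CODE", "OFFENSE_CODE_GROUP", "OFFENSE_DESCRIPTION",
    "DISTRICT", "REPORTING_AREA", "SHOOTING", "OCCURRED_ON_DATE", "YEAR", "MONTH",
    "DAY_OF_WEEK", "HOUR", "UCR_PART", "STREET", "Lat", "Long", "Location",
]


def check_columns(df):
    """Return (expected columns absent from df, columns in df that were not expected)."""
    absent = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    unexpected = [c for c in df.columns if c not in EXPECTED_COLUMNS]
    return absent, unexpected


# Toy, empty frame that deliberately omits "Location" (illustration only).
toy = pd.DataFrame(columns=EXPECTED_COLUMNS[:-1])
absent, unexpected = check_columns(toy)
print("absent:", absent)
print("unexpected:", unexpected)
```

Running the same check on the real frame right after loading would flag renamed or dropped columns before any missing-value analysis starts.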
We use the following libraries: pandas, seaborn, numpy and missingno. Bar plots, dendrograms and matrix plots give a clear view of where values are missing and help us decide which uninformative variables to remove.
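The workflow described above might look roughly like the sketch below. The helper names (`missing_summary`, `drop_sparse_columns`, `plot_missing`), the 50% threshold and the toy values are all assumptions made for illustration; only the missingno calls `bar`, `matrix` and `dendrogram` correspond to the plots named in the text.

```python
import numpy as np
import pandas as pd


def missing_summary(df):
    """Per-column count and share of missing values, most affected column first."""
    counts = df.isna().sum()
    summary = pd.DataFrame({"n_missing": counts, "share_missing": counts / len(df)})
    return summary.sort_values("n_missing", ascending=False)


def drop_sparse_columns(df, max_share=0.5):
    """Keep only columns whose share of missing values is at most max_share.

    The 0.5 default is an arbitrary, hypothetical threshold.
    """
    keep = df.columns[df.isna().mean() <= max_share]
    return df[keep]


def plot_missing(df):
    """Draw the bar plot, matrix plot and dendrogram of missingness."""
    # Imported here so the helpers above work even without the plotting libraries.
    import matplotlib.pyplot as plt
    import missingno as msno

    msno.bar(df)         # number of non-missing values per column
    msno.matrix(df)      # row-by-row pattern of missing values
    msno.dendrogram(df)  # clusters columns with similar missingness
    plt.show()


# Toy frame with made-up values, reusing three of the column names.
toy = pd.DataFrame({
    "DISTRICT": ["A1", None, "B2", "C6"],
    "SHOOTING": [np.nan, np.nan, "Y", np.nan],
    "HOUR": [1, 13, 22, 7],
})

summary = missing_summary(toy)
reduced = drop_sparse_columns(toy, max_share=0.5)
print(summary)
print(list(reduced.columns))
```

Calling `plot_missing(df)` on the full dataset would produce the three figures. The numeric summary is a useful companion because it gives exact shares that the plots only show visually.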