This repository contains a data analysis project focused on the US Accident Analysis dataset available on Kaggle. The dataset encompasses information about accidents in the United States between 2016 and 2021.
The primary objective of this project is to conduct an exploratory data analysis (EDA) to uncover patterns and trends in accidents throughout the years. To better understand the data and identify correlations and patterns, the project also incorporates visualizations.
Note: The dataset and the us_accidents_analysis.ipynb notebook are too large to be displayed directly on GitHub. Please visit the provided Kaggle link to access the dataset, and download the us_accidents_analysis.ipynb file to view the code.
The dataset employed in this project can be found on Kaggle via this link:
The dataset contains 2.25 million accident records from the United States between 2016 and 2021, with details such as location, severity, and other accident-related information.
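As a rough illustration of a first EDA step on data like this, the sketch below counts accident records per year with Pandas. The column names `Start_Time` and `Severity`, the helper `accidents_per_year`, and the tiny inline sample are assumptions made for illustration; they are not taken from the notebook, and the real column names should be checked against the Kaggle CSV.

```python
import pandas as pd


def accidents_per_year(df: pd.DataFrame, time_col: str = "Start_Time") -> pd.Series:
    """Count accident records per calendar year.

    `time_col` is an assumed column name holding the accident timestamp.
    """
    # Unparseable timestamps become NaT and are dropped before counting.
    years = pd.to_datetime(df[time_col], errors="coerce").dt.year
    return years.dropna().astype(int).value_counts().sort_index()


# Made-up sample rows standing in for the real dataset (hypothetical values).
sample = pd.DataFrame(
    {
        "Start_Time": [
            "2016-02-08 05:46:00",
            "2016-03-01 17:10:00",
            "2019-07-15 08:30:00",
            "2021-12-24 18:05:00",
        ],
        "Severity": [3, 2, 2, 4],
    }
)

yearly = accidents_per_year(sample)
print(yearly)
```

With the full dataset, the same function could be applied to the frame returned by `pd.read_csv` on the downloaded file, provided the timestamp column is named as assumed here.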
- Python 3.8
- Jupyter Notebook
- Pandas
- Matplotlib
- Seaborn
This analysis identified patterns and trends in accidents within the United States from 2016 to 2021. Key findings included a higher prevalence of accidents in urban areas, during peak traffic hours, and under clear weather conditions. Additionally, accident severity was found to be higher on highways and interstates.
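Findings of this kind typically come from simple aggregations. The sketch below shows one way such breakdowns might be computed: accident counts by hour of day, counts by weather condition, and mean severity per group. The column names (`Start_Time`, `Weather_Condition`, `Severity`), the helper functions, and the sample rows are hypothetical and only illustrate the technique; they do not reproduce the notebook's code or its results.

```python
import pandas as pd


def hourly_counts(df: pd.DataFrame, time_col: str = "Start_Time") -> pd.Series:
    """Count accidents per hour of day (0-23); `time_col` is an assumed name."""
    hours = pd.to_datetime(df[time_col], errors="coerce").dt.hour
    return hours.value_counts().sort_index()


def mean_severity_by(
    df: pd.DataFrame, group_col: str, severity_col: str = "Severity"
) -> pd.Series:
    """Average severity per group, highest first."""
    return df.groupby(group_col)[severity_col].mean().sort_values(ascending=False)


# Made-up sample rows (hypothetical values, not real records).
sample = pd.DataFrame(
    {
        "Start_Time": [
            "2017-01-03 08:05:00",
            "2017-01-03 08:40:00",
            "2018-06-11 17:15:00",
            "2020-09-20 02:30:00",
        ],
        "Weather_Condition": ["Clear", "Clear", "Rain", "Clear"],
        "Severity": [2, 3, 2, 4],
    }
)

by_hour = hourly_counts(sample)
by_weather = sample["Weather_Condition"].value_counts()
severity_by_weather = mean_severity_by(sample, "Weather_Condition")

print(by_hour)
print(by_weather)
print(severity_by_weather)
```

The same `mean_severity_by` helper could be pointed at a road-type column to compare severity on highways and interstates against other roads, assuming the dataset exposes such a column or one can be derived from street names.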
In summary, this project provides a foundation for further research on accidents in the United States and offers insights that policymakers can use to devise strategies for reducing accidents and improving road safety.