This project analyzes the NassCDS dataset, which contains information on the causes of car crashes in the USA. The goal is to identify patterns and trends that can inform road safety improvements and help save lives.
- R Programming Language
- RStudio
- ggplot2 and other R packages for data visualization
The NassCDS dataset is publicly available and can be downloaded from the National Highway Traffic Safety Administration (NHTSA) website. The dataset contains detailed information on car crashes in the USA, including the cause of the accident, weather conditions, road type, and many other variables.
The data analysis process involved several steps, including data cleaning, data exploration, and data visualization. The code and documentation for each step can be found in the data-analysis.Rmd file.
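As a rough illustration of what the cleaning and exploration steps can look like in R, here is a minimal sketch. It is not taken from data-analysis.Rmd. The data frame is a toy stand-in with invented values, and the column names (`dead`, `seatbelt`, `ageOFocc`) are an assumption borrowed from the `nassCDS` data frame in the DAAG package; the file downloaded from NHTSA may use different names.

```r
# Toy stand-in for the crash data; all values are invented for illustration.
# Column names are an assumption (they mirror DAAG::nassCDS).
crashes <- data.frame(
  dead     = c("alive", "dead", "alive", "alive", NA, "alive"),
  seatbelt = c("belted", "none", "belted", "none", "belted", "belted"),
  ageOFocc = c(34, 51, NA, 22, 45, 67),
  stringsAsFactors = FALSE
)

# Cleaning: keep only rows with no missing values.
clean <- crashes[complete.cases(crashes), ]

# Cleaning: encode categorical columns as factors with explicit level order.
clean$dead     <- factor(clean$dead, levels = c("alive", "dead"))
clean$seatbelt <- factor(clean$seatbelt, levels = c("none", "belted"))

# Exploration: inspect the structure and per-column summaries.
str(clean)
summary(clean)
```

Dropping incomplete rows is the simplest possible policy and is used here only to keep the sketch short; depending on how much data is missing, imputing or analyzing missingness separately may be more appropriate.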
The analysis of the NassCDS dataset produced several insights that could help improve road safety and reduce the number of car accidents. Key findings include:
- The most common causes of car accidents are distracted driving, drunk driving, and speeding.
- Rural roads are more dangerous than urban roads, with a higher incidence of fatal accidents.
- Seatbelt use is strongly correlated with the likelihood of surviving a car crash.
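One way a relationship such as the seatbelt finding might be checked is to cross-tabulate belt use against outcome and compare survival rates per group. The sketch below uses invented counts purely to show the mechanics; the numbers are not results from this project, and the column names `seatbelt` and `dead` are assumptions.

```r
# Invented records for illustration only; not results from the analysis.
occupants <- data.frame(
  seatbelt = c(rep("belted", 8), rep("none", 4)),
  dead     = c(rep("alive", 7), "dead", rep("alive", 2), rep("dead", 2))
)

# Cross-tabulate belt use (rows) against outcome (columns).
counts <- table(occupants$seatbelt, occupants$dead)

# Row-wise proportions: share alive or dead within each belt group.
rates <- prop.table(counts, margin = 1)
print(round(rates, 3))

# Survival rate for each belt group, as a named numeric vector.
survival <- rates[, "alive"]
```

With these toy counts, the belted group has a survival rate of 0.875 and the unbelted group 0.5. If the real data include sampling weights, an unweighted table like this one would likely need to be replaced by a weighted tabulation.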
Here are some visualizations of the data:
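As a hedged sketch of how a plot of this kind might be produced with ggplot2, the snippet below draws the proportion of outcomes within each seatbelt group. The data are invented for illustration, the column names are assumptions, and the chart is not necessarily one of the figures from the project.

```r
library(ggplot2)

# Invented records for illustration only; not results from the analysis.
occupants <- data.frame(
  seatbelt = c(rep("belted", 8), rep("none", 4)),
  dead     = c(rep("alive", 7), "dead", rep("alive", 2), rep("dead", 2))
)

# Stacked bars rescaled to proportions, so groups of different
# sizes can be compared directly.
p <- ggplot(occupants, aes(x = seatbelt, fill = dead)) +
  geom_bar(position = "fill") +
  labs(x = "Seatbelt use", y = "Proportion of occupants", fill = "Outcome")

print(p)
```

Using `position = "fill"` rather than raw counts is a design choice: it emphasizes the difference in outcome rates between groups instead of the difference in group sizes.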
Overall, this project shows how crash data can reveal trends and patterns relevant to road safety. The insights gained from this analysis could inform policy decisions and public awareness campaigns aimed at reducing the number of car accidents in the USA.