RTA-accident-analyzer-cloud

This repository shows how to use IBM Watson Studio to build a Shiny application that analyses driver behavior and location risk. It also explains the business use case using Cognos Dashboards.

Data Engineering

The original dataset was in Arabic and had to be translated to English. I used Google Sheets and Google Translate to perform the translation.
The dataset contained individual ages, which had to be grouped into 10 age groups. This was done in Excel using the VLOOKUP function.
Accident times also had to be assigned to time groups, and VLOOKUP was used for this as well.
Next, only 6 locations needed to be selected, so Data Refinery was used to filter the dataset down to those locations.
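
The grouping and filtering steps above can be sketched in Python. This is a minimal illustration, not the Excel or Data Refinery workflow itself: the bin edges, the time group names, the location key and the helper names are all assumptions, since the README does not state the actual boundaries or column names. The sorted-edge lookup behaves like VLOOKUP with approximate match.

```python
from bisect import bisect_right

# Hypothetical bin edges: the README states 10 age groups but not their boundaries.
AGE_EDGES = [10, 20, 30, 40, 50, 60, 70, 80, 90]
AGE_LABELS = ["0 - 9", "10 - 19", "20 - 29", "30 - 39", "40 - 49",
              "50 - 59", "60 - 69", "70 - 79", "80 - 89", "90+"]

# Hypothetical time groups; each edge is the hour at which the next group starts.
TIME_EDGES = [5, 8, 12, 17, 21]
TIME_LABELS = ["Night", "Early morning", "Morning", "Afternoon", "Evening", "Night"]


def age_group(age):
    """Map an age in years to its group label via a sorted-edge lookup."""
    return AGE_LABELS[bisect_right(AGE_EDGES, age)]


def time_group(hour):
    """Map an hour of day (0-23) to its time group label."""
    return TIME_LABELS[bisect_right(TIME_EDGES, hour)]


def filter_locations(rows, keep):
    """Keep only rows whose 'location' value (assumed key) is in the chosen set."""
    return [row for row in rows if row["location"] in keep]
```

For example, `age_group(25)` falls in the "20 - 29" group and `time_group(6)` in "Early morning" under these assumed edges.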

Cognos Dashboards

Insights Gained:

Driver Age:

Drivers aged 20 - 29 cause a lot of accidents, and as injury severity increases, the most represented age group shifts to 30 - 39.

Accident Timing:

Most accidents happen in the early morning or the morning. This could be because of the increase in traffic on the roads.

Driver profile analysis:

Most accidents are caused by drivers aged 20 - 29, and most of them are students or students at a military school.
Blue-collar workers such as painters and car drivers, also aged 20 - 29, cause a lot of accidents. Hence safer driving awareness has to be spread in this community.
These drivers also account for the highest number of people who were not wearing a seat belt during the accident.

Location and Cause analysis:

The main kind of accident is hitting another vehicle.
Emirates Road has the highest number of accidents.
The main causes of accidents on Emirates Road are hitting another vehicle and hitting an iron barrier. This could mean there is a design flaw that needs to be manually assessed by the authorities.
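
The dashboard aggregations behind these insights amount to counting accidents per location and per accident type. A minimal sketch of that counting, assuming each record is a dict with hypothetical `location` and `accident_type` keys (the real column names are not given in this README, and the rows below are made up for illustration):

```python
from collections import Counter


def busiest_location(records):
    """Return the location with the most accident records."""
    return Counter(rec["location"] for rec in records).most_common(1)[0][0]


def top_type_per_location(records):
    """Return {location: (most common accident type, count)}."""
    by_location = {}
    for rec in records:
        counts = by_location.setdefault(rec["location"], Counter())
        counts[rec["accident_type"]] += 1
    return {loc: counts.most_common(1)[0] for loc, counts in by_location.items()}


# Made-up rows for illustration only, not taken from the dataset.
sample = [
    {"location": "Road A", "accident_type": "Hitting another vehicle"},
    {"location": "Road A", "accident_type": "Hitting another vehicle"},
    {"location": "Road A", "accident_type": "Hitting an iron barrier"},
    {"location": "Road B", "accident_type": "Hitting an iron barrier"},
]
```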

Modelling - SPSS

Driver Risk Model

Various classification models were tried and tested for this case. Most of them gave very low accuracy; the best one was XG Boost Trees, which gave an accuracy above 75%. Below is the list of the models tested:

XG Boost Trees
Random forest
Neural networks
C5 - decision tree algorithm
Logistic Regression

Before modelling, the data had to be balanced, since the imbalance across injury severity classes was reducing the model accuracy. The dataset was also partitioned into training and testing sets with a 95-5 split.
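
The balancing and partitioning steps can be sketched outside SPSS as follows. This is only an illustration under stated assumptions: the README does not say whether balancing boosted the minority classes or reduced the majority class, so random oversampling is assumed here, and the function names and the `label_key` parameter are hypothetical.

```python
import random
from collections import defaultdict


def oversample(rows, label_key, rng):
    """Duplicate random rows of smaller classes until every class matches the largest."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[label_key]].append(row)
    target = max(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(group)
        # Draw the missing rows with replacement (assumed balancing strategy).
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


def partition(rows, train_fraction, rng):
    """Shuffle a copy of the rows and split it into (training, testing) lists."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rng = random.Random(42)  # fixed seed so the split is reproducible
```

Calling `partition(oversample(rows, "severity", rng), 0.95, rng)` follows the order described above. Note that when oversampling happens before the split, copies of the same row can land in both partitions, which tends to inflate the measured test accuracy.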

The flow created for this model can be retrieved from the flow2.str file in this repository.

Location Risk Model

Various classification models were tried and tested for this case. Most of them gave very low accuracy; the best one was CHAID, which gave an accuracy above 75%. Below is the list of the models tested:

CHAID
Random forest
Neural networks
XG Boost Trees
Logistic Regression
Auto Classifier models

The dataset was partitioned into training and testing sets with a 95-5 split.

The flow created for this model can be retrieved from the flow3.str file in this repository.

Shiny Application

Analysing Driver Profile Risk

Features to be analysed:

Age
Gender
Occupation
Driving license issue date
Car manufactured year

Analysing Location Risk

Features to be analysed:

Accident Time
Accident Location
Weather
Type of accident
Cause of accident
