Skip to content

Latest commit

 

History

History
272 lines (202 loc) · 13.6 KB

README.md

File metadata and controls

272 lines (202 loc) · 13.6 KB

UK Road Crash Analysis

Period of time: from 2021 to 2022

01 - INTRODUCTION

  • This repository contains the datasource, an EDA SQL script, a data cleaning SQL script and an interactive Tableau workbook for visulization.

  • The datasource has 307.973 rows and 19 columns.

  • Link to the Tableau dashboard

image

02- REQUIREMENTS

Details

- **Initial requirements**:

Clients want to create a UK Road Crash dashboard for the years from 2021 to 2022 so that they can have the following insights with the hope in minimizing the loss of lives.

  • KPIs

📌 Total Casualties, categorized into levels of accident severity, which took place after the accident per year

📌 Total Casualties with respect to level of accident severity and total casualties by type of vehicle per year.

📌 Total Casualties with respect to vehicle type per year.

📌 Monthly trend showing a comparison of casualties for the current year and the previous year.

📌 Total Casualties by Road type.

📌 Distribution of total casualties by road surface.

📌 Relation between Casualties by Area, Location & by Day/ Night.

03 - TOOL USED

Details

  • SQL Microsoft Server for EDA and Data Cleaning Process.

  • Tableau Desktop for Data Visualization in parallel with Data Analysis.

04 - STEPS

Details

Step 1 - Data Cleaning by Excel

The raw csv file contained inconsistent data type within column "accident_date", which need to be fixed by using Excel.

Step 2 - EDA using SQL

I started with exploratory data analysis to deeper understand about the data patterns, inconsistences and potential insights. This step drives effective decision-making, aids in data cleaning, and informs appropriate analysis methods for enhanced project outcomes.

Step 3 - Data Cleaning using SQL

This step involves getting rid of data inconsistencies, errors, and duplicates, ensuring the accuracy and reliability of the data I'm using for subsequent analysis.

Step 4 - Data Processing

This step contains generating additional columns by organizing, sorting, and filtering the data to derive valuable insights.

Step 5 - Data Analyzing and Data Visualizing

I conducted analysis and visualization simultaneously to grasp insights quickly, validate results, communicate effectively, and refine hypotheses iteratively. It aids in comprehensive understanding and facilitates efficient data exploration and interpretation.

Step 6 - Dashboard Building

Finally, I translate all the insights into an interactive Tableau Public dashboard, enabling users to engage with the data.

05 - Insights from SQL EDA Script (associated with each KPIs)

Details

Note: Audience can easily find the completed EDA Script in the attached file name: 03_UKRoadCrash_SQLQuery_DataCleaningScript.sql

✅ The Total Casualties occurring post the accident amount to 417.883 and was categorized into 3 groups of the level of accident severity including Slight, Serious and Fatal with retrieved numbers are 351.436, 59.312,7.135; respectively.

SELECT	SUM(number_of_casualties) AS "Total Casualties",  COUNT(DISTINCT accident_index) AS "Total Accidents",
	SUM(CASE WHEN accident_severity = 'Slight' THEN number_of_casualties END) AS "Total Slight Casualties",
	SUM(CASE WHEN accident_severity = 'Serious' THEN number_of_casualties END) AS "Total Serious Casualties",
	SUM(CASE WHEN accident_severity = 'Fatal' THEN number_of_casualties END) AS "Total Fatal Casualties"
FROM [UK-Road-Crash-DB]..uk_road_crashSELECT SUM(number_of_casualties) AS CY_Casualties
FROM [UK-Road-Crash-DB] .. uk_road_crash
  • Result EDA_02

✅ Total Casualties, categorized into levels of accident severity, which took place after the accident, per year

SELECT 
	YEAR(accident_date) AS "Year",
	SUM(number_of_casualties) AS "Total Casualties",
	COUNT(DISTINCT accident_index) AS "Total Accidents",
	SUM(CASE WHEN accident_severity = 'Slight' THEN number_of_casualties END) AS "Total Slight Casualties",
	SUM(CASE WHEN accident_severity = 'Serious' THEN number_of_casualties END) AS "Total Serious Casualties",
	SUM(CASE WHEN accident_severity = 'Fatal' THEN number_of_casualties END) AS "Total Fatal Casualties" 
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY YEAR(accident_date)
  • Result

EDA_05

✅ Total Casualties with respect to each accident severity and total casualties by each type of vehicle per year. Severity Level: Slight

SELECT	YEAR(accident_date) AS "Year", 
	SUM(CASE WHEN accident_severity = 'Slight' AND vehicle_type LIKE 'Agricultural%'THEN number_of_casualties END) AS "Slight Casualties by Agricultural Vehicle",
	SUM(CASE WHEN accident_severity = 'Slight' AND vehicle_type LIKE 'Bus%' OR vehicle_type LIKE 'bus%' THEN number_of_casualties END) AS "Slight Casualties by Bus",
	SUM(CASE WHEN accident_severity = 'Slight' AND vehicle_type LIKE 'Car%' OR vehicle_type LIKE 'car%' THEN number_of_casualties END) AS "Slight Casualties by Car",
	SUM(CASE WHEN accident_severity = 'Slight' AND vehicle_type LIKE 'Motorcycle%' THEN number_of_casualties END) AS "Slight Casualties by Motorcycle",
	SUM(CASE WHEN accident_severity = 'Slight' AND vehicle_type LIKE 'Other%' OR vehicle_type LIKE 'Pedal%'OR vehicle_type LIKE 'missing%' OR vehicle_type LIKE 'horse%'THEN number_of_casualties END)           AS "Slight Casualties by Others",
	SUM(CASE WHEN accident_severity = 'Slight' AND vehicle_type LIKE 'Goods%' THEN number_of_casualties END) AS "Slight Casualties by Van"
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY YEAR(accident_date)
  • Result:

EDA_09

Severity Level: Serious

SELECT	YEAR(accident_date) AS "Year", 
	SUM(CASE WHEN accident_severity = 'Serious' AND vehicle_type LIKE 'Agricultural%'THEN number_of_casualties END) AS "Serious Casualties by Agricultural Vehicle",
	SUM(CASE WHEN accident_severity = 'Serious' AND vehicle_type LIKE 'Bus%' OR vehicle_type LIKE 'bus%' THEN number_of_casualties END) AS "Serious Casualties by Bus",
	SUM(CASE WHEN accident_severity = 'Serious' AND vehicle_type LIKE 'Car%' OR vehicle_type LIKE 'car%' THEN number_of_casualties END) AS "Serious Casualties by Car",
	SUM(CASE WHEN accident_severity = 'Serious' AND vehicle_type LIKE 'Motorcycle%' THEN number_of_casualties END) AS "Serious Casualties by Motorcycle",
	SUM(CASE WHEN accident_severity = 'Serious' AND vehicle_type LIKE 'Other%' OR vehicle_type LIKE 'Pedal%'OR vehicle_type LIKE 'missing%' OR vehicle_type LIKE 'horse%'THEN number_of_casualties END)          AS "Serious Casualties by Others",
	SUM(CASE WHEN accident_severity = 'Serious' AND vehicle_type LIKE 'Goods%' THEN number_of_casualties END) AS "Serious Casualties by Van"
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY YEAR(accident_date)
  • Result:

EDA_10

Severity Level: Fatal

SELECT	YEAR(accident_date) AS "Year", 
	SUM(CASE WHEN accident_severity = 'Fatal' AND vehicle_type LIKE 'Agricultural%'THEN number_of_casualties END) AS "Fatal Casualties by Agricultural Vehicle",
	SUM(CASE WHEN accident_severity = 'Fatal' AND vehicle_type LIKE 'Bus%' OR vehicle_type LIKE 'bus%' THEN number_of_casualties END) AS "Fatal Casualties by Bus",
	SUM(CASE WHEN accident_severity = 'Fatal' AND vehicle_type LIKE 'Car%' OR vehicle_type LIKE 'car%' THEN number_of_casualties END) AS "Fatal Casualties by Car",
	SUM(CASE WHEN accident_severity = 'Fatal' AND vehicle_type LIKE 'Motorcycle%' THEN number_of_casualties END) AS "Fatal Casualties by Motorcycle",
	SUM(CASE WHEN accident_severity = 'Fatal' AND vehicle_type LIKE 'Other%' OR vehicle_type LIKE 'Pedal%'OR vehicle_type LIKE 'missing%' OR vehicle_type LIKE 'horse%'THEN number_of_casualties END)          AS "Fatal Casualties by Others",
	SUM(CASE WHEN accident_severity = 'Fatal' AND vehicle_type LIKE 'Goods%' THEN number_of_casualties END) AS "Fatal Casualties by Van"
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY YEAR(accident_date)
  • Result: image

✅ Total Casualties with respect to vehicle type per year.

SELECT	YEAR(accident_date) AS "Year", 
	SUM(CASE WHEN vehicle_type LIKE 'Agricultural%'THEN number_of_casualties END) AS "Total Casualties by Agricultural Vehicle",
	SUM(CASE WHEN vehicle_type LIKE 'Bus%' OR vehicle_type LIKE 'bus%' THEN number_of_casualties END) AS "Total Casualties by Bus",
	SUM(CASE WHEN vehicle_type LIKE 'Car%' OR vehicle_type LIKE 'car%' THEN number_of_casualties END) AS "Total Casualties by Car",
	SUM(CASE WHEN vehicle_type LIKE 'Motorcycle%' THEN number_of_casualties END) AS "Total Casualties by Motorcycle",
	SUM(CASE WHEN vehicle_type LIKE 'Other%' OR vehicle_type LIKE 'Pedal%'OR vehicle_type LIKE 'missing%' OR vehicle_type LIKE 'horse%'THEN number_of_casualties END) AS "Total Casualties by Others",
	SUM(CASE WHEN vehicle_type LIKE 'Goods%' THEN number_of_casualties END) AS "Total Casualties by Van"
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY YEAR(accident_date)
  • Result: image

✅ Monthly trend showing a comparison of casualties for the current year and the previous year.

SELECT Year(accident_date) AS "Year", DATENAME(MONTH, accident_date) AS Month_Name, SUM(number_of_casualties) AS total_casualties
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY Year(accident_date), DATENAME(MONTH, accident_date)
ORDER BY "Year"
  • Result:

    image

✅ Total Casualties by Road type.

SELECT	YEAR(accident_date) AS "Year",
	SUM(CASE WHEN road_type LIKE 'Single%' THEN number_of_casualties END) AS "Single Carriagetway",
	SUM(CASE WHEN road_type LIKE 'Dual%' THEN number_of_casualties END) AS "Dual Carriagetway",
	SUM(CASE WHEN road_type LIKE 'Roundabout%' THEN number_of_casualties END) AS "Roundabout",
	SUM(CASE WHEN road_type LIKE 'one%' THEN number_of_casualties END) AS "One way street",
	SUM(CASE WHEN road_type LIKE 'Slip%' THEN number_of_casualties END) AS "Slip road",
	SUM(CASE WHEN road_type IS NULL THEN number_of_casualties END) AS "Null",
	SUM(number_of_casualties) AS Total_Casualties
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY YEAR(accident_date)
  • Result

image

✅ Distribution of total casualties by road surface condition

SELECT	YEAR(accident_date) AS "Year",
	ROUND(SUM(CASE WHEN road_surface_conditions LIKE 'Dry%' THEN number_of_casualties END)/SUM(number_of_casualties)*100,2) AS "Dry",
	ROUND(SUM(CASE WHEN road_surface_conditions LIKE 'Frost%' OR road_surface_conditions LIKE 'Snow%' THEN number_of_casualties END)/SUM(number_of_casualties)*100,2) AS "Frost/Snow",
	ROUND(SUM(CASE WHEN road_surface_conditions LIKE 'Flood%' OR road_surface_conditions LIKE 'Wet%' THEN number_of_casualties END)/SUM(number_of_casualties)*100,2) AS "Wet",
	ROUND(SUM(CASE WHEN road_surface_conditions IS NULL THEN number_of_casualties END)/SUM(number_of_casualties)*100,2) AS "Unknown",
	SUM(number_of_casualties) AS Total_Casualties
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY YEAR(accident_date)
  • Result:

    image

✅ Relation between Casualties by Area.

SELECT	YEAR(accident_date) AS "Year",
	ROUND(SUM(CASE WHEN urban_or_rural_area = 'Rural' THEN number_of_casualties END)/SUM(number_of_casualties)*100,2) AS "PCT_Rural",
	ROUND(SUM(CASE WHEN urban_or_rural_area = 'Urban' THEN number_of_casualties END)/SUM(number_of_casualties)*100,2) AS "PCT_Urban"
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY Year(accident_date)
  • Result:

image

✅ Relation between Casualtiesby by Day/ Night.

SELECT	YEAR(accident_date) AS "Year",
	ROUND(SUM(CASE WHEN light_conditions LIKE 'Day%' THEN number_of_casualties END)/SUM(number_of_casualties)*100,2) AS "PCT_Day",
	ROUND(SUM(CASE WHEN light_conditions NOT LIKE 'Day%' THEN number_of_casualties END)/SUM(number_of_casualties)*100,2) AS "PCT_Night"
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY Year(accident_date)
  • Result:

image

✅ Relation between Casualtiesby by Location.

SELECT	TOP 10
		    local_authority, SUM(number_of_casualties) AS Total_Casualties
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY local_authority
ORDER BY Total_Casualties DESC
  • Result:

image