Local Traffic, Statistical Summaries and Inference.

Monirah abdulaziz
10 min read · Jan 2, 2021



Problem Statement

The Kingdom of Saudi Arabia (KSA) is one of the largest countries in the Arab world and a member of the “Group of Twenty” (G-20) major world economies. The number of motor vehicles in the country has increased rapidly since the oil boom of the early 1970s, and the road network and transport infrastructure have expanded with it.
In Saudi Arabia the car is the main means of transportation; it provides the flexibility and freedom that people value. At the same time, road traffic accidents are one of the most critical public health problems worldwide.
The WHO Global Status Report on Road Safety reports that the annual fatality rate due to traffic accidents in the KSA increased from 17.4 to 27.4 per 100,000 people over the last decade, which is the worst among the countries in the region and well above the death rates of other G-20 nations such as the USA, the United Kingdom, Japan, and Australia. The economic losses due to traffic accidents are estimated at approximately 4.3% of the KSA’s GDP. A study conducted by Turki et al. suggests that more than 19 people lose their lives every day, and approximately 4 people are injured every hour, in road traffic accidents on KSA roads.

This article looks at the number of traffic accidents and driving licenses issued in Saudi Arabia (2016–2017), identifies trends in the data, and combines the provided datasets with outside research to identify factors likely to influence the outcomes of traffic accidents in the various regions of Saudi Arabia.


In this project, two datasets were provided:

  1. Driving Licenses
    This dataset contains Saudi Arabia driving licenses issued by administrative area from 1993–2016. Data from the General Authority for Statistics.

  2. Traffic Accidents and Casualties
    This dataset contains Saudi Arabia traffic accidents and casualties by region for 2016–2017. Data from the General Authority for Statistics.

Data Import & Cleaning

This step identifies, diagnoses, and treats a variety of dirty data and shapes it into a form suitable for statistical analysis.

This is what the data looked like after the cleaning process:

merged cleaned dataframe.
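
A minimal sketch of what a cleaning and merging step of this kind might look like in pandas. The column names, the toy rows, and the specific problems handled here (years wrapped in parentheses, stray whitespace in region names, thousands separators stored as text) are assumptions for illustration, not the actual project data or the cleaning code used for it.

```python
import pandas as pd

# Toy rows standing in for the two raw files; all values are made up.
licenses = pd.DataFrame({
    "year": ["(2016)", "(2017)"],
    "region": [" Makkah", "Makkah "],
    "driving_licenses": ["1,200", "1,350"],
})
accidents = pd.DataFrame({
    "year": ["(2016)", "(2017)"],
    "region": ["Makkah", "Makkah"],
    "accidents": ["10,500", "9,800"],
})

def clean(df, value_col):
    """Return a copy with an integer year, a trimmed region, and a numeric value column."""
    out = df.copy()
    out["year"] = out["year"].str.strip("()").astype(int)
    out["region"] = out["region"].str.strip()
    out[value_col] = out[value_col].str.replace(",", "", regex=False).astype(int)
    return out

# Join the two cleaned tables on the keys they share.
merged = clean(licenses, "driving_licenses").merge(
    clean(accidents, "accidents"), on=["year", "region"], how="inner"
)
print(merged)
```

An inner join keeps only the region and year pairs present in both tables, which is one reasonable choice when the two datasets cover different year ranges.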

Exploratory Data Analysis

Summary statistics give a quick overview of each numeric feature.
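
As a hedged illustration, pandas can produce such an overview with `DataFrame.describe()`. The frame below is a toy example with made-up values and assumed column names, not the project data.

```python
import pandas as pd

# Made-up values; "region" is text, so describe() skips it by default.
df = pd.DataFrame({
    "region": ["A", "B", "C", "D"],
    "accidents": [100, 200, 300, 1400],
    "dead": [5, 10, 15, 70],
})

# Count, mean, std, min, quartiles, and max for every numeric column.
summary = df.describe()
print(summary)
```

Comparing the `mean` row with the `50%` (median) row is a quick first hint of skew: here the mean of `accidents` is pulled well above the median by the single large value.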

Visualize the data

Visualizing the data lets us convey the findings quickly, even to a non-technical audience, and it often reveals trends that escape us when we look only at the numbers.

Some plots used in this project:

The above heatmap illustrates the strong positive correlation between the number of accidents and the numbers of deaths and injuries: regions and years with more accidents also record more injuries and more deaths. Correlation alone does not show that one causes the other.
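
A correlation heatmap is a colored rendering of a correlation matrix. The sketch below computes such a matrix with pandas on made-up values; the column names are assumptions, and Pearson correlation is simply the pandas default, not necessarily the measure behind the plot above.

```python
import pandas as pd

# Made-up counts for five hypothetical region-year rows.
df = pd.DataFrame({
    "accidents": [100, 200, 300, 400, 500],
    "injured": [12, 19, 33, 41, 48],
    "dead": [3, 5, 6, 9, 11],
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()
print(corr.round(3))
```

Each cell lies between -1 and 1, the diagonal is 1, and the matrix is symmetric, which is why heatmaps of it often show only one triangle.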

A histogram represents the distribution of numerical data. All of the above histograms are right-skewed: each has a peak left of center and a more gradual tapering to the right side of the graph. The data is unimodal, with the mode close to the left of the graph and smaller than both the median and the mean. The mean of right-skewed data lies to the right of the peak and is greater than both the median and the mode. This shape indicates that a number of data points, perhaps outliers, are much larger than the mode.

Car accidents in 2016 and 2017 killed more than 16,500 people and caused over 80,000 injuries. The most deaths occurred in Makkah (3,884), followed by Riyadh (2,829) and the Eastern region (2,076), while the fewest occurred in Northern Border (305), Al-Baha (319), and Najran (367).

The most injuries also occurred in Makkah (23,006), a very high number compared to other regions, followed by the Eastern region (8,966) and Riyadh (8,747), while the fewest occurred in Northern Border (1,005), Najran (1,472), and Hail (1,705).
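
A ranking like this can be reproduced with a few lines of pandas. The sketch below uses only the six death totals quoted above (the remaining regions are not listed in this article, so they are left out), and the variable names are assumptions.

```python
import pandas as pd

# Death totals for 2016 and 2017 combined, as quoted in the text above.
deaths = pd.Series({
    "Makkah": 3884,
    "Riyadh": 2829,
    "Eastern Region": 2076,
    "Najran": 367,
    "Al-Baha": 319,
    "Northern Border": 305,
})

top3 = deaths.nlargest(3)      # regions with the most deaths
bottom3 = deaths.nsmallest(3)  # regions with the fewest deaths
print(top3)
print(bottom3)
```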

A scatter plot displays the values of two variables for a set of data. This scatter plot compares the traffic accidents of 2016 with those of 2017. The total number of accidents in 2017 was 13.6% lower than in 2016. The maximum number of accidents in 2016 was in Riyadh, while the maximum in 2017 was in Makkah. Najran and Al-Baha were the lowest in both years.

A boxplot is a standardized way of displaying the distribution of data based on a five-number summary (“minimum”, first quartile (Q1), median, third quartile (Q3), and “maximum”).

The above boxplots illustrate the central tendency and spread of the dead and injured variables. Some outliers are shown: the highest outlier in the dead boxplot is the number of deaths in Makkah (2,243), while the two highest outliers in the injured boxplot are both in Makkah (12,383 and 10,623 injuries).
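
To make the five-number summary and the notion of an outlier concrete, here is a small sketch on made-up values. It assumes the common 1.5 × IQR rule for flagging outliers, which is the default in many plotting libraries but is not confirmed as the rule behind the plots above.

```python
import numpy as np

# Made-up values with one obviously large observation.
values = np.array([10, 12, 14, 15, 18, 20, 22, 95])

q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

# Points beyond 1.5 * IQR from the quartiles are drawn as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]

# The whiskers end at the most extreme points that are not outliers.
inside = values[(values >= lower_fence) & (values <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

print(q1, median, q3, outliers, whisker_low, whisker_high)
```

This is why “minimum” and “maximum” are usually written in quotes for boxplots: the whisker ends are the smallest and largest values inside the fences, not necessarily the true extremes of the data.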

Descriptive and Inferential Statistics

Descriptive statistics describe data (for example, with a chart or graph), while inferential statistics allow us to make predictions (“inferences”) from that data. With inferential statistics, we take data from samples and make generalizations about a population.

The above histogram shows that the number of driving licenses is positively skewed. A positively skewed distribution is one whose right tail is longer or fatter than its left. When a distribution is positively skewed, the mean is typically greater than the median, which is greater than the mode.
Positive skew: mode < median < mean
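
The ordering can be checked numerically. The sketch below uses a small made-up series with a long right tail; the values are assumptions chosen for illustration only.

```python
import pandas as pd

# Made-up, right-skewed values: most are small, one is very large.
values = pd.Series([2, 3, 3, 4, 5, 6, 8, 30])

mode = values.mode().iloc[0]   # most frequent value
median = values.median()       # middle of the sorted values
mean = values.mean()           # pulled to the right by the large value
skew = values.skew()           # sample skewness; positive for a right tail

print(mode, median, mean, round(skew, 2))
```

The ordering is a rule of thumb rather than a guarantee, so checking the skewness coefficient alongside it is a safer habit.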

Although the previous variables are not normally distributed, the central limit theorem (CLT) still lets us use normal-based inference on their means. It tells us that, even if a population distribution is non-normal, the sampling distribution of the sample mean is approximately normal when the sample size is large enough (a common rule of thumb is at least 30 observations per sample).
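
A short simulation can illustrate the theorem. The sketch below assumes a synthetic, strongly right-skewed population (exponential), not the traffic data, and the helper `skewness` is a hypothetical name introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic right-skewed population; an assumption for illustration.
population = rng.exponential(scale=1.0, size=100_000)

# Draw 5,000 samples of size n = 30 (with replacement) and average each one.
n = 30
sample_means = rng.choice(population, size=(5_000, n)).mean(axis=1)

def skewness(x):
    """Population skewness: third central moment over the cubed standard deviation."""
    x = np.asarray(x, dtype=float)
    return np.mean((x - x.mean()) ** 3) / x.std() ** 3

print("population skewness:  ", round(skewness(population), 2))
print("sample-mean skewness: ", round(skewness(sample_means), 2))
print("std of sample means:  ", round(sample_means.std(), 3))
print("sigma / sqrt(n):      ", round(population.std() / np.sqrt(n), 3))
```

The sample means cluster around the population mean with a spread close to sigma / sqrt(n), and their distribution is far less skewed than the population, although at n = 30 some skew usually remains for a population this asymmetric.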

Investigate trends in the data

This section examines the data to see the overall trend of driving licenses by year (1993–2017) and of traffic accident numbers for 2016–2017.

Riyadh took the first four places in terms of the number of driving licenses issued during 1993–2017, followed by Makkah.

In terms of the number of driving licenses issued during 1993–2017, Tabouk and Al-Baha take the lowest places, with 915 and 997 respectively.

The above is part of the table showing the regions that issued more driving licenses in a given year than that year’s average. During 1993–2017, these regions were Makkah (every year), Riyadh (21 years), Eastern (21 years), Al-Qaseem (5 years), and Assir and Hail (one year each). Makkah, Riyadh, Eastern, and Al-Qaseem are the most populous administrative regions. Hail was slightly above the mean in 2016, and Assir in 2015.

The above table shows the regions where more accidents occurred in a given year than the yearly average during 2016–2017. The highest number of accidents was in Makkah, with 145,541 accidents in 2017, followed by Riyadh, with 141,736 accidents in 2016.
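
One way to build tables like these is to compare each row with the mean of its own year. The sketch below is an assumption about how that could be done in pandas; the region labels and counts are made up.

```python
import pandas as pd

# Made-up counts for three hypothetical regions over two years.
df = pd.DataFrame({
    "year": [2016, 2016, 2016, 2017, 2017, 2017],
    "region": ["A", "B", "C", "A", "B", "C"],
    "accidents": [900, 600, 150, 800, 350, 50],
})

# transform("mean") repeats each year's mean on every row of that year.
yearly_mean = df.groupby("year")["accidents"].transform("mean")

# Keep only the rows that exceed their own year's average.
above = df[df["accidents"] > yearly_mean]

# Number of years in which each region was above the yearly average.
years_above = above.groupby("region")["year"].nunique()
print(above)
print(years_above)
```

Using `transform` keeps the result aligned with the original rows, so the comparison needs no explicit merge.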

Outside Research

Outside research on provincial and central policies can help explain the trends in the number of driving licenses and traffic accident rates. The data comes from the General Authority for Statistics and the Statistical Centre for the GCC.

1. Vision 2030
Various preventive measures have been adopted to reduce the burden of traffic accidents in Saudi Arabia. The Ministry of the Interior (MoI) has set a strong strategy to reduce traffic accidents as part of the government’s Vision 2030 program.

As shown in the above plot, the total number of accidents dropped from over 460,000 in 2017 to 287,781 in 2019, a decrease of almost 38%.

2. Increased demand for driving licenses

As shown in the below plot, the total number of driving licenses is rising: it went from 928,165 in 2017 to 1,026,569 in 2019, an increase of 10.6%.
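
The percentages quoted for accidents and licenses follow from the usual percent-change formula, (new − old) / old × 100. The sketch below applies it to the figures in the text; `percent_change` is a hypothetical helper name, and 460,000 is used as a stand-in for “over 460,000”, so the computed drop in accidents is a slight underestimate.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100

# 2017 -> 2019 totals as quoted in the text.
accidents_change = percent_change(460_000, 287_781)   # about -37.4
licenses_change = percent_change(928_165, 1_026_569)  # about +10.6

print(f"accidents: {accidents_change:.1f}%")
print(f"licenses:  {licenses_change:.1f}%")
```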

3. The development of the number of death cases in the GCC countries

As shown in the above plot, the GCC occupied the penultimate position, followed by Malaysia, while Japan, Australia, and the UK were at the forefront with the lowest traffic accident mortality rates.

The following policies and engineering measures from these leading countries help explain their good performance:


In Japan, the Traffic Safety Policies Law requires the government to report to the Diet each year on the status of traffic crashes, on measures being implemented, and on plans for traffic safety measures. This is contained in the ‘White paper on traffic safety in Japan’:
- The aim is a crash-free society.
- Deal with the issue of human error in public transportation by improving the organizational structures and systems of companies providing transport services.
- Encourage participatory traffic safety activities by enabling citizens to participate in the planning stages of traffic safety measures run by national and local authorities.


In the UK, the White Paper ‘A New Deal for Transport: Better for Everyone’ made it clear that merely building more new roads is not the answer. The emphasis is now on making the best use of the existing network, giving priority to treating the places with the worst safety, congestion, and environmental records. Key elements of the approach include the recognition that good engineering reduces the risk of accidents.

Key takeaways and recommendations

Rapid development, an increased number of vehicles, and population growth are all contributing to a rise in the number of road accidents, injuries, and fatalities. Although road safety remains a low priority for most governments in the developing world, there is a growing awareness of the social, economic, and public health problems caused by traffic accidents.

  • The total number of accidents is falling: it dropped from over 460,000 in 2017 to 287,781 in 2019, a decrease of almost 38%. This was achieved through the National Transformation Program and the Kingdom’s Vision 2030, with the participation of the relevant sectors, which helps unify efforts and speed up progress.
  • The total number of driving licenses is rising: it went from 928,165 in 2017 to 1,026,569 in 2019, an increase of 10.6%. The number of licenses is positively correlated with the population, and the increase also follows the decision to allow women to drive: on 26 September 2017, King Salman issued an order allowing women to drive in Saudi Arabia.


  • Assess the problem, policies, and institutional settings relating to road traffic injury and the capacity for road traffic injury prevention in each region.
  • Specific actions are needed to prevent road traffic crashes and to minimize their consequences. These actions should be based on sound evidence and analysis of road traffic injuries, be culturally appropriate and tested locally, and form part of the national strategy to address the problem of road accidents, especially in the regions that had the most accidents and deaths: Makkah, Riyadh, and the Eastern province.

Additional Data

  • Driver age in each accident.
  • The cause of each accident:
    — human factors/personnel error.
    — malfunction or failure of vehicle structures or other systems.
    — deficient maintenance.
    — hazardous environment involving weather, animals, birds, etc.
  • Whether the driver held a valid driver’s license.
  • Whether the driver was driving under the influence (DUI).


