Skip to content

A comprehensive data analysis of 120 years of modern Olympic history (1896-2016) using Excel. The project explores trends in athlete participation, country representation, and gender dynamics, with an interactive dashboard for visual insights.

Notifications You must be signed in to change notification settings

Deepubhatt/Olympic-History-Analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 

Repository files navigation

pexels-pixabay-236937 new

120 Years of Olympic History Analysis

In this project, I analyzed the historical data from the modern Olympic Games to identify trends and patterns in athlete participation, country representation, and overall performance across different editions of the Olympics. The goal of this analysis was to gain a deeper understanding of how the participation of athletes and countries has evolved over the years.

Dataset

The "120 Years of Olympic History" dataset was sourced from Kaggle. It contains comprehensive historical data on the modern Olympic Games, spanning from the inaugural Athens 1896 Olympics to Rio 2016.

Each row in the dataset represents an individual athlete competing in a specific event and includes details such as the athlete’s name, age, sex, country, medal, and more.

Data Pre-processing and Cleaning

After loading the dataset into Excel, I utilized Power Query to clean and preprocess the data:

  • Removed Unnecessary Columns: Deleted columns like "ID" and "NOC" that were not relevant to the analysis.
  • Corrected Data Types: Converted data types for certain columns (e.g., changing integer columns incorrectly marked as strings).
  • Feature Engineering: Created an additional "Edition" column to uniquely identify each Olympic session by city.

To handle missing values in the age, height, and weight columns, I filled them with the respective mean values, grouped by the event category, as some events have specific combinations for these attributes. Below is an example formula used to fill null values in the "Age" column:

Age_filled

Similar methods were applied to the weight and height columns.

Finally, I have below a sample of stock uncleaned data, followed by the cleaned data.

Uncleaned Data: Uncleaned Data

Cleaned Data: Cleaned Dataset first 20 rows

Exploratory Data Analysis (EDA)

Using pivot tables in Excel, I explored the following aspects:

  • Trends in athlete and country participation over time.

  • Distribution of age, height, and weight among athletes.

Age

Height

Weight

  • Top athletes and countries based on the number of medals won.

  • The ratio of male to female participants over the years.

Gender

  • Cities that hosted the most Olympic editions.

Dashboard Creation

I created a comprehensive dashboard to provide an overview of the analysis:

Final Dashboard

  • Key Performance Indicators (KPIs): Measures for total participants, total medals, and total sports were added to highlight overall performance.

  • Filters for Multilevel Views:

    • Season: To differentiate between the Summer and Winter Olympics.
    • Gender: To compare participation and performance between male and female athletes.
    • Year: To analyze trends over different periods.

Filters in action: Dashboard with Filters

About

A comprehensive data analysis of 120 years of modern Olympic history (1896-2016) using Excel. The project explores trends in athlete participation, country representation, and gender dynamics, with an interactive dashboard for visual insights.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published