- Executive Summary
- Project Goals
- Data Sources
- Steps Performed
- Key Insights
- Visualizations and Results
- Quantitative Highlights
- Next Steps
This project analyzes 2019 ultra-marathon race data from the USA to uncover trends in participant demographics, performance, and event characteristics. Insights focus on race distances, athlete performance, and gender-specific trends.
- Clean and preprocess a large historical dataset of ultra-marathon events.
- Explore gender differences in participation and performance.
- Identify top-performing athletes and races based on speed and completion time.
- Analyze seasonal trends and how event distances influence athlete outcomes.
- Provide actionable insights to organizers, participants, and enthusiasts.
- Dataset size: 7,461,195 records with 13 features.
- Focused on USA races from 2019, filtering to standardized distances: 50km, 50mi, 100km, and 100mi.
- Data Cleaning:
  - Filtered for USA races and selected standard distances.
  - Addressed missing values and corrected data types.
- Data Transformation:
  - Extracted useful fields like athlete age and country.
  - Converted performance times into numeric formats.
- Visualization and Statistical Analysis:
  - Explored demographic trends, gender comparisons, and seasonal participation.
  - Analyzed top-performing events and athletes.
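
The cleaning and transformation steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the field names (`year`, `country`, `distance`, `performance`) and the `H:MM:SS h` time format are assumptions, and the sample rows are made up.

```python
import re

# Standard distances named in this README.
STANDARD_DISTANCES = {"50km", "50mi", "100km", "100mi"}

def parse_hours(performance):
    """Convert 'H:MM:SS' (optionally followed by ' h') to decimal hours.

    Returns None when the value is missing or does not match the assumed format.
    """
    match = re.fullmatch(r"\s*(\d+):(\d{2}):(\d{2})(?:\s*h)?\s*", performance or "")
    if match is None:
        return None
    hours, minutes, seconds = (int(part) for part in match.groups())
    return hours + minutes / 60 + seconds / 3600

def clean(rows):
    """Keep 2019 USA rows at standard distances that have a parseable time."""
    cleaned = []
    for row in rows:
        if row.get("year") != "2019" or row.get("country") != "USA":
            continue
        if row.get("distance") not in STANDARD_DISTANCES:
            continue
        hours = parse_hours(row.get("performance"))
        if hours is None:  # drop rows with missing or malformed times
            continue
        cleaned.append({**row, "hours": hours})
    return cleaned

# Made-up sample rows (hypothetical field names and values).
sample = [
    {"year": "2019", "country": "USA", "distance": "50km", "performance": "2:49:48 h"},
    {"year": "2019", "country": "USA", "distance": "45km", "performance": "4:00:00 h"},
    {"year": "2018", "country": "USA", "distance": "50km", "performance": "5:00:00 h"},
    {"year": "2019", "country": "FRA", "distance": "100km", "performance": "9:30:00 h"},
    {"year": "2019", "country": "USA", "distance": "100mi", "performance": ""},
]
result = clean(sample)
print(len(result), round(result[0]["hours"], 2))
```

Dropping rows with unparseable times is only one way to handle missing values; the original analysis may have treated them differently.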
- Demographics:
  - Male athletes dominate participation, especially in longer races.
  - Average participant age: 41 years.
- Performance Highlights:
  - Fastest 50km finish time: 2.83 hours.
  - Best-performing event: Caumsett Park 50K Championships.
- Race Trends:
  - 50km and 50mi races are the most popular.
  - Spring and Summer seasons see the highest participation.
- Gender-Specific Insights:
  - On average, men are faster than women at every distance analyzed.
  - Women exhibit more consistent speed in races.
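
A gender comparison like the one above can be computed by grouping speeds by distance and gender. The sketch below is illustrative only: the field names (`distance`, `gender`, `speed_kmh`) are assumptions, the records are synthetic, and the coefficient of variation is just one possible measure of "consistency" (the README does not say which measure was used).

```python
from collections import defaultdict
from statistics import mean, pstdev

def speed_summary(records):
    """Return count, mean speed, and coefficient of variation per (distance, gender)."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["distance"], rec["gender"])].append(rec["speed_kmh"])

    summary = {}
    for key, speeds in groups.items():
        avg = mean(speeds)
        # Coefficient of variation: spread relative to the mean (lower = more consistent).
        summary[key] = {"n": len(speeds), "mean": avg, "cv": pstdev(speeds) / avg}
    return summary

# Synthetic records for illustration; not values from the dataset.
records = [
    {"distance": "50km", "gender": "M", "speed_kmh": 10.0},
    {"distance": "50km", "gender": "M", "speed_kmh": 6.0},
    {"distance": "50km", "gender": "F", "speed_kmh": 8.0},
    {"distance": "50km", "gender": "F", "speed_kmh": 7.0},
]
summary = speed_summary(records)
for key, stats in sorted(summary.items()):
    print(key, stats)
```

Using a relative measure such as the coefficient of variation keeps the comparison fair when the two groups have different average speeds.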
- Participation by Gender Across Distances: A bar chart showing the number of male and female participants in 50km, 50mi, 100km, and 100mi races.
- Age Distribution of Athletes: A histogram showing the distribution of athletes’ ages, highlighting the concentration around 30-50 years.
- Average Speed by Gender and Distance: A grouped bar chart comparing average speeds of male and female athletes across distances.
- Seasonal Participation: A pie chart displaying the percentage of races held in each season (spring, summer, fall, winter).
- Top 5 Events with Fastest Finish Times: A horizontal bar chart listing events with the fastest average completion times.
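
The percentages behind a seasonal pie chart can be derived from event dates. The following is a sketch under stated assumptions: seasons are taken to be Northern Hemisphere meteorological seasons (March to May is spring, and so on), which may differ from the definition used in the original analysis, and the dates are invented.

```python
from collections import Counter
from datetime import date

def season_of(day):
    """Map a date to a meteorological season (Northern Hemisphere, an assumption)."""
    if day.month in (3, 4, 5):
        return "spring"
    if day.month in (6, 7, 8):
        return "summer"
    if day.month in (9, 10, 11):
        return "fall"
    return "winter"

def season_shares(dates):
    """Return the percentage of events falling in each season."""
    counts = Counter(season_of(d) for d in dates)
    total = sum(counts.values())
    return {season: 100 * count / total for season, count in counts.items()}

# Invented event dates for illustration.
dates = [date(2019, 1, 12), date(2019, 4, 6), date(2019, 4, 27), date(2019, 7, 20)]
shares = season_shares(dates)
print(shares)
```

The resulting dictionary can be passed to any plotting library to draw the chart.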

- Total records in dataset: 7,461,195.
- Filtered dataset size: 88,064 (USA events, 2019).
- Fastest 50km time: 2.83 hours.
- Longest event time: 120 hours (Across the Years event).
- Popular event: AZT Oracle Rumble 50km with 159 participants.
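
A few derived figures follow directly from the numbers above. The short sketch below works them out; the constant and function names are illustrative, and because 2.83 hours is already a rounded value, the clock time and average speed are approximate.

```python
TOTAL_RECORDS = 7_461_195      # full dataset
FILTERED_RECORDS = 88_064      # USA events, 2019
FASTEST_50KM_HOURS = 2.83      # rounded value reported above

def hours_to_hms(hours):
    """Format decimal hours as H:MM:SS."""
    total_seconds = round(hours * 3600)
    h, rem = divmod(total_seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h}:{m:02d}:{s:02d}"

# Share of the full dataset that survives the USA/2019 filter.
share = 100 * FILTERED_RECORDS / TOTAL_RECORDS

# Average speed implied by the fastest 50km finish.
avg_speed = 50 / FASTEST_50KM_HOURS

print(f"Filtered share: {share:.2f}%")
print(f"Fastest 50km: {hours_to_hms(FASTEST_50KM_HOURS)}")
print(f"Average speed: {avg_speed:.1f} km/h")
```

In other words, the filtered subset is a little over 1% of the full dataset, and the fastest 50km finish corresponds to roughly 2 hours 50 minutes.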
- Extend the analysis to multi-year trends and to regions outside the USA.
- Build predictive models for performance based on historical data.
- Create dynamic dashboards to track trends in real time.