Advanced SQL analytics demonstrating window functions, CTEs, and cumulative calculations — techniques directly applicable to financial services reporting
This project showcases advanced SQL techniques through baseball operations data analysis. The queries demonstrate skills commonly required in banking and fintech analytics roles: identifying top performers, calculating running totals, tracking tenure, and analysing trends over time.
- Identifying top 20% of teams by average annual spending (NTILE)
- Cumulative spending over time (running totals)
- First year each team exceeded $1 billion total investment
- Career length and tenure calculations
- Starting vs ending team identification (FIRST_VALUE, LAST_VALUE)
- Long-term loyalty patterns (10+ year same-team players)
- Top schools producing professional players
- Decade-over-decade trend analysis (LAG)
- Distribution analysis across categories
- Physical attribute trends over decades
These SQL patterns translate directly to financial services:
| Baseball Scenario | Banking Equivalent |
|---|---|
| Top 20% teams by spending | High-value customer identification |
| Cumulative salary spending | Running account balance / Cumulative AUM |
| First year spending exceeded $1B | Time-to-threshold analysis (first £1M deposit) |
| Player career tenure | Customer relationship length |
| Same team loyalty (10+ years) | Long-term customer retention analysis |
| Decade-over-decade trends | YoY / MoM performance reporting |
| Workforce composition | Portfolio composition / Customer segmentation |
-- Percentile ranking
NTILE(5) OVER (ORDER BY avg_spend DESC) AS spend_quintile
-- Running totals
SUM(total_spend) OVER (PARTITION BY teamID ORDER BY yearID) AS cumulative_spend
-- Period-over-period comparison
LAG(avg_weight) OVER (ORDER BY decade) AS previous_decade_weight
-- First/last value identification
FIRST_VALUE(teamID) OVER (PARTITION BY playerID ORDER BY yearID) AS first_team- Modular, readable query structure
- Multi-step transformations
- Reusable intermediate results
- Date manipulation and tenure calculations
- Conditional aggregation with CASE WHEN
- Multi-table joins
- Filtering with HAVING clauses
| File | Description |
|---|---|
baseball_operations_analytics.sql |
Complete SQL queries organised by business theme |
-- Queries are written for MySQL
-- Each section can be run independently
-- Comments explain the business logic and banking parallelsFinancial services analytics relies heavily on these exact patterns:
- Running totals → Account balances, cumulative deposits, AUM tracking
- NTILE/percentiles → Customer tiering, risk segmentation
- LAG/LEAD → Month-over-month changes, trend analysis
- FIRST_VALUE → Customer acquisition date, first transaction
- Tenure calculations → Customer lifetime, relationship length
Demonstrating production-ready SQL for business analytics