This is not a formal project but a personal initiative to apply what I recently learned in a data science course and to explore new concepts. Along the way, I gained hands-on experience with Jupyter notebooks, Python libraries, and data visualization techniques.
- This notebook contains Exploratory Data Analysis (EDA) of the IPL 2023 season, analyzing team performances, player statistics, and match insights.
- The notebook uses Pandas and NumPy for data manipulation, and Matplotlib and Seaborn for visualization.
- The analysis includes team-wise and player-wise insights, match trends, scoring patterns, and more.
✔️ How to work with Jupyter Notebooks
✔️ Using Pandas for data manipulation
✔️ Creating visualizations using Matplotlib and Seaborn
✔️ Understanding different data types and structures
✔️ Generating match insights through EDA
- This is not a professional analysis and may contain inaccuracies.
- Two matches are missing from the dataset, so some insights may be incomplete or incorrect.
- The data used here may not be 100% reliable, and the purpose was learning and experimentation rather than drawing final conclusions.
The dataset used in this analysis was sourced from Kaggle.
🔗 Original Dataset Link: https://www.kaggle.com/datasets/sahiltailor/ipl-2024-ball-by-ball-dataset?select=ipl_2023_deliveries.csv
Before performing any analysis, the dataset was cleaned and preprocessed to ensure a smoother workflow:
✔️ Handling Missing Values – Removed unnecessary columns and dealt with missing or inconsistent data.
✔️ Filtering Relevant Data – Extracted key match details such as batting stats, wickets, extras, and over-wise progression.
✔️ Standardizing Team & Player Names – Ensured uniformity in naming conventions for teams and players.
✔️ Derived Columns – Created additional metrics for cumulative runs, strike rates, and match progression analysis.
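The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`match_no`, `innings`, `batting_team`, `striker`, `runs_of_bat`, `extras`, `wide`, `fielder`) are assumptions about the deliveries file, and the rows are made-up values used only to exercise the functions.

```python
import pandas as pd

# Tiny made-up stand-in for the deliveries file; column names are assumed.
raw = pd.DataFrame({
    "match_no": [1, 1, 1, 1, 1, 1],
    "innings": [1, 1, 1, 1, 2, 2],
    "batting_team": ["CSK", "CSK ", "csk", "CSK", "GT", "GT"],
    "striker": ["Gaikwad", "Gaikwad", "Conway", "Gaikwad", "Gill", "Gill"],
    "runs_of_bat": [4, 0, 1, 6, 2, None],
    "extras": [0, 1, 0, 0, 0, 0],
    "wide": [0, 1, 0, 0, 0, 0],
    "fielder": [None] * 6,
})


def preprocess(df):
    """Clean the deliveries table and add derived columns."""
    out = df.copy()
    # Drop columns that carry no values at all.
    out = out.dropna(axis=1, how="all")
    # Treat a missing runs value as zero runs off the bat.
    out["runs_of_bat"] = out["runs_of_bat"].fillna(0).astype(int)
    # Standardize team names: trim whitespace, unify case.
    out["batting_team"] = out["batting_team"].str.strip().str.upper()
    # Derived columns: runs per delivery and running total per innings.
    out["total_runs"] = out["runs_of_bat"] + out["extras"]
    out["cumulative_runs"] = (
        out.groupby(["match_no", "innings"])["total_runs"].cumsum()
    )
    return out


def strike_rates(df):
    """Runs per 100 balls faced, per batter (wides are not balls faced)."""
    legal = df[df["wide"] == 0]
    stats = legal.groupby("striker")["runs_of_bat"].agg(runs="sum", balls="count")
    stats["strike_rate"] = (100 * stats["runs"] / stats["balls"]).round(2)
    return stats


clean = preprocess(raw)
rates = strike_rates(clean)
```

Keeping the cleaning in one function makes it easy to rerun on the full CSV once the real column names are confirmed.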
🔹 Number of Matches per Season
🔹 Matches Played at Each Venue
🔹 Matches Played by Each Team
🔹 Team vs Team Runs Comparison (Heatmap)
🔹 Top Run Scorers & Top Wicket Takers
🔹 Most Sixes and Fours by Batsmen
🔹 Top Individual Match Scores
🔹 Total Runs Scored by Each Team
🔹 Wicket Types Distribution (Pie Chart)
🔹 Match 13 Worm Graph (Rinku Singh’s Last Over Heroics)
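A worm graph like the one listed above plots cumulative runs against overs for each innings. The sketch below shows one way to build it, assuming hypothetical columns `match_no`, `innings`, `over` (encoded as `0.1`, `0.2`, ... so the integer part is the number of completed overs), `runs_of_bat`, and `extras`. The sample rows are invented for illustration and are not taken from any real match.

```python
import pandas as pd

# Made-up deliveries for a single match; column names are assumed.
deliveries = pd.DataFrame({
    "match_no": [1] * 7,
    "innings": [1, 1, 1, 1, 2, 2, 2],
    "over": [0.1, 0.2, 1.1, 1.2, 0.1, 1.1, 1.2],
    "runs_of_bat": [4, 1, 0, 6, 2, 6, 6],
    "extras": [0, 0, 1, 0, 0, 0, 0],
})


def worm_data(df, match_no):
    """Cumulative runs at the end of each over, one column per innings."""
    match = df[df["match_no"] == match_no].copy()
    match["total_runs"] = match["runs_of_bat"] + match["extras"]
    # Integer part of `over` is the completed-over count; +1 gives the over number.
    match["over_no"] = match["over"].astype(int) + 1
    per_over = match.groupby(["innings", "over_no"])["total_runs"].sum()
    return per_over.groupby(level="innings").cumsum().unstack("innings")


def plot_worm(worm):
    """Draw one line per innings; returns the Matplotlib axes."""
    # Imported here so the data step above runs without Matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for innings in worm.columns:
        ax.plot(worm.index, worm[innings], marker="o", label=f"Innings {innings}")
    ax.set_xlabel("Over")
    ax.set_ylabel("Cumulative runs")
    ax.set_title("Worm graph")
    ax.legend()
    return ax


worm = worm_data(deliveries, match_no=1)
```

Calling `plot_worm(worm)` in a notebook cell renders the chart. Splitting the aggregation from the plotting keeps the numbers easy to inspect before drawing anything.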
The goal of this exploration was not to build a perfect dataset but to apply new learnings and understand the workflow of a data analysis project. I now have a better understanding of Python, data visualization, and working with real-world sports datasets.