By Gabe Epstein
As part of my Introduction to Data Science course, we are 'generating a tutorial that will walk users through the entire data science pipeline: data curation, parsing, and management; exploratory data analysis; model building as either hypothesis testing and/or machine learning; and then the curation of a message or messages covering insights learned during the tutorial.' This is the repository for my final portfolio! More information about the project and its motivations can be found here: Data Science Final Tutorial.
For my project, I will be analyzing datasets containing NBA statistics from many seasons. Through extract-transform-load (ETL), exploratory data analysis (EDA), and model building, I hope to predict who will be the NBA's Most Improved Player (MIP), and perhaps determine which feature or set of features carries the most weight in deciding the MIP. I will be doing my coding in Python on Google Colaboratory and uploading it here on GitHub. The tools I will use include pandas, NumPy, SQL, and Seaborn, among others.
All of the files for my project can be found in this repository, including the datasets, the Jupyter Notebook (.ipynb), and any other relevant files I use along the way. I hope you enjoy!
The project website, hosted on GitHub Pages, can be found here: Final Portfolio Website