NBA Predicter

This project contains a data pipeline that collects NBA statistics and uses box score data to predict whether a given team will win or lose a given game. It also compares several machine learning classifiers to show which predict game outcomes most accurately.

Introduction

The long-term goal of this project is to use all relevant available data and tune the best performing machine learning algorithms to identify an optimal method for predicting the outcome of future basketball games.

The current version of this package uses the box score data of a specific team (and not their opponent) to predict the win/loss outcome of a game that has already been played. Box score data for a given game is of course not available before that game is played, so this setup cannot predict future games directly. The purpose of this modeling step is to:

identify the best candidate features for future methods,
identify the best potential machine learning algorithms for outcome prediction,
offer a comprehensive data pipeline which is easy to use, modify, and update.

Data Visualization and Accuracy Results

The following heatmap and bar graph help us identify candidate features (NBA stats) to use in the classification phase of the pipeline. For instance, the bar graph shows that game outcome has a high positive correlation with made_field_goals and field_goal_percentage, and a high negative correlation with personal_fouls, suggesting that these features should be used in modeling. (See the NBA Predicter Jupyter notebook to generate these plots.)
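The feature screening described above can be sketched in a few lines of pandas. This is only an illustration: the six rows are made-up numbers, and the outcome column name `win` is an assumption rather than the name used in the notebook.

```python
import pandas as pd

# Hypothetical toy box scores; the real pipeline works on scraped data.
# The "win" column name is an assumption made for this sketch.
games = pd.DataFrame({
    "made_field_goals": [42, 35, 40, 31, 45, 33],
    "field_goal_percentage": [0.51, 0.42, 0.48, 0.39, 0.53, 0.41],
    "personal_fouls": [18, 24, 20, 26, 17, 23],
    "win": [1, 0, 1, 0, 1, 0],
})

# Pearson correlation of every feature with the outcome column,
# sorted from most positive to most negative.
correlations = games.corr()["win"].drop("win").sort_values(ascending=False)
print(correlations)
```

Features at either end of the sorted list are the ones with the strongest linear relationship to the outcome, which is what the bar graph visualizes.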

Using a few of these features, we see that the following algorithms perform with the accuracies indicated.
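A comparison of this kind could be set up roughly as follows with scikit-learn. Everything in this sketch is an assumption made for illustration: the data is synthetic, the labels are generated by a made-up rule, and the three classifiers are not necessarily the ones evaluated in the notebook. The printed accuracies therefore say nothing about real NBA games.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for team box scores (not real NBA data).
rng = np.random.default_rng(0)
n_games = 200
made_field_goals = rng.integers(28, 50, size=n_games)
personal_fouls = rng.integers(12, 30, size=n_games)

# Made-up labeling rule: more field goals and fewer fouls favor a win.
score = (
    0.3 * (made_field_goals - 39)
    - 0.2 * (personal_fouls - 21)
    + rng.normal(0, 1, size=n_games)
)
y = (score > 0).astype(int)
X = pd.DataFrame({
    "made_field_goals": made_field_goals,
    "personal_fouls": personal_fouls,
})

# Candidate classifiers, chosen here only as examples.
models = {
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression()),
    "k_nearest_neighbors": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model.
accuracies = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}
for name, accuracy in sorted(accuracies.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {accuracy:.3f}")
```

Cross-validation is used here instead of a single train/test split so that each model's score is averaged over several held-out folds.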

Prerequisites

This project requires Python 3 and the following packages:

scikit-learn (imported as sklearn)
pandas
seaborn
basketball_reference_web_scraper

You can find the web scraper at https://github.com/jaebradley/basketball_reference_web_scraper.

Running Tests

To run the entire data pipeline on your local machine, follow the steps in the NBA Predicter Jupyter notebook (NBA Predicter.ipynb).

Future work

Add new data to the dataset: advanced statistics, number of games on the road, etc.
Use WEKA machine learning models

