This is a collection of notebooks related to baseball analysis. Some focus on techniques and others on interesting baseball questions. The first several notebooks deal with predicting on-base percentage (OBP). If there is one metric that has persisted from the post-Moneyball era into today's Statcast era, it's the value of OBP. The rest are ideas that came to mind and that I tried out. Some originate from interest in the baseball question, and some exist because I wanted to build or implement a specific model or technique.

NOTE: Some of these notebooks require very large Statcast files. The code is usually set to load the data from local files by default, so if there is an error at that step, change the corresponding flag to True and the data will be gathered and saved locally.
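The load-or-gather behavior described in the note can be sketched roughly as follows. This is a minimal illustration and not the notebooks' actual code: the function `load_or_gather`, the `gather_data` flag, the `fetch_rows` callable, and the CSV layout are all hypothetical names chosen for the example.

```python
import csv
from pathlib import Path


def load_or_gather(path, fetch_rows, gather_data=False):
    """Return rows from a local CSV, gathering and caching them first if asked.

    All names are hypothetical. `fetch_rows` stands in for whatever
    download step a notebook actually uses and must return a non-empty
    list of dicts that share the same keys.
    """
    path = Path(path)
    if gather_data:
        rows = fetch_rows()  # the slow step, e.g. a large Statcast pull
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0]))
            writer.writeheader()
            writer.writerows(rows)
    # With gather_data=False and no cached file, this raises
    # FileNotFoundError, which is the kind of error the note refers to.
    with path.open(newline="") as f:
        return list(csv.DictReader(f))
```

With the flag left at `False` the function only reads the cached file, so the expensive download happens once. Values come back as strings because they are read from CSV, so a real notebook would still need to convert column types.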

| Notebook | Description |
| --- | --- |
| 01_PredictingOBP-ML.ipynb | Predicting end-of-season OBP given early-season data. Focuses on regression and simple ML techniques. |
| 02_PredictingOBP-EmpericalBayes.ipynb | This adapts code from my Empirical Bayes repo to estimate OBP using the same technique used for batting average. Does a shrunken estimate that accounts for plate appearances approximately estimate end-of-season OBP? |
| 03_PredictingOBP-ARIMA-Forecasting-basic.ipynb | Instead of a classic train/test split using ML, we use a running OBP to try to forecast out to the end of the season. This notebook only considers using past OBP to predict future OBP, without any exogenous variables. |
| 04_PredictingOBP-ARIMA-Forecasting-addExog.ipynb | Similar to the previous notebook, but we introduce exogenous variables to facilitate the forecasting. |
| 05_OBP_to_SLG.ipynb | OPS is a common metric, but it is often criticized because the denominators of on-base % and slugging % are different, making the addition mathematically... eh. My question is, what is the relationship between the two? How much is 1 point of OBP worth compared to 1 point of slugging %? |
| 06_TheBook-Chapter1.ipynb | This notebook replicates some of the tables in Chapter 1 of "The Book" by Tom Tango et al. The data were mined from Baseball Savant and queried using Pandas to make the RE24 table, compute wOBA, and build other tables. |
| 07_CricketData.ipynb | A notebook that plays with some cricket data. |
| 08_EstimatingTrueExitVelocity.ipynb | Given some noisy data, how can we estimate a player's true average exit velocity, especially when the number of plate appearances varies significantly across players? We use linear models, empirical Bayes, and a hierarchical Bayesian model to address this. |
| 09_PredictingSwingAndMiss.ipynb | This is a simple notebook to see if we can predict a swing and miss from the data. In all honesty, it's not the greatest question to address, but it is still interesting. A more interesting question might involve a continuous variable, like exit velocity, or something more fundamental like the "error", which may be a combination of several parameters. For now it is just an excuse to build a multinomial logistic regression model, use a neural network, and try to implement the same multinomial logistic regression model in a Bayesian framework. |
| 10_PredictingBallInPlay.ipynb | This is similar to the previous notebook, but instead of predicting a swing and miss, we are interested in knowing whether the ball is put in play. This is also a bit naive, but it makes for a straightforward problem. It is also an excuse to build a logistic regression model in PyMC and to try to build a BART model for the same problem. |
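The "shrunken estimate that accounts for plate appearances" mentioned for the empirical Bayes notebooks can be illustrated with a small beta-binomial sketch. This is not the code from the notebooks. It assumes each plate appearance is treated as a success/failure trial (an approximation, since the official OBP denominator excludes some events), uses a simple method-of-moments fit for the prior, and all function names and numbers are made up for illustration.

```python
from statistics import mean, pvariance


def fit_beta_prior(rates):
    """Method-of-moments Beta(alpha, beta) fit to a list of observed rates.

    Assumes the rates come from players with enough plate appearances
    that their spread is a reasonable proxy for the spread in true talent.
    """
    m = mean(rates)
    v = pvariance(rates)
    common = m * (1 - m) / v - 1
    return m * common, (1 - m) * common


def shrink_obp(times_on_base, plate_appearances, alpha, beta):
    """Posterior-mean OBP under a Beta(alpha, beta) prior."""
    return (times_on_base + alpha) / (plate_appearances + alpha + beta)


# Illustrative prior only: mean 0.320, worth about 100 plate appearances.
alpha, beta = 32.0, 68.0

# A hot start over 10 PA is pulled strongly toward the prior mean...
print(f"{shrink_obp(5, 10, alpha, beta):.3f}")     # 0.336 (raw rate 0.500)
# ...while 500 PA of evidence moves much less.
print(f"{shrink_obp(200, 500, alpha, beta):.3f}")  # 0.387 (raw rate 0.400)
```

The sum `alpha + beta` acts like a count of "prior plate appearances", which is why players with few real plate appearances are shrunk the most.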


jabrantley/Baseball_Notebooks
