A visual example of the concepts of under and overfitting in supervised machine learning using U.S. state border data provided by the U.S. Census Bureau.
This repository contains a Jupyter notebook and a D3 visualization that showcase visual examples of underfitting and overfitting. In the notebook, data is processed, models are fit, and decision surfaces are visualized, as shown below.
The D3 visualization can be used to explore decision surfaces generated in the notebook interactively. On the left are toggles for different views, and on the right selections for models and their respective decision surfaces.
This repository and its contents serve as a learning exercise on how underfitting and overfitting affect a model's ability to generalize to new data. U.S. states make a useful example because their geographic shapes are familiar to many.
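The idea can be sketched without the notebook or the Census data. The following is a minimal illustration, not the notebook's code: it assumes a hypothetical two-"state" map split by a straight border at x = 0.5, adds label noise to the training points, and uses a hand-written k-nearest-neighbors classifier. All names here (true_region, make_points, knn_predict) are invented for the example. A very small k memorizes the noisy training points (overfitting), while a k equal to the training set size ignores location and votes over every point (underfitting).

```python
import random
from collections import Counter

def true_region(x, y):
    """Hypothetical ground truth: two 'states' split by a border at x = 0.5."""
    return "west" if x < 0.5 else "east"

def make_points(n, rng, noise=0.0):
    """Sample n points in the unit square; flip a fraction of labels as noise."""
    points = []
    for _ in range(n):
        x, y = rng.random(), rng.random()
        label = true_region(x, y)
        if rng.random() < noise:
            label = "east" if label == "west" else "west"
        points.append(((x, y), label))
    return points

def knn_predict(train, point, k):
    """Majority vote among the k training points nearest to `point`."""
    def sq_dist(item):
        (x, y), _ = item
        return (x - point[0]) ** 2 + (y - point[1]) ** 2
    nearest = sorted(train, key=sq_dist)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, data, k):
    """Fraction of points in `data` whose predicted label matches."""
    hits = sum(knn_predict(train, p, k) == label for p, label in data)
    return hits / len(data)

rng = random.Random(0)
train = make_points(200, rng, noise=0.15)  # noisy training labels
test = make_points(500, rng)               # clean labels, to measure generalization

results = {}
for name, k in [("overfit", 1), ("balanced", 15), ("underfit", len(train))]:
    results[name] = (accuracy(train, train, k), accuracy(train, test, k))
    train_acc, test_acc = results[name]
    print(f"{name:>8} (k={k:3d}): train={train_acc:.2f}  test={test_acc:.2f}")
```

With k = 1, each training point is its own nearest neighbor, so training accuracy is perfect even though the labels are noisy; the gap between training and test accuracy is the signature of overfitting. The choice of k = 15 for the middle model is arbitrary and only meant to sit between the two extremes.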
From this repository's root directory, do the following to start the notebook:
cd notebooks
jupyter notebook
Click Under and Overfitting by Example - 48 U.S. Contiguous States.ipynb.
To start the D3 visualization, the following will work using Python 3:
cd visualization
python3 -m http.server
Go to localhost:8000 in your browser. The initial load may take a few seconds while the data is processed; a map will appear when ready.
Use the toggle switches in the top left to alter views and the selections in the top right to select between models.
The D3 visualization can be slow to render, depending on the browser and available resources. A thorough refactor would likely address this.
This project is licensed under the MIT license.