Skip to content

Latest commit

 

History

History
267 lines (184 loc) · 7.71 KB

README.md

File metadata and controls

267 lines (184 loc) · 7.71 KB

HaxML

On-demand machine learning predictions for HaxClass data.

Goal

Create and deploy machine learning models to predict “expected goals” (XG) in the online game HaxBall. XG is the probability of a given kick resulting in a goal and can be used to create higher-order metrics, for both offense and defense.

Project Structure

This repository contains all the tools we need to build machine learning models for HaxBall:

  • Analysis: Exploring data to find insights, creating offline reports.
  • Modeling: Engineering features, training and evaluating classifiers.
  • Serving: Deploying models to make predictions and explanations on-demand.
haxml
├── data/               Data for analysis and modeling (not committed).
├── haxml/              Python modules for analysis, modeling, and serving.
|   ├── evaluation.py
|   ├── prediction.py
|   ├── utils.py
|   └── viz.py
├── models/             Saved classifiers for use in modeling and serving.
├── notebooks/          Jupyter notebooks for analysis and modeling.
├── scripts/            Scripts to download and prepare data.
└── server/             Flask app for serving predictions on-demand.

Data

Data is collected through a scraper built with the HaxBall headless API, from a hosted HaxBall room and stored in a Firebase real-time database. Data is stored in a "packed" format and can be "inflated" using the haxml.utils.inflate_match(packed) method.

Refer to the HaxClass repository for the schema of the inflated match data.

Players' gameplay data and usernames are collected, but not their chat messages. HaxML strips user names from the data when downloading from the database, but on-demand predictions retain the user names.

Visit the HaxClass Hub to browse data from recent HaxBall matches in our hosted room and view XG time plots from the current production model.

Helpful Commands

First Time

You may need to install git, Python, pip, virtualenv, and make to run these commands. We will use virtualenv to manage a virtual environment and use local dependencies instead of global dependencies. We will use make to download our data following reproducible steps.

# Clone the repository.
git clone https://github.com/vingkan/haxml.git
# Enter the folder.
cd haxml
# Create a virtual environment called venv.
virtualenv venv
# Activate the virtual environment.
source venv/bin/activate
# Install project dependencies.
pip install -r requirements.txt
# Make an .env file for environment variables.
touch .env
# Pause to ask Vinesh for the credentials.
# Edit the .env file and add the credentials.
# Download the data.
make
# Start a Jupyter notebook server.
jupyter notebook
# Enter Ctrl+C to shut it down.
# Start the Flask app.
python server/__init__.py
# Enter Ctrl+C to shut it down.

Start Each Session

Activate your virtual environment.

source venv/bin/activate

During Each Session

Start a local Jupyter notebook server.

jupyter notebook

Run the prediction server.

python server/__init__.py

Install the latest dependencies, if requirements.txt changed since your last session.

pip install -r requirements.txt

Update dependencies if you added a new dependency that others should have.

pip freeze > requirements.txt

End Each Session

deactivate

Getting Data Access

To download data from our database, you will need our read-only Firebase credentials. Ask Vinesh for the credentials and then add them to your .env file. This file is excluded in .gitignore to avoid accidentally committing them.

To create an .env file, run:

touch .env

And then edit it, to add the credentials:

PORT=5000
firebase_apiKey=YOUR_SECRET
firebase_authDomain=YOUR_SECRET
firebase_databaseURL=YOUR_SECRET
firebase_projectId=YOUR_SECRET
firebase_storageBucket=YOUR_SECRET
firebase_messagingSenderId=YOUR_SECRET
firebase_appId=YOUR_SECRET

Getting Data

Makefile defines how the data files in this project are created. To create the data yourself, run:

make

This will trigger all the necessary commands and show you progress. Once data files are made, running make will not overwrite them. To build the data from scratch, run:

make clean
make

Testing Server

When you make a change to the server, you may want to manually test that the API routes work.

You can set a URL parameter on the HaxClass frontend that will make it use your local server instead of the production server.

First, start the prediction server.

python server/__init__.py

Then, open one of the two pages listed below, with the localml=true parameter added to the URL.

To verify that the frontend is making requests to your local server:

  • You should see logs for the requests in your terminal running the local server.
  • You can also open the browser developer console (right-click > inspect element > console tab).
    • There should be a message saying Fetching predictions from local server. in the console.

Option A: Leaderboard Page

Shows stats and XG for top players in the room based on recent matches, only uses the default model.

https://vingkan.github.io/haxclass/hub/leaderboard.html?localml=true

Click the button labeled Show Extended Stats above the "Ranked Players" table and it will start sending requests to your local server.

Option B: Match Page

Shows stats and XG for a given match, allows the user to switch to any model in the model configs.

https://vingkan.github.io/haxclass/hub/xg.html?m=-MQsAFNKGdFPM9tTfFgv&localml=true

To change models, type a new model ID in the Model input and hit enter.

Using Git

Ask Vinesh to be added as a collaborator to the repository before trying to commit your work.

Fetch the latest version of the repository and pull it into your version. Do this when you have no uncommitted changes.

git fetch origin
git pull

Create your own branch and push it to the "remote" repository (called origin, by convention).

git checkout -b your_branch_name
git push -u origin your_branch_name

Switch between branches.

git checkout other_branch_name

Merge changes from main branch into your branch (commit your work first).

git checkout your_branch_name
git merge main

Check the status of files you have changed in this commit. Make sure you meant to change these files and that no files that you meant to ignore are included.

git status

Add all your changed files to the commit, create a message, and then push it.

git add -A
git commit -m "Your descriptive commit message."
git push

Deploying

The HaxML Flask app is currently hosted on Heroku, on the free tier. You can wake it up by going to:

https://haxml.herokuapp.com/hello

Heroku is set to automatically build and deploy when we commit to the main branch of this repository. We will work in our own branches, submit pull requests, and then merge into main to deploy.

Procfile defines how the Flask app will start with the web command. If you have Heroku installed locally, you can test it by running:

heroku local web

If you have access to the haxml project on Heroku, you can check the Flask app logs with this command:

heroku logs --tail -a=haxml

And you can check usage of dyno hours on the free tier with this command:

heroku ps -a=haxml

We also have a Digital Ocean Droplet where we can run the HaxClass data collection system and deploy the HaxML Flask app, albeit without an SSL-secured domain. The IP address is:

104.236.21.173