fast.ai's Practical Deep Learning for Coders course notebooks repository, packaged into a Docker container.
Pre-loaded with Jupyter and all the required dependencies (installed in a conda
environment) for an all-in-one, automated, repeatable deployment with no setup required.
For those who lead a team, scale out by deploying the environment to multiple users at once via JupyterHub, hosted on your own Kubernetes cluster.
This is a standalone deployment that can be extended or used as-is for your own multi-user Jupyter workflows.
*See the Further Reading section for more details on the technologies mentioned above.
# Note: the `latest` tag is used here for expediency. When possible, you should
# pin your version by specifying an exact Docker image tag,
# e.g., `TAG=v20201007-7890c25`
TAG=latest
docker run -p 8888:8888 teozosa/jupyterhub-fastbook:${TAG}
Note: This will automatically pull the image from Docker Hub if it is not already present on your machine; the image is fairly large (~5 GB), so this may take a while.
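If you prefer to download the image ahead of time, you can pull it explicitly and confirm it is present before running it (standard Docker commands; the latest tag is shown purely for illustration):

# Pre-pull the image and verify it is available locally
docker pull teozosa/jupyterhub-fastbook:latest
docker images teozosa/jupyterhub-fastbook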
Follow the directions on-screen to log in to your local Jupyter notebook environment! 🎉
Note: the first URL may not work. If that happens, try the URL beginning with http://127.0.0.1
Important: When running the fast.ai notebooks, be sure to switch the notebook kernel to the fastbook environment.
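To double-check that the fastbook kernel is registered, you can list the available kernels from inside the running container (a quick sanity check; the container name is whatever docker ps reports for your instance):

# List the Jupyter kernels registered inside the running container
docker ps   # note the container name or ID
docker exec -it <container-name-or-id> jupyter kernelspec list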
Please see the unabridged Kubernetes deployment section for an in-depth explanation of the steps below.
From the root of your repository, on the command line, run:
# Generate and store secret token for later usage
echo "export PROXY_SECRET=$(openssl rand -hex 32)" > .env
# Install Helm
curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash
# Verify Helm
helm list
# Add JupyterHub Helm charts
helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update
# Deploy the JupyterHub service mesh onto your Kubernetes cluster
# using the secret token you generated in step 1.
# Note: the `latest` tag is used here for expediency. When possible, you should
# pin your version by specifying an exact Docker image tag,
# e.g., `TAG=v20201007-7890c25`
make deploy TAG=latest
kubectl --namespace jhub get all
JUPYTERHUB_IP=$(kubectl --namespace jhub get service proxy-public -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $JUPYTERHUB_IP
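If the external IP has not been assigned yet, the pods may still be starting up; you can watch their status with standard kubectl commands until everything is Running:

# Watch pod status in the jhub namespace until all pods are Running
kubectl --namespace jhub get pod --watch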
Type the IP from the previous step into your browser, log in*, and you should now be in the JupyterLab UI! 🎉
* JupyterHub is running with a default dummy authenticator, so any username and password combination will let you into the hub.
Important: When running the fast.ai notebooks, be sure to switch the notebook kernel to the fastbook environment.
- Immediately get started on the fast.ai Practical Deep Learning for Coders course without any extra setup via the JupyterHub-Fastbook Docker image[1]
- Deploy to your Kubernetes cluster[0] via the official Helm chart
  - [Optional] Using GitHub OAuth for user authentication
- Use the deployment as-is; you get a fully-featured JupyterHub deployment that just so happens to have fast.ai's Practical Deep Learning for Coders course dependencies pre-loaded.
- Extend the configuration and deployment system in this project for your particular needs.
- Build and push your own JupyterHub-Fastbook images to your own Docker registry.
[0] Tested with Microk8s on Ubuntu 18.04.4.
[1] Based on the official jupyter/minimal-notebook from Jupyter Docker Stacks. This means you get the same features as a default JupyterHub deployment with the added functionality of an isolated fastbook conda environment.
Use JupyterHub-Fastbook in conjunction with the fast.ai Practical Deep Learning for Coders course:
- To go through the course on your own with virtually no setup by running the JupyterHub-Fastbook Docker image locally
- As the basis for a study group
- To onboard new junior members of your organization's AI/ML team
Or anything else you can think of!
The purpose of this project is to reduce the initial technical barriers to entry for the fast.ai Practical Deep Learning for Coders course by automating the setup, configuration, and maintenance of a compatible programming environment, scaling that experience to both individuals and groups.
In the same spirit as the course: just as you don't need a PhD to build AI applications, you shouldn't need to be a DevOps expert to get started with the course.
We've done all the work for you. All you need to do is dive in and get started!
- When running the Docker image as a container in single-user mode, outside of Kubernetes, you will interact directly with the Jupyter Notebook interface (see: Quickstart: Running the Docker image locally).
- The JupyterHub Kubernetes deployment portion of this project is based on the official Zero to JupyterHub with Kubernetes guide and assumes you already have your own Kubernetes cluster set up. If not, and you are just starting out, Minikube is great for local development and Microk8s works well for single-node clusters (a minimal example follows below).
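As a concrete illustration, a minimal single-node setup with Microk8s might look like the sketch below. This is only one possible path, and add-on names and flags can vary between Microk8s versions, so treat it as a starting point rather than a prescription.

# Install Microk8s on Ubuntu via snap (channel/version flags omitted for brevity)
sudo snap install microk8s --classic
# Enable DNS and default storage so JupyterHub pods can resolve names and claim volumes
microk8s enable dns storage
# Export a kubeconfig so kubectl and helm can talk to the cluster
microk8s config > ~/.kube/config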
build         Build Docker container
config.yaml   Generate JupyterHub Helm chart configuration file
deploy        Deploy JupyterHub to your Kubernetes cluster
push          Push image to Docker Hub container registry
Tip: invoking make without any arguments will display auto-generated documentation similar to the above.
In addition to deployment, the makefile contains facilities to build and push Docker images to your own repository. Simply edit the appropriate fields in Makefile and invoke make with one of: build, push (see the example below).
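For example, after pointing the image-related variables in Makefile at your own registry, a typical build-and-publish cycle might look like this (target names taken from the help output above; the workflow itself is illustrative):

# Build the image, tagged according to the Makefile's versioning scheme
make build
# Push the freshly built image to the registry configured in Makefile
make push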
Enabling GitHub OAuth[2]
Determine your JupyterHub host address (the address you use in your browser to access JupyterHub) and add it to your .env file:
JUPYTERHUB_IP=$(kubectl --namespace jhub get service proxy-public -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "export JUPYTERHUB_IP=${JUPYTERHUB_IP}" >> .env
Follow this tutorial: GitHub documentation: Building OAuth Apps - Creating an OAuth App, then:
GITHUB_CLIENT_ID=$YOUR_GITHUB_CLIENT_ID
GITHUB_CLIENT_SECRET=$YOUR_GITHUB_CLIENT_SECRET
echo "export GITHUB_CLIENT_ID=${GITHUB_CLIENT_ID}" >> .env
echo "export GITHUB_CLIENT_SECRET=${GITHUB_CLIENT_SECRET}" >> .env
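When registering the OAuth App on GitHub, the authorization callback URL should point back at JupyterHub's OAuth endpoint, which is served under /hub/oauth_callback; using the IP you stored above, it looks roughly like the following (illustrative; adjust the scheme and host to match your deployment):

# Illustrative authorization callback URL for the GitHub OAuth App settings page
echo "http://${JUPYTERHUB_IP}/hub/oauth_callback"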
# Note: the `latest` tag is used here for expediency. When possible, you should
# pin your version by specifying an exact Docker image tag,
# e.g., `TAG=v20201007-7890c25`
make deploy TAG=latest
Now, the first time a user logs in to your JupyterHub instance, they will be greeted by GitHub's OAuth authorization screen.
Once they click "Authorize", they will automatically be authenticated via GitHub OAuth whenever they log in.
[2] see: JupyterHub documentation: Authenticating with OAuth2 - GitHub
source: JupyterHub documentation: Setting up JupyterHub
Note: commands in this section should be run on the command line from the root of your repository.
echo "export PROXY_SECRET=$(openssl rand -hex 32)" > .env
If you need to store these values in version control, consider using something like SOPS.
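For example, with SOPS installed and a key backend configured, you might encrypt the dotenv file instead of committing it in plaintext (a sketch; the exact flags depend on your SOPS version and setup):

# Encrypt the dotenv file before committing it
sops --encrypt --input-type dotenv --output-type dotenv .env > .env.enc
# Decrypt it again when you need the values locally
sops --decrypt --input-type dotenv --output-type dotenv .env.enc > .env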
source: JupyterHub documentation: Setting up Helm
- Download and install Helm:
curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash
- Verify installation and add JupyterHub Helm charts:
# Verify Helm
helm list
# Add JupyterHub Helm charts
helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update
source: JupyterHub documentation: Setting up JupyterHub
Generate a JupyterHub configuration file*
make config.yaml
This will create a config.yaml by populating the fields of config.TEMPLATE.yaml with the pre-set deployment variables† and the values specified in your .env file.
* Anything generated here will be overwritten by the following deployment step with the most recent values, but this step is included for the sake of completeness.
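A quick way to see exactly which fields were filled in is to diff the generated file against the template it came from (both filenames as referenced above):

# Show which template fields were populated with your deployment values
diff config.TEMPLATE.yaml config.yaml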
Once you've verified that config.yaml contains the correct information, on the command line, run:
# Note: the `latest` tag is used here for expediency. When possible, you should
# pin your version by specifying an exact Docker image tag,
# e.g., `TAG=v20201007-7890c25`
make deploy TAG=latest
This will deploy the JupyterHub instance to your cluster via the official Helm chart, parametrized by the pre-set deployment variables† and the config.yaml file you generated in the previous step.
† To override a pre-set deployment variable, simply edit the appropriate value in Makefile.
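Once the deploy target returns, you can confirm the release and its pods with standard Helm and kubectl commands (the jhub namespace matches the one used earlier; the release name is whatever the Makefile passes to Helm):

# List Helm releases in the jhub namespace
helm list --namespace jhub
# Confirm the hub and proxy pods are Running
kubectl --namespace jhub get pods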
The makefile defaults to strong versioning of image tags (derived from Google's Kubeflow Central Dashboard Makefile) for unambiguous container image provenance.
Unless you are pushing to and pulling from your own registry, you MUST override the generated tag with your desired tag when deploying to your own cluster.
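In practice, that simply means passing the tag you want at deploy time; for example, using the published tag already cited in this README:

# Pin the deployment to a specific published image tag
make deploy TAG=v20201007-7890c25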
- fast.ai: A non-profit research group focused on deep learning and artificial intelligence.
- fastai: The free, open-source software library from fast.ai that simplifies training fast and accurate neural nets using modern best practices.
- Practical Deep Learning for Coders: The creators of fastai show you how to train a model on a wide range of tasks using fastai and PyTorch. You’ll also dive progressively further into deep learning theory to gain a complete understanding of the algorithms behind the scenes.
- Jupyter Notebook: An open-source web application that allows you to create and share documents that contain live code, equations, visualizations, and narrative text.
- JupyterHub: A multi-user version of the notebook designed for companies, classrooms, and research labs.
- Anaconda (conda for short): A free and open-source distribution of the Python and R programming languages for scientific computing that aims to simplify package management and deployment.
- Docker: A set of platform-as-a-service products that use OS-level virtualization to deliver software in packages called containers.
  - A Docker container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries, and settings.
- Kubernetes: An open-source system for automating deployment, scaling, and management of containerized applications.
Neither I nor my employer are affiliated in any way with fast.ai, Project Jupyter, or any other organizations responsible for any of the technologies used in this project.