Going Dutch on the GPU
Boilerplate for fractional GPU sharing on Kubernetes
By: Alexander Comerford (alexanderjcomerford@gmail.com)
This repo gets you up and running with containerized workloads that use fractional GPU sharing. In a few simple steps you can set up your own GPU-enabled Kubernetes cluster on your machine.
More and more data scientists, machine learning engineers, and developers are containerizing their projects and using GPU hardware to accelerate them. With these new capabilities it's easy to forget the importance of making the most of your hardware. This project is an ode to effective resource utilization and a message to programmers to consider the resource allocations their software needs.
This repo assumes that you have libvirt and vagrant installed, and have your machine configured to do PCIe passthrough (check references for more).
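Before starting, it can help to confirm that the required commands are on the PATH. The sketch below is illustrative and not part of this repo: the function name missing_tools is hypothetical, and virsh is used only as a stand-in for "libvirt is installed", since it is the command-line client that ships with libvirt.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repo): print every named command
# that cannot be found on the PATH, one per line. No output means all were found.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage on the host:
#   missing_tools vagrant virsh lspci
```

This only checks that the tools exist. It says nothing about whether PCIe passthrough is configured, which the references cover.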
This repo is run in a step-wise fashion. Every step/command has an accompanying verification step to ensure it ran successfully.
This first step uses lspci to parse device data into an environment file
$ ./bin/00_host_get_gpu_devices.sh
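To give a rough idea of what parsing lspci output can look like, here is a minimal sketch. It is not the contents of bin/00_host_get_gpu_devices.sh. The function name gpu_addresses is hypothetical, and it assumes NVIDIA GPUs that lspci labels as "VGA compatible controller" or "3D controller".

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT the repo's actual script.
# Reads `lspci -nn` style lines on stdin and prints the PCI address
# (first field) of every NVIDIA display controller.
# Assumption: only NVIDIA devices are of interest.
gpu_addresses() {
    awk 'tolower($0) ~ /nvidia/ && /VGA compatible controller|3D controller/ { print $1 }'
}

# Usage on the host:
#   lspci -nn | gpu_addresses
```

The audio function that usually sits next to a GPU (for example 01:00.1) is deliberately skipped here. Whether the real script includes it, and how PCI_GPUS.env is laid out, is not something this sketch tries to reproduce.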
You can verify that devices have been found by printing the file PCI_GPUS.env with cat
$ cat ./PCI_GPUS.env
After checking you have GPU(s) available, some vagrant plugins will need to be installed to talk with libvirt and do some extra networking
$ ./bin/10_host_setup_vagrant.sh
Once the GPU(s) have been saved and plugins installed, bring up the vagrant machine and run the nvidia-docker + cuda setup script.
$ source PCI_GPUS.env && vagrant up
$ vagrant ssh -c "./bin/20_guest_setup_cuda_docker.sh"
To test that the nvidia-toolkit is successfully installed, run nvidia-smi in a docker container
$ vagrant ssh -c "./bin/21_guest_test_docker_runtime.sh"
You should see the nvidia-smi table with your GPU(s) listed
Next, arkade is needed to install some Kubernetes CLI utilities
$ vagrant ssh -c "./bin/32_guest_install_arkade.sh"
Here are some extra tools if you feel inclined (optional)
$ vagrant ssh -c "./bin/31_guest_install_golang.sh"
$ vagrant ssh -c "./bin/33_guest_install_docker_compose.sh"
$ vagrant ssh -c "./bin/34_guest_install_ctop.sh"
Once the tools have been installed inside the VM, k3s can be spun up inside the VM. This script launches k3s with the DevicePlugins feature gate so we can interact with the GPU from k3s.
$ vagrant ssh -c "./bin/40_guest_setup_k3s_cluster.sh"
To verify that the cluster is up and running, check the status of all the pods for the cluster
$ vagrant ssh -c "kubectl get pods -A"
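Reading the pod table by eye works, but a small filter can make problems stand out. The helper below is a hypothetical sketch (the name unhealthy_pods is not part of this repo). It assumes the default column layout of kubectl get pods -A, where STATUS is the fourth column.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get pods -A --no-headers` output on stdin
# and prints namespace/name for every pod whose STATUS column is neither
# Running nor Completed. It does not inspect the READY column, so a pod
# that is Running but not yet ready is not reported.
unhealthy_pods() {
    awk 'NF >= 4 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2 }'
}

# Usage inside the VM:
#   kubectl get pods -A --no-headers | unhealthy_pods
```

No output means every pod reported Running or Completed at the time of the call. Pods that are still being created will show up until they finish starting.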
After ensuring that pods are being created, deploy a local registry with docker so images that are built locally can be accessed within the k3s cluster.
$ vagrant ssh -c "./bin/41_guest_setup_docker_registry.sh"
Once the registry is deployed, run this script to push some base images to the registry so they can be easily accessed throughout the cluster when running the samples
$ vagrant ssh -c "./bin/42_guest_push_base_images_to_registry.sh"
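Pushing an image to a local registry generally comes down to a pull, a retag, and a push. The sketch below only prints those commands so they can be reviewed first. It is not what bin/42_guest_push_base_images_to_registry.sh does: the function name mirror_commands is hypothetical, and both the registry address localhost:5000 and the image ubuntu:20.04 are placeholder assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print the docker commands that would copy IMAGE
# into REGISTRY. Nothing is executed; the output is meant to be reviewed.
# Arguments: $1 = registry host:port (assumed), $2 = image reference.
mirror_commands() {
    local registry="$1" image="$2"
    printf 'docker pull %s\n' "$image"
    printf 'docker tag %s %s/%s\n' "$image" "$registry" "$image"
    printf 'docker push %s/%s\n' "$registry" "$image"
}

# Usage inside the VM (placeholder values):
#   mirror_commands localhost:5000 ubuntu:20.04
```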
gpu-manager is an amazing project that will let us create virtual GPUs that can be assigned to our containers (if you want to learn more about gpu-manager, check out the references). This script builds a fresh image of gpu-manager
$ vagrant ssh -c "./bin/50_guest_setup_gpu_manager.sh"
Verify that gpu-manager has been installed correctly by running a pod with fractional resources
$ vagrant ssh -c "./bin/51_guest_test_gpu_manager.sh"
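For a sense of what "a pod with fractional resources" can look like, here is a sketch that prints a pod manifest. It rests on an assumption: that gpu-manager here is the TKEStack project, which, as far as its README describes, exposes the extended resources tencent.com/vcuda-core (100 units per full GPU) and tencent.com/vcuda-memory (256 MiB per unit). The function name fractional_pod_manifest and all argument values are hypothetical, and this is not the manifest used by bin/51_guest_test_gpu_manager.sh.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print a pod manifest that requests a slice of a GPU.
# ASSUMPTION: the resource names below are the ones advertised by the
# TKEStack gpu-manager device plugin (vcuda-core: 100 = one GPU,
# vcuda-memory: 1 unit = 256 MiB). Adjust if your plugin differs.
# Arguments: $1 = pod name, $2 = image, $3 = core units, $4 = memory units.
fractional_pod_manifest() {
    local name="$1" image="$2" cores="$3" memory="$4"
    cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${name}
spec:
  restartPolicy: Never
  containers:
    - name: ${name}
      image: ${image}
      resources:
        limits:
          tencent.com/vcuda-core: "${cores}"
          tencent.com/vcuda-memory: "${memory}"
EOF
}

# Usage inside the VM (placeholder image name):
#   fractional_pod_manifest half-gpu <your-image> 50 8 | kubectl apply -f -
```

Under the stated assumptions, 50 core units and 8 memory units would ask for half of one GPU's compute and 2 GiB of its memory.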
If you managed to run all these steps, congratulations! You successfully created a Kubernetes cluster with fractional GPU sharing!
To demonstrate some of the GPU sharing capabilities, build and then run the sample pods under yml/samples/
$ vagrant ssh -c "./bin/60_guest_build_sample_images.sh"
Run each sample with a short pause in between so the scheduler has time to resync
$ vagrant ssh
# Inside the VM
for f in $(find yml/samples -type f)
do
    kubectl apply -f "$f"
    sleep 1
done
You can view the memory usage of each process with nvidia-smi
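If you want a single number instead of the full table, nvidia-smi can emit per-process memory as CSV, which is easy to total. The helper below is a hypothetical sketch (total_used_mib is not part of this repo) and assumes the three-column query shown in its comment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum the third CSV column (used GPU memory in MiB)
# from the output of:
#   nvidia-smi --query-compute-apps=pid,process_name,used_memory \
#              --format=csv,noheader,nounits
# Assumption: process names do not themselves contain commas.
total_used_mib() {
    awk -F', *' '{ sum += $3 } END { print sum + 0 }'
}

# Usage inside the VM:
#   nvidia-smi --query-compute-apps=pid,process_name,used_memory \
#              --format=csv,noheader,nounits | total_used_mib
```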
- Libvirt and PCIe passthrough
  - GitHub tutorial: https://github.com/bryansteiner/gpu-passthrough-tutorial
  - GrayWolfTech video tutorial: https://www.youtube.com/watch?v=dsDUtzMkxFk
  - Chris Titus Tech video tutorial: https://www.youtube.com/watch?v=3yhwJxWSqXI
- gpu-manager