This is a work-in-progress Docker image for a talk demonstrating video processing with GPUs.
It starts from the NVIDIA PyTorch container, and then builds ffmpeg and OpenCV 4.0 from source with hardware acceleration.
Running it via nvidia-docker gives us hardware access to the GPU, and lets us keep our host operating system clean and independent.
This repo uses the latest release of Python 3 and PyTorch, adding hardware acceleration for the latest consumer NVIDIA GPU at the time of writing (the 2080 Ti). Please note: for GPU-accelerated encoding and decoding to work, your host machine needs a recent enough NVIDIA Linux driver, along with the NVIDIA hardware encode and decode libraries. The install command is listed below, but the package names may differ depending on the driver version you have installed.
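If you're not sure which driver version you're on, nvidia-smi can report it (these query flags are standard nvidia-smi options):
$ nvidia-smi --query-gpu=driver_version --format=csv,noheader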
As for building the image, right now there are just two things to be aware of. In our make invocations, we're doing -j4, matching the 4 CPUs on my dev machine. You may want to change this to match the core count of your own machine.
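If you'd rather not hardcode the count, one option (assuming you're editing the make lines in the Dockerfile) is to let the shell fill it in with the standard nproc utility:
$ make -j$(nproc)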
Besides that, this takes a while to build. So uh, grab a cup of coffee or two...
$ sudo apt-get install libnvidia-decode-390 libnvidia-encode-390
$ docker build -t ffmpegpytorch .
$ nvidia-docker run -it ffmpegpytorch /bin/bash
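A quick way to confirm the container actually sees the GPU is to run nvidia-smi inside it (this assumes your host driver installed correctly):
$ nvidia-smi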
Once in the container, we can run Python and check that our OpenCV build works:
$ python3
Python 3.6.7 |Anaconda, Inc.| (default, Oct 23 2018, 19:16:44)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> inputVideo = cv2.VideoCapture('video.MOV')
>>> ok, frame = inputVideo.read()
>>> frame.shape
(1080, 1920, 3)
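As a slightly bigger sanity check, here's a minimal sketch that counts the frames in a clip by reading until the capture is exhausted (video.MOV is just the sample file from above):
import cv2

cap = cv2.VideoCapture('video.MOV')  # same sample file as above
count = 0
while True:
    ok, frame = cap.read()  # ok goes False once we run out of frames
    if not ok:
        break
    count += 1
cap.release()
print('decoded %d frames' % count)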
Remember, you can also mount directories and open up ports if you want to run a Jupyter notebook, for example:
$ nvidia-docker run -p 8888:8888/tcp -it -v localdir:/workspace/localdir_in_container ffmpegpytorch /bin/bash
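With the port mapped, you can start Jupyter inside the container and reach it from the host browser (this assumes Jupyter is installed in the image; --allow-root is needed because the container runs as root):
$ jupyter notebook --ip=0.0.0.0 --port=8888 --allow-root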
I've added the start of an inference API. The way you call it is:
$ curl --header "Content-Type: application/json" --request POST --data '{"filename": "/downloads/cuckoo.mp4", "postback_url": "http://10.152.183.141:5005/video-inference"}' http://10.152.183.139:5007/video-inference
The postback_url will receive a POST with the results of the inference on the video.
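If you want to watch the results arrive, a minimal sketch of a postback receiver might look like this (a hypothetical Flask app, not part of the repo; the route and port match the curl example above):
from flask import Flask, request

app = Flask(__name__)

@app.route('/video-inference', methods=['POST'])
def video_inference():
    # The inference API POSTs its results back to us as JSON.
    results = request.get_json()
    print(results)
    return '', 204

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5005)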
There are now two new Python files that can be used to create snippets of a video featuring a specific feature. For now, I've focused on clocks. Copy the splitter.py and joiner.py files into the mounted volume that stores all your videos. Change into that directory in the container, and first run splitter.py. It will create a bunch of video snippets under the slices/ directory. From there, you can run joiner.py to hit the scraper service and grab all the unique videos for remixing.
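Assuming the scripts take no arguments (check their source for any paths they expect), the sequence looks like:
$ python3 splitter.py
$ python3 joiner.py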
The joiner.py file will create a videolist.txt. You can send that to ffmpeg's concat demuxer and generate a new video with the following ffmpeg command:
$ ffmpeg -f concat -i videolist.txt -c copy out.mp4
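For reference, videolist.txt is in the concat demuxer's file-list format; with hypothetical slice names it would look something like:
file 'slices/slice_000.mp4'
file 'slices/slice_001.mp4'
file 'slices/slice_002.mp4'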
You'll need nodes with GPUs available in your cluster, and then add a label to them so the NVIDIA DCGM exporter runs as a DaemonSet on those nodes, extracting GPU information.
In my case, I'm running microk8s locally, and my attached machine has a GPU available in k8s:
$ microk8s.kubectl get nodes
NAME STATUS ROLES AGE VERSION
stankley Ready <none> 48d v1.13.
$ microk8s.kubectl label nodes stankley hardware-type=NVIDIAGPU
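You can confirm the label stuck by filtering on it with a selector query:
$ microk8s.kubectl get nodes -l hardware-type=NVIDIAGPU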
$ microk8s.kubectl apply -f node-exporter-daemonset.yaml
When we've added the DaemonSet, we can check to see if it's running and exporting metrics by curling localhost:
$ curl -s localhost:9100/metrics | grep dcgm
# HELP dcgm_app_clock_violation Total throttling duration (in us).
# TYPE dcgm_app_clock_violation counter
dcgm_app_clock_violation{gpu="0",uuid="GPU-a612132df-7c52-b181-cf39-28065234123ac8"} -9.134403409224488e+18
....
....
With this in place, we can then pull these metrics into Datadog via pod annotations and Autodiscovery.