Hello, author of nvidia-docker here.
As many of you may know, we recently released our NVIDIA Docker project as part of our effort to enable GPUs in containerized applications (mostly Deep Learning). The project currently consists of two parts:
- A Docker volume plugin to mount NVIDIA driver files inside containers at runtime.
- A small wrapper around Docker to ease the deployment of our images.
More information on this is available in the nvidia-docker repository.
While this has been working well so far, now that Docker 1.12 is coming out with a configurable runtime and complete OCI support, we would like to move away from this approach (which is admittedly hacky) and work on something better integrated with Docker.
The way I see it, we would provide a prestart OCI hook that triggers our implementation and configures the cgroups/namespaces correctly.
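For reference, the OCI runtime spec declares such hooks under `hooks.prestart` in the container's `config.json`. A sketch of what the entry might look like for our case (the hook path and arguments here are hypothetical, not a shipped binary):

```json
{
  "hooks": {
    "prestart": [
      {
        "path": "/usr/local/bin/nvidia-prestart-hook",
        "args": ["nvidia-prestart-hook"],
        "env": ["NV_GPU=0,1"]
      }
    ]
  }
}
```

The runtime invokes each prestart hook with the container state on stdin, after the namespaces are created but before the user process starts, which is exactly the window we need to set up the driver mounts and device cgroups.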
However, there are several things we need to solve first, specifically:
- How to detect whether a given image needs GPU support.
  Currently, we are using a special label `com.nvidia.volumes.needed`, but it is not exported as an OCI annotation (see #21324, "Clarify how OCI configuration ("config.json") will be handled").
- How to pass down to the hook which GPUs should be isolated.
  Currently, we are using an environment variable `NV_GPU`.
- How to check whether the image is compatible with the current driver.
  Currently, we are using a special label `XXX_VERSION`.
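To make the last two points concrete, here is a minimal sketch (in Python, purely illustrative) of the logic a prestart hook would need: parsing the `NV_GPU`-style variable into GPU indices, and comparing a version label from the image against the host driver. The `XXX_VERSION` key is kept as the placeholder used above; the real label name is image-specific.

```python
# Illustrative sketch only -- function names and the version-comparison
# policy are assumptions, not part of nvidia-docker's actual code.

def gpus_from_env(env):
    """Parse an NV_GPU-style variable ("0,1") into a list of GPU indices."""
    raw = env.get("NV_GPU", "")
    return [g.strip() for g in raw.split(",") if g.strip()]

def driver_compatible(image_labels, driver_version):
    """Check whether the host driver satisfies the image's version label.

    Assumes dotted numeric versions and a simple "host >= required" policy.
    """
    required = image_labels.get("XXX_VERSION")  # placeholder label key
    if required is None:
        return True  # image declares no driver requirement
    as_tuple = lambda v: tuple(int(x) for x in v.split("."))
    return as_tuple(driver_version) >= as_tuple(required)
```

The hook would run these checks against the container's environment and image labels, then either perform the mounts and cgroup setup or abort the start with a clear error.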
All of the above could be solved using environment variables, but I'm not particularly fond of this idea (e.g. `docker run -e NVIDIA_GPU=0,1 nvidia/cuda`).
So, is there a way to pass runtime/hook parameters from the Docker command line? If not, would it be worth adding one (e.g. `--runtime-opt`)?