NVIDIA support with docker driver #10229

Open
chrisroat opened this issue Jan 22, 2021 · 9 comments
Labels
area/gpu: GPU related items
co/docker-driver: Issues related to kubernetes in container
help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
kind/feature: Categorizes issue or PR as related to a new feature.
lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
priority/backlog: Higher priority than priority/awaiting-more-evidence.

Comments

@chrisroat

The doc page on using NVIDIA GPUs covers only the none and kvm2 drivers. The none driver is considered advanced, while the kvm2 driver requires a spare GPU.

Is support for the docker driver feasible/planned?

@afbjorklund
Collaborator

It should be possible, as long as the outer privileged container is launched with nvidia-docker: NVIDIA/nvidia-docker#375

But it is not something that would work out of the box. If it were a simple option (like --runtime), it could be considered.

Running on Linux, that is.

Not for the Docker VM...
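One way to see whether a Linux host is ready for this is to check that the Docker daemon has an `nvidia` runtime registered. The following is a minimal sketch: the function name is hypothetical, and it assumes the NVIDIA container runtime is registered under the name `nvidia`, as in a default nvidia-docker setup. It only inspects the host; it does not make minikube use that runtime.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a JSON map of Docker runtimes has an "nvidia" key.
# The argument is the text printed by: docker info --format '{{json .Runtimes}}'
runtimes_include_nvidia() {
    printf '%s\n' "$1" | grep -q '"nvidia"[[:space:]]*:'
}

# Usage on a Linux host (assumes docker is installed and the daemon is running):
#   if runtimes_include_nvidia "$(docker info --format '{{json .Runtimes}}')"; then
#       echo "nvidia runtime is registered"
#   else
#       echo "nvidia runtime not found"
#   fi
```

The check takes the JSON as an argument rather than calling `docker` itself, so it can be tried against sample output without a GPU host.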

@afbjorklund afbjorklund added kind/feature Categorizes issue or PR as related to a new feature. co/docker-driver Issues related to kubernetes in container area/gpu GPU related items labels Jan 23, 2021
@priyawadhwa priyawadhwa added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Jan 25, 2021
@chrisroat
Author

Yes, I was considering running on Linux and I am using nvidia-docker.

I'm not overly familiar with minikube or with how to adjust the launch of the "outer privileged container". My use case is a helm chart that I run on GKE with GPUs and want to iterate on locally.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 26, 2021
@sharifelgamal sharifelgamal removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 12, 2021
@ilya-zuyev ilya-zuyev added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Jul 14, 2021
@pythonwood

NVIDIA + Docker is easy now; see the install guide.

$ docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
Sat Sep 18 04:14:32 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.63.01    Driver Version: 470.63.01    CUDA Version: 11.4     |
...

It would be great if I could use the GPU in minikube.
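For scripting, the driver version can be pulled out of the nvidia-smi banner line shown above. This is a small sketch, assuming the banner keeps the `Driver Version: X.Y.Z` layout seen in that output; the function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read nvidia-smi banner text on stdin and print the
# first driver version found (for example 470.63.01).
driver_version_from_banner() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Usage (assumes a working --gpus setup on the host):
#   docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi | driver_version_from_banner
```

Parsing the human-readable banner is fragile; it is used here only because that is the output quoted in this thread.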

@sharifelgamal sharifelgamal added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Oct 20, 2021
@sharifelgamal
Collaborator

We don't currently have the bandwidth to implement a feature like this, but I would be happy to review a PR that does this.

@d4l3k
Contributor

d4l3k commented Feb 26, 2023

I've added the beginnings of support at #15927. It works E2E but has the caveat that you need a custom kicbase image with an exactly matching NVML version. Reviews welcome :)
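To illustrate the caveat: the NVML library inside the image has to carry exactly the same version as, presumably, the host driver. A rough sketch of such a check follows. It assumes the library uses the usual `libnvidia-ml.so.<version>` file naming; the function name and the way paths are passed in are hypothetical and not taken from the PR.

```shell
#!/bin/sh
# Hypothetical check: does any given library path end in exactly the host
# driver version?
#   $1             host driver version, e.g. 470.63.01
#   remaining args candidate libnvidia-ml paths found inside the image
nvml_matches_driver() {
    host_version=$1
    shift
    for lib in "$@"; do
        # Compare only the file name, ignoring the directory part.
        case "${lib##*/}" in
            "libnvidia-ml.so.$host_version") return 0 ;;
        esac
    done
    return 1
}

# Usage (paths are examples only):
#   nvml_matches_driver 470.63.01 /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.470.63.01 \
#       && echo "NVML matches host driver"
```

The comparison is deliberately an exact string match, since the comment above says the versions must match exactly.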

@anthonyalayo

Thanks @d4l3k! I think a lot of people will appreciate this. What is left before this is supported?

@d4l3k
Contributor

d4l3k commented Jun 7, 2023

@anthonyalayo It needs a maintainer to chime in on the approach; after that, the PR needs to be updated to better handle the NVIDIA dependencies.

@anthonyalayo

Could a maintainer comment here?


10 participants