Does minikube support AMD GPUs #19463

Open
yx-lamini opened this issue Aug 17, 2024 · 5 comments
Labels
area/gpu: GPU related items
help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@yx-lamini

What Happened?

minikube start --driver docker --container-runtime docker --gpus all

This does not seem to work with AMD GPUs. Docker complains:
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]
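
Before passing --gpus all, it can help to confirm which GPU vendor the host actually exposes. The following is a minimal sketch, not anything minikube does today: it assumes a Linux host whose GPUs appear under /sys/class/drm, and the function names gpu_vendor_name and list_gpu_vendors are hypothetical. The only hard facts it relies on are the PCI vendor IDs 0x1002 (AMD) and 0x10de (NVIDIA).

```shell
#!/bin/sh
# Map a PCI vendor ID (as written in sysfs) to a short vendor name.
# 0x1002 is AMD/ATI and 0x10de is NVIDIA; anything else is reported as "other".
gpu_vendor_name() {
    case "$1" in
        0x1002) echo "amd" ;;
        0x10de) echo "nvidia" ;;
        *)      echo "other" ;;
    esac
}

# Print the distinct vendors of all DRM cards found under a sysfs root.
# The root defaults to /sys; the parameter is a hypothetical hook so the
# function can be exercised against a fake directory tree.
list_gpu_vendors() {
    sysfs_root="${1:-/sys}"
    for f in "$sysfs_root"/class/drm/card*/device/vendor; do
        # Skip unmatched globs and entries without a readable vendor file.
        [ -r "$f" ] || continue
        gpu_vendor_name "$(cat "$f")"
    done | sort -u
}
```

If list_gpu_vendors prints only amd, the --gpus all path is unlikely to apply, which would be consistent with the error above.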

Attach the log file

N/A

Operating System

Ubuntu

Driver

Docker

@medyagh
Member

medyagh commented Aug 20, 2024

@yx-lamini we don't have support for AMD GPUs, but I would be happy to accept a contribution to add it.

medyagh added the help wanted, area/gpu, and priority/important-longterm labels on Aug 20, 2024
@yx-lamini
Author

Could you briefly explain or point to docs/code-files for Minikube's high-level logical architecture for supporting NVIDIA GPUs?

Or provide suggestions on how we could support AMD GPUs in Minikube?

I'd love to contribute. We are actively assessing the technical investment needed to make minikube support AMD GPUs.
I can already attach AMD GPUs to a Docker container with:

docker run -it --device=/dev/kfd --device=/dev/dri --group-add video rocm/rocm-terminal

I think that's through the container device interface (CDI).
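
As a rough sketch of how those flags could be assembled programmatically, the helper below prints them only when the device nodes exist. The function name amd_gpu_docker_args and its optional root-prefix argument are hypothetical and exist only for illustration and testing; nothing here is part of minikube or Docker.

```shell
#!/bin/sh
# Print the docker-run flags from the command above, but only if the device
# nodes they refer to are present. The optional first argument is a
# hypothetical root prefix used for testing; on a real host leave it empty.
amd_gpu_docker_args() {
    root="${1:-}"
    for node in /dev/kfd /dev/dri; do
        if [ ! -e "${root}${node}" ]; then
            echo "missing ${node}: is the amdgpu kernel driver loaded?" >&2
            return 1
        fi
    done
    echo "--device=/dev/kfd --device=/dev/dri --group-add video"
}
```

A call such as docker run -it $(amd_gpu_docker_args) rocm/rocm-terminal would then reproduce the command above, with the substitution left unquoted so the flags split into separate arguments.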

Assuming we build on top of docker's CDI to support AMD GPUs in Minikube, what's the suggested approach we should be taking with Minikube?

Better yet, if minikube's NVIDIA support (--gpus all) is also built on top of CDI, an explanation of its overall logical architecture would help us mimic it for AMD GPUs.

@medyagh
Member

medyagh commented Aug 21, 2024

minikube uses Docker's --gpus all to attach the GPU to the container, and we also install nvidia-smi in the base image, which is required for it. So I am wondering whether we need to install an equivalent driver for AMD?
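
One way to frame the AMD side of that question is to check for the vendor's management CLI in whatever environment is being inspected. This is a sketch under assumptions: rocm-smi is AMD's ROCm counterpart to nvidia-smi, but whether minikube's base image would need it is not established in this thread, and the function names gpu_smi_tool and has_gpu_smi_tool are hypothetical.

```shell
#!/bin/sh
# Name the vendor's management CLI: nvidia-smi for NVIDIA, rocm-smi for AMD.
# Returns non-zero for any vendor this sketch does not know about.
gpu_smi_tool() {
    case "$1" in
        nvidia) echo "nvidia-smi" ;;
        amd)    echo "rocm-smi" ;;
        *)      return 1 ;;
    esac
}

# Succeed only if the vendor's tool is on PATH in the current environment.
has_gpu_smi_tool() {
    tool="$(gpu_smi_tool "$1")" || return 1
    command -v "$tool" >/dev/null 2>&1
}
```

Running has_gpu_smi_tool amd inside a candidate node image would show whether the AMD tooling is present there.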

Do you have an example of running a GPU workload in a nested container (inside the Docker container)?

It would be cool if we could support AMD as well. I am assuming you are talking about dedicated AMD GPUs, right?

@medyagh
Member

medyagh commented Aug 21, 2024

btw, here is an example of an NVIDIA workload: #19486

@yx-lamini
Author

> btw, here is an example of an NVIDIA workload: #19486

Great, I'll take a look
#19345 (comment)

> Do you have an example of running a GPU workload in a nested container (inside the Docker container)?

rocm/pytorch is the one we use.
I haven't tested a nested container yet; I will get back to you next week.

> dedicated AMD GPUs, right?

We use AMD GPUs in a data center cluster setting, where the GPUs are shared among Kubernetes pods.
There is no MIG for AMD GPUs.

Does this align with what you mentioned as "dedicated"?
