
Where to get minikube v1.32.0-beta0 or instructions to run minikube with GPU support #17380

Closed
rafariossaa opened this issue Oct 8, 2023 · 19 comments
Labels
kind/support Categorizes issue or PR as a support question.

Comments

@rafariossaa

rafariossaa commented Oct 8, 2023

What Happened?

Hi,
I am trying to run minikube with GPU (NVIDIA) support.
In this doc link it is indicated that I need minikube v1.32.0-beta0, but I cannot find it in the releases link; the latest beta version I found was v1.26.0-beta.1.

If v1.32.0-beta0 is not available, could you provide instructions to run minikube with the docker driver and GPUs enabled?
I can currently run GPU containers on Docker:

$ sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
Sun Oct  8 19:02:51 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.125.06   Driver Version: 525.125.06   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA T1200 La...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   52C    P8     3W /  35W |   1231MiB /  4096MiB |      7%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Thanks in advance.

Attach the log file

--

Operating System

Ubuntu

Driver

Docker

@spowelljr
Member

Hi @rafariossaa, we don't have the beta release out yet, but if you want to use the GPU feature now you can download the binary built from the PR:

https://storage.googleapis.com/minikube-builds/17314/minikube-linux-amd64
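
For anyone finding this later, a minimal sketch of fetching and installing that build (the /usr/local/bin install path is just an example, adjust to your setup):

$ curl -fLo minikube https://storage.googleapis.com/minikube-builds/17314/minikube-linux-amd64
$ chmod +x minikube
$ sudo install minikube /usr/local/bin/minikube   # or keep it local and run ./minikube
$ minikube version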

@spowelljr spowelljr added the kind/support Categorizes issue or PR as a support question. label Oct 9, 2023
@rafariossaa
Author

Thanks, I will give it a try and provide feedback.

@rafariossaa
Author

Hi,
It worked for me:

$ minikube start --driver docker --container-runtime docker --gpus all --cpus=4 --memory=8GB
😄  minikube v1.31.2 on Ubuntu 23.04
✨  Using the docker driver based on user configuration
📌  Using Docker driver with root privileges
👍  Starting control plane node minikube in cluster minikube
🚜  Pulling base image ...
💾  Downloading Kubernetes v1.28.2 preload ...
    > preloaded-images-k8s-v18-v1...:  402.65 MiB / 402.65 MiB  100.00% 3.86 Mi
    > gcr.io/k8s-minikube/kicbase...:  446.88 MiB / 446.88 MiB  100.00% 2.72 Mi
🔥  Creating docker container (CPUs=4, Memory=8192MB) ...
❗  Using GPUs with the Docker driver is experimental, if you experience any issues please report them at: https://github.com/kubernetes/minikube/issues/new/choose
🛠   Installing the NVIDIA Container Toolkit...
🐳  Preparing Kubernetes v1.28.2 on Docker 24.0.6 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔗  Configuring bridge CNI (Container Networking Interface) ...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
    ▪ Using image nvcr.io/nvidia/k8s-device-plugin:v0.14.1
🔎  Verifying Kubernetes components...
🌟  Enabled addons: nvidia-device-plugin, storage-provisioner, default-storageclass
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default


$ kubectl run nvidiatest -it --image=nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda10.2

$ kubectl logs nvidiatest 
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
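
For completeness, the usual way to ask for the GPU explicitly through the device plugin is a pod spec like the sketch below (pod name is illustrative; the kubectl run above worked without it, this just shows the resource request that the nvidia-device-plugin addon exposes):

$ kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test                 # illustrative name
spec:
  restartPolicy: Never
  containers:
  - name: cuda-vectoradd
    image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda10.2
    resources:
      limits:
        nvidia.com/gpu: 1        # resource advertised by the device plugin
EOF
$ kubectl logs gpu-test          # after the pod completes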

@spowelljr
Member

Thanks for the feedback, glad to hear that it's working for you! We anticipate releasing the beta on Thursday; I'll close this issue once the release is out.

@rafariossaa
Author

rafariossaa commented Oct 11, 2023

Hi,
I found an issue; just tell me if you want me to open a new one for it.

The issue appears when restarting minikube:

$ minikube start --driver docker --container-runtime docker --gpus all --cpus=8 --memory=8G
😄  minikube v1.31.2 on Ubuntu 23.04
✨  Using the docker driver based on user configuration
📌  Using Docker driver with root privileges
👍  Starting control plane node minikube in cluster minikube
🚜  Pulling base image ...
🔥  Creating docker container (CPUs=8, Memory=8192MB) ...
❗  Using GPUs with the Docker driver is experimental, if you experience any issues please report them at: https://github.com/kubernetes/minikube/issues/new/choose
🛠   Installing the NVIDIA Container Toolkit...
🐳  Preparing Kubernetes v1.28.2 on Docker 24.0.6 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔗  Configuring bridge CNI (Container Networking Interface) ...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
    ▪ Using image nvcr.io/nvidia/k8s-device-plugin:v0.14.1
🔎  Verifying Kubernetes components...
🌟  Enabled addons: storage-provisioner, nvidia-device-plugin, default-storageclass
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default


$ minikube stop
✋  Stopping node "minikube"  ...
🛑  Powering off "minikube" via SSH ...
🛑  1 node stopped.


$ minikube start --driver docker --container-runtime docker --gpus all --cpus=8 --memory=8G
😄  minikube v1.31.2 on Ubuntu 23.04
✨  Using the docker driver based on existing profile
👍  Starting control plane node minikube in cluster minikube
🚜  Pulling base image ...
🔄  Restarting existing docker container for "minikube" ...
❗  Using GPUs with the Docker driver is experimental, if you experience any issues please report them at: https://github.com/kubernetes/minikube/issues/new/choose
🛠   Installing the NVIDIA Container Toolkit...

❌  Exiting due to RUNTIME_ENABLE: Failed to enable container runtime: failed installing the NVIDIA Container Toolkit: /bin/bash -c "curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg": Process exited with status 2
stdout:

stderr:
gpg: cannot open '/dev/tty': No such device or address
curl: (23) Failed writing body


╭───────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                           │
│    😿  If the above advice does not help, please let us know:                             │
│    👉  https://github.com/kubernetes/minikube/issues/new/choose                           │
│                                                                                           │
│    Please run `minikube logs --file=logs.txt` and attach logs.txt to the GitHub issue.    │
│                                                                                           │
╰───────────────────────────────────────────────────────────────────────────────────────────╯

In the last step, the same thing happens whether I run a plain minikube start or include the full set of parameters shown above.
To make it work, I need to delete the cluster and then start it again.
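
For anyone hitting the same thing in the meantime, the workaround spelled out:

$ minikube delete
$ minikube start --driver docker --container-runtime docker --gpus all --cpus=8 --memory=8G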

@spowelljr
Member

Please make a new issue; I'll experiment with this myself as well.

@rafariossaa
Author

I have created #17405

@doker78

doker78 commented Oct 22, 2023

@spowelljr it would also be nice to include the link somewhere on the documentation page: https://minikube.sigs.k8s.io/docs/tutorials/nvidia/

Thanks

@wings2020

wings2020 commented Oct 27, 2023

Hi @spowelljr,
Could you release the new binary build that solves issue #17405? Thanks a lot!

@rafariossaa
Author

Hi @wings2020,
Sure, I will be glad to check it.
However, I am not familiar with the build and release process for minikube and I cannot find the link for this new build. Could you provide it? Thanks

@wings2020

wings2020 commented Oct 27, 2023

> Hi @wings2020, Sure, I will be glad to check it. However, I am not familiar with the build and release process for minikube and I cannot find the link for this new build. Could you provide it? Thanks

Hi @rafariossaa,
I am not familiar with the build and release process for minikube either, so I tagged @spowelljr and hope he can provide the new binary build for us.

However, my environment is an air-gapped network, and I am not sure whether this command:
/bin/bash -c "curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --no-tty --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg"
can run successfully during minikube start --driver docker --container-runtime docker --gpus all in an air gap, so I want to try the new build.

@kdmkone

kdmkone commented Oct 27, 2023

I've been following this topic closely. It would be amazing to test out a new beta to see if it's working :D

@kdmkone

kdmkone commented Oct 27, 2023

@wings2020 @rafariossaa I think I found where the newer files are stored. The builds corresponding to #17488, which implements the fix for #17405, can be found under the following prefix:

https://storage.googleapis.com/minikube-builds/17488/minikube-linux-amd64

I noticed that the number seems to match the PR number on GitHub, so I went with that.
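
In other words, the per-PR build artifacts seem to follow this pattern (an inference from the links in this thread; I haven't found it documented):

https://storage.googleapis.com/minikube-builds/<PR_NUMBER>/minikube-linux-amd64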

@spowelljr
Member

Sorry for the delay; the release PR is up and we're just waiting for the tests to come back. If all is good, it will be out today.

#17515

As @kdmkone said above, https://storage.googleapis.com/minikube-builds/17488/minikube-linux-amd64 would be the binary with the fix for the problem outlined in this issue.

As per @wings2020's concern about an air-gapped system, the --gpus=all flag will not work in the feature's current state because we download the NVIDIA Container Toolkit at runtime. However, it shouldn't be a problem to install the toolkit ahead of time and only run the command that initializes it at runtime. That won't make it into this release, but it's an easy PR I could spin up that could make it into the full (non-beta) release.
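
For reference, NVIDIA's documented host-side install of the Container Toolkit looks roughly like the sketch below (standard NVIDIA procedure, not minikube-specific; on an air-gapped system you would mirror these packages inside first, and the repo URLs may change over time):

$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
    sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
$ curl -fsSL https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
$ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
$ sudo nvidia-ctk runtime configure --runtime=docker   # registers the nvidia runtime with Docker
$ sudo systemctl restart docker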

@spowelljr
Member

@wings2020 Here's a PR for what I described above; I'll link you the binary to try yourself once it's built.

#17516

@rafariossaa
Author

Thank you all.
I checked, and restart works well with --gpus=all in build 17488.

@spowelljr
Member

Since the release is out, I'm going to close this issue.

@spowelljr
Member

@wings2020 You can use the link below and it should work on your air-gapped machine:

https://storage.googleapis.com/minikube-builds/17516/minikube-linux-amd64

@wings2020

@spowelljr thank you very much!! It works for me now :))
