
minikube addons enable gcp-auth --refresh hangs forever #14897

Open
henryrior opened this issue Sep 1, 2022 · 13 comments
Labels
  • addon/gcp-auth: Issues with the GCP Auth addon
  • area/addons
  • kind/bug: Categorizes issue or PR as related to a bug.
  • lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@henryrior

What Happened?

My team uses the gcp-auth addon. minikube addons enable gcp-auth works fine, but when we add the --refresh flag to rotate credentials, it hangs forever. Adding the --alsologtostderr flag shows that it gets this far and then hangs indefinitely:

I0901 16:26:05.539862 1822 out.go:177] ▪ Using image gcr.io/k8s-minikube/gcp-auth-webhook:v0.0.10
▪ Using image gcr.io/k8s-minikube/gcp-auth-webhook:v0.0.10
I0901 16:26:05.560484 1822 out.go:177] ▪ Using image k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.0
▪ Using image k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.0
I0901 16:26:05.579669 1822 addons.go:345] installing /etc/kubernetes/addons/gcp-auth-ns.yaml
I0901 16:26:05.579684 1822 ssh_runner.go:362] scp memory --> /etc/kubernetes/addons/gcp-auth-ns.yaml (700 bytes)
I0901 16:26:05.594879 1822 addons.go:345] installing /etc/kubernetes/addons/gcp-auth-service.yaml
I0901 16:26:05.594895 1822 ssh_runner.go:362] scp memory --> /etc/kubernetes/addons/gcp-auth-service.yaml (788 bytes)
I0901 16:26:05.609378 1822 addons.go:345] installing /etc/kubernetes/addons/gcp-auth-webhook.yaml
I0901 16:26:05.609393 1822 ssh_runner.go:362] scp memory --> /etc/kubernetes/addons/gcp-auth-webhook.yaml (4843 bytes)
I0901 16:26:05.622049 1822 ssh_runner.go:195] Run: sudo KUBECONFIG=/var/lib/minikube/kubeconfig /var/lib/minikube/binaries/v1.24.3/kubectl apply -f /etc/kubernetes/addons/gcp-auth-ns.yaml -f /etc/kubernetes/addons/gcp-auth-service.yaml -f /etc/kubernetes/addons/gcp-auth-webhook.yaml

While we can work around this by manually deleting the gcp-auth secret with kubectl delete secret gcp-auth and re-running the enable command, this has caused issues in automated scripts.

Attach the log file

logs.txt

Operating System

macOS (Default)

Driver

No response

@klaases
Contributor

klaases commented Sep 9, 2022

Hi @henryrior, did this work OK in the past, or is this something that has never worked?

Code Reference:

func refreshExistingPods(cc *config.ClusterConfig) error {
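
For context, here is a rough sketch of the kind of logic such a refresh step typically involves: list the pods that mount the gcp-auth credentials secret and delete them so their controllers recreate them with the rotated secret. This is a hypothetical illustration written against client-go, not minikube's actual implementation; the function signature, the secret name, and the matching rule are assumptions.

// refresh_pods_sketch.go: hypothetical illustration, not minikube's code.
package main

import (
	"context"
	"fmt"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// refreshExistingPods deletes every pod that mounts the (assumed) "gcp-auth"
// secret so that its Deployment/ReplicaSet recreates it and the webhook
// re-injects the refreshed credentials.
func refreshExistingPods(ctx context.Context, client kubernetes.Interface) error {
	pods, err := client.CoreV1().Pods(metav1.NamespaceAll).List(ctx, metav1.ListOptions{})
	if err != nil {
		return fmt.Errorf("listing pods: %w", err)
	}
	for _, pod := range pods.Items {
		for _, vol := range pod.Spec.Volumes {
			if vol.Secret != nil && vol.Secret.SecretName == "gcp-auth" {
				if err := client.CoreV1().Pods(pod.Namespace).Delete(ctx, pod.Name, metav1.DeleteOptions{}); err != nil {
					return fmt.Errorf("deleting pod %s/%s: %w", pod.Namespace, pod.Name, err)
				}
				break
			}
		}
	}
	return nil
}

func main() {
	// Load the local kubeconfig (~/.kube/config) and build a clientset.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		log.Fatal(err)
	}
	if err := refreshExistingPods(context.Background(), client); err != nil {
		log.Fatal(err)
	}
}

If a step like this retries without a deadline when something never becomes ready, the enable command could appear to hang, which is consistent with the behaviour described above.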

@henryrior
Author

Hey @klaases, assuming this is the --refresh logic: unfortunately I just started using minikube last month, so I can't say whether it used to work. I can ask around among my colleagues; however, the --refresh flag is currently hanging for them too.

@spowelljr

This comment was marked as outdated.

@spowelljr spowelljr added the kind/support, area/addons, triage/needs-information, and addon/gcp-auth labels on Oct 21, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label on Jan 19, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Feb 18, 2023
@bastiankistner

bastiankistner commented Feb 28, 2023

This is the output I see:


❯ minikube addons enable gcp-auth --refresh --alsologtostderr


I0228 09:14:57.264549   17806 out.go:296] Setting OutFile to fd 1 ...
I0228 09:14:57.265492   17806 out.go:348] isatty.IsTerminal(1) = true
I0228 09:14:57.265500   17806 out.go:309] Setting ErrFile to fd 2...
I0228 09:14:57.265505   17806 out.go:348] isatty.IsTerminal(2) = true
I0228 09:14:57.266003   17806 root.go:334] Updating PATH: /Users/bastian/.minikube/bin
I0228 09:14:57.266014   17806 oci.go:567] shell is pointing to dockerd inside minikube. will unset to use host
I0228 09:14:57.279324   17806 out.go:177] 💡  gcp-auth is an addon maintained by Google. For any concerns contact minikube on GitHub.
You can view the list of minikube maintainers at: https://github.com/kubernetes/minikube/blob/master/OWNERS
💡  gcp-auth is an addon maintained by Google. For any concerns contact minikube on GitHub.
You can view the list of minikube maintainers at: https://github.com/kubernetes/minikube/blob/master/OWNERS
I0228 09:14:57.286653   17806 config.go:180] Loaded profile config "minikube": Driver=docker, ContainerRuntime=docker, KubernetesVersion=v1.25.3
I0228 09:14:57.287298   17806 addons.go:65] Setting gcp-auth=true in profile "minikube"
I0228 09:14:57.287481   17806 mustload.go:65] Loading cluster: minikube
I0228 09:14:57.287562   17806 config.go:180] Loaded profile config "minikube": Driver=docker, ContainerRuntime=docker, KubernetesVersion=v1.25.3
I0228 09:14:57.289189   17806 cli_runner.go:164] Run: docker container inspect minikube --format={{.State.Status}}
I0228 09:14:57.416840   17806 host.go:66] Checking if "minikube" exists ...
I0228 09:14:57.417350   17806 cli_runner.go:164] Run: docker container inspect -f "'{{(index (index .NetworkSettings.Ports "8443/tcp") 0).HostPort}}'" minikube
I0228 09:14:57.492605   17806 ssh_runner.go:362] scp memory --> /var/lib/minikube/google_application_credentials.json (295 bytes)
I0228 09:14:57.492705   17806 cli_runner.go:164] Run: docker container inspect -f "'{{(index (index .NetworkSettings.Ports "22/tcp") 0).HostPort}}'" minikube
I0228 09:14:57.540621   17806 sshutil.go:53] new ssh client: &{IP:127.0.0.1 Port:53131 SSHKeyPath:/Users/bastian/.minikube/machines/minikube/id_rsa Username:docker}
I0228 09:14:58.109854   17806 ssh_runner.go:362] scp memory --> /var/lib/minikube/google_cloud_project (11 bytes)
I0228 09:14:58.141936   17806 addons.go:227] Setting addon gcp-auth=true in "minikube"
W0228 09:14:58.141961   17806 addons.go:236] addon gcp-auth should already be in state true
I0228 09:14:58.142372   17806 host.go:66] Checking if "minikube" exists ...
I0228 09:14:58.142670   17806 cli_runner.go:164] Run: docker container inspect minikube --format={{.State.Status}}
I0228 09:14:58.185983   17806 ssh_runner.go:195] Run: cat /var/lib/minikube/google_application_credentials.json
I0228 09:14:58.186045   17806 cli_runner.go:164] Run: docker container inspect -f "'{{(index (index .NetworkSettings.Ports "22/tcp") 0).HostPort}}'" minikube
I0228 09:14:58.233280   17806 sshutil.go:53] new ssh client: &{IP:127.0.0.1 Port:53131 SSHKeyPath:/Users/bastian/.minikube/machines/minikube/id_rsa Username:docker}
I0228 09:14:58.332163   17806 out.go:177]     ▪ Using image gcr.io/k8s-minikube/gcp-auth-webhook:v0.0.13
    ▪ Using image gcr.io/k8s-minikube/gcp-auth-webhook:v0.0.13
I0228 09:14:58.342215   17806 out.go:177]     ▪ Using image k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.0
    ▪ Using image k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.0
I0228 09:14:58.347683   17806 addons.go:419] installing /etc/kubernetes/addons/gcp-auth-ns.yaml
I0228 09:14:58.347694   17806 ssh_runner.go:362] scp memory --> /etc/kubernetes/addons/gcp-auth-ns.yaml (700 bytes)
I0228 09:14:58.365320   17806 addons.go:419] installing /etc/kubernetes/addons/gcp-auth-service.yaml
I0228 09:14:58.365335   17806 ssh_runner.go:362] scp memory --> /etc/kubernetes/addons/gcp-auth-service.yaml (788 bytes)
I0228 09:14:58.381300   17806 addons.go:419] installing /etc/kubernetes/addons/gcp-auth-webhook.yaml
I0228 09:14:58.381317   17806 ssh_runner.go:362] scp memory --> /etc/kubernetes/addons/gcp-auth-webhook.yaml (5389 bytes)
I0228 09:14:58.396525   17806 ssh_runner.go:195] Run: sudo KUBECONFIG=/var/lib/minikube/kubeconfig /var/lib/minikube/binaries/v1.25.3/kubectl apply -f /etc/kubernetes/addons/gcp-auth-ns.yaml -f /etc/kubernetes/addons/gcp-auth-service.yaml -f /etc/kubernetes/addons/gcp-auth-webhook.yaml

@spowelljr spowelljr added the lifecycle/frozen label and removed the lifecycle/rotten label on Feb 28, 2023
@spowelljr
Member

The command that's hanging is set up to retry, but the timeout is 2 minutes, which is likely much too long. I'm thinking we should bump down the timeout, and then maybe this will be resolved on retry.

@spowelljr
Member

@bastiankistner Did you create the cluster with an older version of minikube and then update the minikube binary since? Also, is this a consistent issue, i.e. if you cancel the command and try again, does it still fail?

@spowelljr spowelljr self-assigned this Mar 1, 2023
@spowelljr spowelljr removed the triage/needs-information label on Mar 1, 2023
@spowelljr
Member

This is most likely an infinite retry without a timeout in refreshExistingPods, which would explain why we only ever see it with --refresh. Will fix this up in a bit.
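
For illustration, here is a minimal sketch of bounding such a retry with a timeout using the wait helpers from k8s.io/apimachinery, rather than looping forever. This is not the actual fix; tryRefresh, the 5-second interval, and the 2-minute budget are placeholders.

// bounded_retry_sketch.go: illustrative only, not minikube's actual fix.
package main

import (
	"errors"
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// tryRefresh stands in for one attempt of the pod-refresh step. In this demo
// it always fails, so the poll below gives up after the timeout.
func tryRefresh() error {
	return errors.New("not ready yet")
}

func refreshWithTimeout() error {
	// Poll every 5 seconds, but give up after 2 minutes instead of retrying forever.
	return wait.PollImmediate(5*time.Second, 2*time.Minute, func() (bool, error) {
		if err := tryRefresh(); err != nil {
			// (false, nil) means "not done yet, keep polling until the timeout";
			// returning a non-nil error would abort immediately.
			return false, nil
		}
		return true, nil
	})
}

func main() {
	if err := refreshWithTimeout(); err != nil {
		fmt.Println("refresh did not complete:", err)
	}
}

With a bound like this, a broken refresh surfaces as an error after a couple of minutes instead of hanging the addons enable command indefinitely.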

@spowelljr spowelljr added the kind/bug label and removed the kind/support label on Mar 1, 2023
@bastiankistner

@bastiankistner Did you create the cluster with an older version of minikube and then update the minikube binary since? Also, is this a consistent issue, i.e. if you cancel the command and try again, does it still fail?

That might indeed be the case. Is it common to have issues when I upgrade the binary after the cluster was created?

My temporary workaround is the following:

kubectl --namespace=${NAMESPACE} create secret docker-registry europe-west1-docker-pkg-dev-pull-secret \
        --docker-server=https://europe-west1-docker.pkg.dev \
        --docker-username=oauth2accesstoken \
        --docker-password="$(gcloud auth print-access-token)" \
        --docker-email=a@b.com \
        --save-config \
        --dry-run=client -o yaml | kubectl apply -f -

But I assume that the GOOGLE_APPLICATION_CREDENTIALS might also expire, so having a working solution for both would be great.

It is a consistent issue; the refresh command has never succeeded so far. But what indeed also works is just disabling the addon and re-enabling it, which completes successfully.

@spowelljr
Member

That might indeed be the case. Is it common to have issues when I upgrade the binary after the cluster was created?

It looks like if that were the issue, you would have seen an apply error, which you don't have:

apply failed, will retry: sudo KUBECONFIG=/var/lib/minikube/kubeconfig /var/lib/minikube/binaries/v1.21.2/kubectl apply -f /etc/kubernetes/addons/gcp-auth-ns.yaml -f /etc/kubernetes/addons/gcp-auth-service.yaml -f /etc/kubernetes/addons/gcp-auth-webhook.yaml: Process exited with status 1

I'm working on a branch and have the apply failure resolved along with another bug you're not experiencing. I'm then going to add a timeout to the infinite retry and add some logging to see if that's where you're experiencing the issue. I'll give you a link to the binary with the fixes once I have a PR up. What OS do you use so I can provide the correct binary to you?

@bastiankistner

bastiankistner commented Mar 15, 2023

I'm running macOS on an M1 (darwin/arm64).

@spowelljr
Member

Here's a binary you can use; make sure to run it with --alsologtostderr, as it has improved logging. Let me know the result.

https://github.com/kubernetes/minikube/releases/latest/download/minikube-darwin-arm64

@spowelljr spowelljr removed their assignment Aug 28, 2024