[Minikube] webhook-server-tls secret not created #3815

Closed
jednymslowem opened this issue May 22, 2020 · 8 comments · Fixed by #3992
Labels: area/backend, area/execution_cache, kind/bug, status/triaged

Comments

@jednymslowem

What steps did you take:

webhook-server-tls secret not created during Kubeflow Pipelines standalone deployment

What happened:

The cache server got stuck in the ContainerCreating state because of the missing secret.

Environment:

macOS. I use the minikube cluster bundled with Docker Desktop.

How did you deploy Kubeflow Pipelines (KFP)?

export PIPELINE_VERSION=0.5.1
kubectl apply -k github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k github.com/kubeflow/pipelines/manifests/kustomize/env/dev?ref=$PIPELINE_VERSION

KFP version: 0.5.1

KFP SDK version: not installed

Anything else you would like to add:

It happens on the second deployment. When I deploy for the first time on a fresh cluster, it works fine. But when I delete the deployment with:

export PIPELINE_VERSION=0.5.1
kubectl delete -k github.com/kubeflow/pipelines/manifests/kustomize/env/dev?ref=$PIPELINE_VERSION
kubectl delete -k github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION

and recreate it with the commands above, the secret is not created.

When I run ./deploy-cache-service.sh manually, the secret is created and the cache server starts as expected.

/kind bug
/area backend

@Ark-kun
Contributor

Ark-kun commented May 22, 2020

Can you please check the logs of cache-deployer?

P.S. Are you able to run any pipelines?

@Ark-kun added the status/triaged label May 22, 2020
@Ark-kun changed the title from "webhook-server-tls secret not created" to "[Minikube] webhook-server-tls secret not created" May 22, 2020
@tarjintor

tarjintor commented May 26, 2020

I got the same issue. My logs are:

ubuntu@t224:~$ kubectl -n kubeflow logs -f cache-deployer-deployment-5f94848fd9-xbjws --tail=100
+ echo 'Start deploying cache service to existing cluster:'
+ NAMESPACE=kubeflow
+ MUTATING_WEBHOOK_CONFIGURATION_NAME=cache-webhook-kubeflow
+ kubectl get mutatingwebhookconfigurations cache-webhook-kubeflow --namespace kubeflow --ignore-not-found
Start deploying cache service to existing cluster:
cache-webhook-kubeflow   2020-05-25T16:23:28Z
+ grep cache-webhook-kubeflow -w
+ echo 'Webhook is already installed. Sleeping forever.'
+ sleep infinity
Webhook is already installed. Sleeping forever.

My k8s cluster already had a Prometheus server before I started using Pipelines.
It's also a regular k8s cluster, not minikube.

But I can run the tutorial pipelines.
So I don't actually need the cache server?
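
Editor's note: the trace above shows the deployer finding an existing cache-webhook-kubeflow object and then sleeping without creating anything. A quick way to confirm that state is to check for the webhook configuration and the TLS secret separately. The following is a minimal sketch, not part of the KFP tooling: the function names are hypothetical, and the object names are simply the ones mentioned in this thread, so override them if yours differ.

```shell
#!/usr/bin/env bash
# Sketch: detect a leftover webhook configuration without its TLS secret.
# Object names come from this thread; the function names are hypothetical.
NAMESPACE="${NAMESPACE:-kubeflow}"
WEBHOOK_NAME="${WEBHOOK_NAME:-cache-webhook-kubeflow}"
SECRET_NAME="${SECRET_NAME:-webhook-server-tls}"

# Succeeds if the cluster-scoped webhook configuration exists.
webhook_exists() {
  [ -n "$(kubectl get mutatingwebhookconfigurations "$WEBHOOK_NAME" --ignore-not-found -o name)" ]
}

# Succeeds if the namespaced TLS secret exists.
secret_exists() {
  [ -n "$(kubectl -n "$NAMESPACE" get secret "$SECRET_NAME" --ignore-not-found -o name)" ]
}

# Prints one of: ok, stale-webhook, secret-only, not-installed.
cache_state() {
  if webhook_exists; then
    if secret_exists; then echo "ok"; else echo "stale-webhook"; fi
  else
    if secret_exists; then echo "secret-only"; else echo "not-installed"; fi
  fi
}
```

Running `cache_state` after sourcing the snippet prints `stale-webhook` when the webhook configuration exists but the secret does not, which would match the symptoms reported here.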

@Ark-kun
Contributor

Ark-kun commented May 26, 2020

I got the same issue. My logs are:

There should be more logs, since some service has successfully deployed cache-webhook-kubeflow.

Can you please check the previous pods of that deployment?

If that does not work, you can try deleting the cache-webhook-kubeflow config object and then restarting the cache-deployer-deployment-5f94848fd9-xbjws pod.
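
Editor's note: the workaround suggested above could be scripted roughly as follows. This is only a sketch under the assumptions of this thread (namespace kubeflow, webhook name cache-webhook-kubeflow); `reset_cache_deployer` is a hypothetical helper name, and the pod name must be supplied by the caller because it differs on every cluster.

```shell
#!/usr/bin/env bash
# Sketch of the suggested workaround; names are taken from this thread.
NAMESPACE="${NAMESPACE:-kubeflow}"
WEBHOOK_NAME="${WEBHOOK_NAME:-cache-webhook-kubeflow}"

# Delete the leftover webhook configuration, then delete the deployer pod so
# that its Deployment recreates it and the deploy script runs again.
reset_cache_deployer() {
  local pod="$1"   # e.g. cache-deployer-deployment-5f94848fd9-xbjws
  if [ -z "$pod" ]; then
    echo "usage: reset_cache_deployer <cache-deployer-pod-name>" >&2
    return 2
  fi
  kubectl delete mutatingwebhookconfigurations "$WEBHOOK_NAME" --ignore-not-found || return 1
  kubectl -n "$NAMESPACE" delete pod "$pod"
}
```

For the pod in the logs above, the call would be `reset_cache_deployer cache-deployer-deployment-5f94848fd9-xbjws`.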

@Ark-kun
Contributor

Ark-kun commented May 26, 2020

So I don't actually need the cache server?

The caching server works as a standalone service. If it's deployed correctly, it will make your pipelines run much faster by skipping already completed steps. If it does not work, your pipelines just run as usual without skipping any steps.

@tarjintor

I deleted cache-webhook-kubeflow and it works, thanks.

@Ark-kun
Contributor

Ark-kun commented May 27, 2020

If this happens again, try to collect the cache deployer logs across all restarts, so we can see how it was creating the secret.
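
Editor's note: one possible way to gather those logs is sketched below. It assumes the deployer pods can be found by the substring cache-deployer in their names (inferred from the pod name earlier in the thread) and uses a hypothetical function name. `kubectl logs --previous` only returns the previous container instance of a pod that still exists, so logs from pods that were already deleted cannot be recovered this way.

```shell
#!/usr/bin/env bash
# Sketch: dump previous and current container logs for cache-deployer pods.
NAMESPACE="${NAMESPACE:-kubeflow}"

# For every pod whose name contains "cache-deployer", print the logs of the
# previous container instance (if the container restarted), then the current one.
collect_cache_deployer_logs() {
  local pod
  for pod in $(kubectl -n "$NAMESPACE" get pods -o name | grep cache-deployer); do
    echo "=== ${pod} (previous container) ==="
    kubectl -n "$NAMESPACE" logs "$pod" --previous 2>/dev/null || echo "(no previous container)"
    echo "=== ${pod} (current container) ==="
    kubectl -n "$NAMESPACE" logs "$pod"
  done
}
```

Redirecting the output, for example `collect_cache_deployer_logs > cache-deployer.log`, gives a single file that can be attached to the issue.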

@jednymslowem
Author

I think I got to the bottom of it. cache-webhook-kubeflow is created all right, but kubectl delete -k github.com/kubeflow/pipelines/manifests/kustomize/env/dev?ref=$PIPELINE_VERSION does not delete it. So when I redeploy Kubeflow Pipelines to the same cluster again, the old cache-webhook-kubeflow is still there, which leads to the logs @tarjintor pasted above.

Adding kubectl delete mutatingwebhookconfigurations cache-webhook-kubeflow to my deletion script solved the issue.
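
Editor's note: a deletion script along the lines described above might look like the sketch below. It reuses the commands from the original report and appends the extra delete. The function name `teardown_kfp` and the `KFP_MANIFESTS` variable are hypothetical, and `--ignore-not-found` is an addition so the script does not fail when the webhook configuration is already gone. The URLs are quoted because some shells (zsh, for instance) treat an unquoted `?` as a glob character.

```shell
#!/usr/bin/env bash
# Sketch: tear down KFP standalone, including the leftover webhook config.
PIPELINE_VERSION="${PIPELINE_VERSION:-0.5.1}"
KFP_MANIFESTS="github.com/kubeflow/pipelines/manifests/kustomize"

teardown_kfp() {
  # Same order as in the original report: environment first, then cluster-scoped resources.
  kubectl delete -k "${KFP_MANIFESTS}/env/dev?ref=${PIPELINE_VERSION}"
  kubectl delete -k "${KFP_MANIFESTS}/cluster-scoped-resources?ref=${PIPELINE_VERSION}"
  # According to this thread, the deletes above leave this object behind.
  kubectl delete mutatingwebhookconfigurations cache-webhook-kubeflow --ignore-not-found
}
```

After `teardown_kfp` completes, redeploying with the commands from the report should let the cache deployer start from a clean state.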

@Ark-kun
Contributor

Ark-kun commented Jun 16, 2020

So when I redeploy Kubeflow Pipelines to the same cluster again, the old cache-webhook-kubeflow is still there

I wonder why the TLS secret was deleted. Was the namespace deleted?

4 participants