Pipeline webhook controllers fail to become leaders #3529

Closed
afrittoli opened this issue Nov 16, 2020 · 0 comments · Fixed by #3531
Labels
kind/bug Categorizes issue or PR as related to a bug.

Expected Behavior

The webhook controllers acquire their leader leases and start working.

Actual Behavior

They attempt to acquire their leases but never succeed.
This does not happen every time, but I see it often in local kind clusters.

I1116 14:51:14.245992       1 leaderelection.go:242] attempting to acquire leader lease  tekton-pipelines/webhook.conversionwebhook.00-of-01...
I1116 14:51:14.246485       1 leaderelection.go:242] attempting to acquire leader lease  tekton-pipelines/webhook.webhookcertificates.00-of-01...
I1116 14:51:14.246931       1 leaderelection.go:242] attempting to acquire leader lease  tekton-pipelines/webhook.defaultingwebhook.00-of-01...
I1116 14:51:14.247191       1 leaderelection.go:242] attempting to acquire leader lease  tekton-pipelines/webhook.validationwebhook.00-of-01...
I1116 14:51:14.247479       1 leaderelection.go:242] attempting to acquire leader lease  tekton-pipelines/webhook.configmapwebhook.00-of-01...
2020/11/16 14:51:22 http: TLS handshake error from 172.18.0.4:41288: server key missing

Additional Info

The issue shows up as missing defaults in Tekton resources, because the webhooks are not active.
The certificate webhook never provisions the certificate either, so the logs are also full of:

2020/11/16 15:43:34 http: TLS handshake error from 172.18.0.4:18085: server key missing
2020/11/16 15:43:35 http: TLS handshake error from 172.18.0.4:44900: server key missing
2020/11/16 15:43:35 http: TLS handshake error from 172.18.0.4:14526: server key missing
2020/11/16 15:43:36 http: TLS handshake error from 172.18.0.4:11806: server key missing
2020/11/16 15:43:36 http: TLS handshake error from 172.18.0.4:56615: server key missing
2020/11/16 15:43:36 http: TLS handshake error from 172.18.0.4:14155: server key missing
2020/11/16 15:43:36 http: TLS handshake error from 172.18.0.4:46752: server key missing
2020/11/16 15:43:37 http: TLS handshake error from 172.18.0.4:30265: server key missing
2020/11/16 15:43:39 http: TLS handshake error from 172.18.0.4:34698: server key missing
2020/11/16 15:43:39 http: TLS handshake error from 172.18.0.4:42270: server key missing
2020/11/16 15:43:40 http: TLS handshake error from 172.18.0.4:27379: server key missing
2020/11/16 15:43:40 http: TLS handshake error from 172.18.0.4:44972: server key missing
2020/11/16 15:43:40 http: TLS handshake error from 172.18.0.4:18143: server key missing

The root cause is a naming conflict on the leases between pipeline and triggers, which both run in the tekton-pipelines namespace:

$ k get leases -n tekton-pipelines
controller.github.com-tektoncd-triggers-pkg-reconciler-v1alpha1-eventlistener.reconciler.00-of-01   tekton-triggers-controller-594c47b76-mjn67_595148b8-c0fa-4084-ac36-b38d7e29a795     2d6h
tekton.github.com-tektoncd-pipeline-pkg-reconciler-pipelinerun.reconciler.00-of-01                  tekton-pipelines-controller-5f94d94f74-tvbzs_144bd503-7f9d-4e47-8c61-c0283a276632   2d6h
tekton.github.com-tektoncd-pipeline-pkg-reconciler-taskrun.reconciler.00-of-01                      tekton-pipelines-controller-5f94d94f74-tvbzs_511b61e9-09f6-462c-b6af-c17afea38735   2d6h
webhook.configmapwebhook.00-of-01                                                                   tekton-triggers-webhook-59df6955c8-n75nr_d843325d-498d-44b4-9900-75e7fcc48bf0       2d6h
webhook.conversionwebhook.00-of-01                                                                  tekton-pipelines-webhook-5976695855-c4gm6_cccfd8b7-1b3f-42ac-9796-58193374f20e      2d6h
webhook.defaultingwebhook.00-of-01                                                                  tekton-triggers-webhook-59df6955c8-n75nr_fb802917-af9d-4631-8a67-c83808a60026       2d6h
webhook.validationwebhook.00-of-01                                                                  tekton-triggers-webhook-59df6955c8-n75nr_8ec3c156-c9e9-4493-b6e2-78dbc0161d5c       2d6h
webhook.webhookcertificates.00-of-01                                                                tekton-triggers-webhook-59df6955c8-n75nr_c4daba5e-9512-4e49-9f41-4fecd2e0338a       2d6h
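
The listing suggests that lease names have the shape <name>.<reconciler>.<bucket>-of-<total>. The following is a minimal Go sketch under that assumption; leaseName is a hypothetical helper inferred from the output above, not the knative API. It shows why two deployments that share a name and a namespace end up contending for the same lease:

```go
package main

import "fmt"

// leaseName builds a lease name in the shape seen in the listing above:
// <component>.<reconciler>.<bucket>-of-<total>.
// Hypothetical helper, inferred from the output rather than copied from knative.
func leaseName(component, reconciler string, bucket, total int) string {
	return fmt.Sprintf("%s.%s.%02d-of-%02d", component, reconciler, bucket, total)
}

func main() {
	// Both webhook deployments are assumed to pass the same name, "webhook".
	pipeline := leaseName("webhook", "configmapwebhook", 0, 1)
	triggers := leaseName("webhook", "configmapwebhook", 0, 1)

	fmt.Println(pipeline)             // webhook.configmapwebhook.00-of-01
	fmt.Println(pipeline == triggers) // true: same namespace, same lease
}
```

Whichever deployment creates the lease first holds it, and the other one keeps waiting on "attempting to acquire leader lease".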

Pipeline and triggers both use the same name, webhook, in their config.

However, they use two different secret names:

$ k get secret -n tekton-pipelines | grep certs
triggers-webhook-certs                    Opaque                                3      2d6h
webhook-certs                             Opaque                                0      2d6h
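
The 0 data entries shown for webhook-certs are consistent with the handshake error above: the serving code has no key to load. As a rough sketch only, the lookup might resemble the following Go snippet. The function name and the "server-key.pem" entry name are assumptions made for illustration, not taken from this issue:

```go
package main

import (
	"errors"
	"fmt"
)

// serverKeyEntry is an assumed name for the secret entry holding the private key.
const serverKeyEntry = "server-key.pem"

// lookupServerKey is a hypothetical stand-in for the step that reads the
// private key out of the certificate secret's data before a TLS handshake.
func lookupServerKey(secretData map[string][]byte) ([]byte, error) {
	key, ok := secretData[serverKeyEntry]
	if !ok {
		return nil, errors.New("server key missing")
	}
	return key, nil
}

func main() {
	// An unprovisioned secret has no data entries at all.
	_, err := lookupServerKey(map[string][]byte{})
	fmt.Println(err) // server key missing
}
```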

The fix would be to use a name scoped to each project in the webhook config: webhook-pipeline for pipeline and webhook-trigger for triggers.
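
To illustrate the effect of the rename, here is a small Go sketch. It assumes lease names follow the <name>.<reconciler>.00-of-01 shape seen in the listing. The reconciler lists are read off the logs and the lease holders above, and leaseNames and collisions are hypothetical helpers:

```go
package main

import "fmt"

// Reconcilers the pipeline webhook tries to lead, per the log lines above.
var pipelineReconcilers = []string{
	"conversionwebhook", "webhookcertificates", "defaultingwebhook",
	"validationwebhook", "configmapwebhook",
}

// Reconcilers whose leases the triggers webhook holds in the listing above.
var triggersReconcilers = []string{
	"configmapwebhook", "defaultingwebhook",
	"validationwebhook", "webhookcertificates",
}

// leaseNames derives one lease name per reconciler (single bucket assumed).
func leaseNames(component string, reconcilers []string) []string {
	names := make([]string, 0, len(reconcilers))
	for _, r := range reconcilers {
		names = append(names, fmt.Sprintf("%s.%s.00-of-01", component, r))
	}
	return names
}

// collisions returns the names that appear in both a and b.
func collisions(a, b []string) []string {
	seen := make(map[string]bool, len(a))
	for _, n := range a {
		seen[n] = true
	}
	var shared []string
	for _, n := range b {
		if seen[n] {
			shared = append(shared, n)
		}
	}
	return shared
}

func main() {
	before := collisions(
		leaseNames("webhook", pipelineReconcilers),
		leaseNames("webhook", triggersReconcilers))
	after := collisions(
		leaseNames("webhook-pipeline", pipelineReconcilers),
		leaseNames("webhook-trigger", triggersReconcilers))

	fmt.Println(len(before)) // 4
	fmt.Println(len(after))  // 0
}
```

With the shared name, four leases collide, which matches the listing: the pipeline webhook only holds conversionwebhook, the one lease triggers does not request.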

Related work on the knative side: knative/eventing#4530

Thanks @mattmoor for your help tracking this down!

afrittoli added the kind/bug label Nov 16, 2020
afrittoli added this to the Pipelines v0.18 milestone Nov 16, 2020
afrittoli added a commit to afrittoli/pipeline that referenced this issue Nov 16, 2020
The "webhook" name is too generic and it creates conflicts on leases
when other services (like triggers) that use leader election run in
same namespace but with different configuration.

Fixes tektoncd#3529

Co-authored-by: Matt Moore <mattmoor@vmware.com>

Signed-off-by: Andrea Frittoli <andrea.frittoli@uk.ibm.com>
afrittoli added a commit to afrittoli/triggers that referenced this issue Nov 16, 2020
The "webhook" name is too generic and it creates conflicts on leases
when other services (like triggers) that use leader election run in
same namespace but with different configuration.

See tektoncd/pipeline#3529 for more details.

Co-authored-by: Matt Moore <mattmoor@vmware.com>

Signed-off-by: Andrea Frittoli <andrea.frittoli@gmail.com>
tekton-robot pushed a commit to tektoncd/triggers that referenced this issue Nov 16, 2020
The "webhook" name is too generic and it creates conflicts on leases
when other services (like triggers) that use leader election run in
same namespace but with different configuration.

See tektoncd/pipeline#3529 for more details.

Co-authored-by: Matt Moore <mattmoor@vmware.com>

Signed-off-by: Andrea Frittoli <andrea.frittoli@gmail.com>
tekton-robot pushed a commit that referenced this issue Nov 17, 2020
The "webhook" name is too generic and it creates conflicts on leases
when other services (like triggers) that use leader election run in
same namespace but with different configuration.

Fixes #3529

Co-authored-by: Matt Moore <mattmoor@vmware.com>

Signed-off-by: Andrea Frittoli <andrea.frittoli@uk.ibm.com>
pritidesai pushed a commit to pritidesai/pipeline that referenced this issue Nov 17, 2020
The "webhook" name is too generic and it creates conflicts on leases
when other services (like triggers) that use leader election run in
same namespace but with different configuration.

Fixes tektoncd#3529

Co-authored-by: Matt Moore <mattmoor@vmware.com>

Signed-off-by: Andrea Frittoli <andrea.frittoli@uk.ibm.com>
(cherry picked from commit 747f4ba)
tekton-robot pushed a commit that referenced this issue Nov 17, 2020
The "webhook" name is too generic and it creates conflicts on leases
when other services (like triggers) that use leader election run in
same namespace but with different configuration.

Fixes #3529

Co-authored-by: Matt Moore <mattmoor@vmware.com>

Signed-off-by: Andrea Frittoli <andrea.frittoli@uk.ibm.com>
(cherry picked from commit 747f4ba)