More than one defaulting webhook controller won't work in the same binary with HA enabled #1979
Maybe the simplest way to fix this is to use the name of the MutatingWebhookConfiguration, which is a parameter of defaulting.NewAdmissionController, for the names of the reconcile queue and the lease as well. Same for the validation and configmaps webhook controllers. cc @mattmoor
Don't do that. 😜 In mink I'm running a defaulting webhook for all of serving, eventing, and tekton as a single endpoint (I used to run a lot more!). Pass all the types to a single defaulting webhook. Is there a legitimate reason to be running multiple defaulting webhooks vs. a single one with all the types...?
I don't see one. Currently in our downstream branch we just copy both MutatingWebhookConfigurations, from the serving repo and the net-istio repo; I guess that is for easier syncing with upstream. If using a single defaulting webhook is the right way to go, I'll do that. Thanks.
Yeah, that's the way things were intended to work: https://github.com/mattmoor/mink/blob/f6f4dc092ce14818642cacf87ce26ac892d5a392/cmd/webhook/defaulting.go#L57 LMK if you need any pointers / help.
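For concreteness, here is a rough sketch of that pattern, loosely modeled on the mink file linked above; the type list, webhook name, and path are illustrative rather than the actual mink contents. The idea is to merge every group's types into one map and register a single defaulting admission controller:

```go
package main

import (
	"context"

	"k8s.io/apimachinery/pkg/runtime/schema"
	servingv1 "knative.dev/serving/pkg/apis/serving/v1"

	"knative.dev/pkg/configmap"
	"knative.dev/pkg/controller"
	"knative.dev/pkg/injection/sharedmain"
	"knative.dev/pkg/signals"
	"knative.dev/pkg/webhook"
	"knative.dev/pkg/webhook/resourcesemantics"
	"knative.dev/pkg/webhook/resourcesemantics/defaulting"
)

// Illustrative per-group type map; in mink this spans serving, eventing, and
// tekton. Each value just needs to implement resourcesemantics.GenericCRD.
var servingTypes = map[schema.GroupVersionKind]resourcesemantics.GenericCRD{
	servingv1.SchemeGroupVersion.WithKind("Service"):       &servingv1.Service{},
	servingv1.SchemeGroupVersion.WithKind("Configuration"): &servingv1.Configuration{},
	servingv1.SchemeGroupVersion.WithKind("Route"):         &servingv1.Route{},
	servingv1.SchemeGroupVersion.WithKind("Revision"):      &servingv1.Revision{},
}

// newDefaultingAdmissionController registers ONE defaulting webhook for all of
// the merged types, so there is a single reconcile queue and a single lease.
func newDefaultingAdmissionController(ctx context.Context, _ configmap.Watcher) *controller.Impl {
	types := map[schema.GroupVersionKind]resourcesemantics.GenericCRD{}
	for _, group := range []map[schema.GroupVersionKind]resourcesemantics.GenericCRD{
		servingTypes, // add eventing, tekton, net-istio, ... type maps here
	} {
		for gvk, crd := range group {
			types[gvk] = crd
		}
	}
	return defaulting.NewAdmissionController(ctx,
		"webhook.knative.dev", // the single MutatingWebhookConfiguration (illustrative name)
		"/defaulting",         // path the webhook serves on
		types,                 // all resources to default
		func(ctx context.Context) context.Context { return ctx },
		true, // disallow unknown fields
	)
}

func main() {
	ctx := webhook.WithOptions(signals.NewContext(), webhook.Options{
		ServiceName: "webhook",
		Port:        8443,
		SecretName:  "webhook-certs",
	})
	sharedmain.WebhookMainWithContext(ctx, "webhook", newDefaultingAdmissionController)
}
```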
It turns out that https://github.com/knative-sandbox/net-istio/blob/master/config/500-mutating-webhook.yaml has a matchExpressions selector while https://github.com/knative/serving/blob/master/config/core/webhooks/defaulting.yaml doesn't, so they can't simply be merged.
Is 'the webhook may not be running on cluster but needs to negotiate with other clusters for ownership' a good reason? In an advanced case where the user does not want their on-cluster control plane availability to be tied to their own data plane, I could see having multiple webhooks (some with different failure domains) that would all need to negotiate ~independently. I could see this popping up if a binary was responsible for multiple clusters and a different set of resources in each of them.
Talked with @mattmoor in Slack; there are a few solutions:
General Observations/Context
Comments on the Solutions:
I interpret this as there should only be a single registered webhook that uses
I'm not a fan of this - is the goal to avoid deploying two webhooks?
I talked about roughly this here: #1140 (comment). This could work with multiple webhook registrations per resource, and with Markus's change certain selectors won't be reconciled away. Though, to my third point, should we continue to have this inconsistency between composing webhooks and composing controllers? Right now we compose webhooks like Matt mentioned here: #1979 (comment), while controllers are composed by passing their constructors. There are some differences though:
For 3) from: #1989 (comment)
We've accidentally changed lease names before and leaked them - I think some thrashing of controllers would occur while the rollout happens. I haven't tested it.
Not sure what you mean - these admission versions are just the webhook envelope between the k8s api server <=> webhook. They're currently v1 & v1beta1.
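To illustrate what is meant by the envelope (a small aside, not from the thread): the admission versions refer to the AdmissionReview request/response wrapper the API server uses when calling the webhook, not to the versions of the resources being defaulted. A tiny sketch using the upstream Go types:

```go
package main

import (
	"fmt"

	admissionv1 "k8s.io/api/admission/v1"
	admissionv1beta1 "k8s.io/api/admission/v1beta1"
)

func main() {
	// The AdmissionReview envelope comes in two API versions today; the
	// resource payload inside the envelope is whatever GVK the webhook
	// registered for.
	fmt.Println(admissionv1.SchemeGroupVersion)      // admission.k8s.io/v1
	fmt.Println(admissionv1beta1.SchemeGroupVersion) // admission.k8s.io/v1beta1
}
```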
This issue is stale because it has been open for 90 days with no activity.
Expected Behavior
I can set up more than one defaulting webhook controller in the same binary with leader election (HA) enabled. The same applies to the configmaps and validation webhook controllers.
Actual Behavior
If more than one defaulting webhook controller is set up in the same binary with leader election (HA) enabled, only one of the controllers reconciles as expected.
Steps to Reproduce the Problem
Add another defaulting.NewAdmissionController for the Istio MutatingWebhookConfiguration to a binary that already has one, with leader election (HA) enabled.

Additional Info
This is because all defaulting webhook controllers share the same lease. The lease name is defaultingwebhook: https://github.com/knative/pkg/blob/master/webhook/resourcesemantics/defaulting/controller.go#L77. However, when leader election is set up, each controller uses a different resource lock ID: https://github.com/knative/pkg/blob/master/leaderelection/context.go#L114. So only one of them will actually get set up successfully.

In the Knative Serving autoscaler, the KPA controller and the metrics controller share the same lease as well, but they use the same resource lock ID: https://github.com/knative/serving/blob/master/cmd/autoscaler/main.go#L269. The callbacks of those two controllers are triggered when they start leading the lease.
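For concreteness, a minimal sketch of a setup that runs into this; the constructor helper, the net-istio webhook name, the paths, and the type maps are illustrative, not taken from the real repos:

```go
package main

import (
	"context"

	"k8s.io/apimachinery/pkg/runtime/schema"
	servingv1 "knative.dev/serving/pkg/apis/serving/v1"

	"knative.dev/pkg/configmap"
	"knative.dev/pkg/controller"
	"knative.dev/pkg/injection/sharedmain"
	"knative.dev/pkg/signals"
	"knative.dev/pkg/webhook"
	"knative.dev/pkg/webhook/resourcesemantics"
	"knative.dev/pkg/webhook/resourcesemantics/defaulting"
)

// newDefaulting returns a constructor for a defaulting admission controller
// that reconciles the MutatingWebhookConfiguration with the given name and
// serves on its own path.
func newDefaulting(name, path string, types map[schema.GroupVersionKind]resourcesemantics.GenericCRD) func(context.Context, configmap.Watcher) *controller.Impl {
	return func(ctx context.Context, _ configmap.Watcher) *controller.Impl {
		return defaulting.NewAdmissionController(ctx,
			name, // MutatingWebhookConfiguration to reconcile
			path, // path this webhook serves on
			types,
			func(ctx context.Context) context.Context { return ctx },
			true, // disallow unknown fields
		)
	}
}

func main() {
	ctx := webhook.WithOptions(signals.NewContext(), webhook.Options{
		ServiceName: "webhook",
		Port:        8443,
		SecretName:  "webhook-certs",
	})

	servingTypes := map[schema.GroupVersionKind]resourcesemantics.GenericCRD{
		servingv1.SchemeGroupVersion.WithKind("Service"): &servingv1.Service{},
	}
	// Placeholder for whatever types the second webhook defaults.
	istioTypes := map[schema.GroupVersionKind]resourcesemantics.GenericCRD{}

	// With HA enabled, both controllers build the same fixed work-queue name,
	// so they compete for the same "defaultingwebhook" lease while using
	// different resource lock IDs, and only one of them ends up reconciling
	// as expected.
	sharedmain.WebhookMainWithContext(ctx, "webhook",
		newDefaulting("webhook.serving.knative.dev", "/defaulting", servingTypes),
		newDefaulting("webhook.istio.networking.internal.knative.dev", "/defaulting-istio", istioTypes),
	)
}
```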