[operator] Manage Virtual Garden kube-controller-manager Deployment (gardener#7931)

* Extend `operatorv1alpha1.VirtualCluster` with API for kube-controller-manager config

* Defaulting

* [make generate]

Without the `allowDangerousTypes=true` option, `controller-gen` yields the following error:

```
/go/src/github.com/gardener/gardener/pkg/apis/core/v1beta1/types_shoot.go:928:13: found float, the usage of which is highly discouraged, as support for them varies across languages. Please consider serializing your float as string instead. If you are really sure you want to use them, re-run with crd:allowDangerousTypes=true
```

* Adapt documentation

* Make component instantiation reusable

* Deploy/destroy virtual cluster `kubecontrollermanager` component

* Implement `Destroy` of `kubecontrollermanager` component

Otherwise, sane deletion of `Garden` will not work :)

* Adapt integration/e2e tests

* Adapt garden credentials rotation e2e test

* Address PR review feedback
rfranzke authored May 16, 2023
1 parent 80d69e1 commit a57427e
Showing 45 changed files with 913 additions and 203 deletions.
72 changes: 72 additions & 0 deletions charts/gardener/operator/templates/customresouredefintion.yaml
@@ -765,6 +765,78 @@ spec:
                                type: array
                            type: object
                        type: object
                      kubeControllerManager:
                        description: KubeControllerManager contains configuration
                          settings for the kube-controller-manager.
                        properties:
                          certificateSigningDuration:
                            default: 48h
                            description: CertificateSigningDuration is the maximum
                              length of duration signed certificates will be given.
                              Individual CSRs may request shorter certs by setting
                              `spec.expirationSeconds`.
                            pattern: ^([0-9]+(\.[0-9]+)?(ns|us|µs|ms|s|m|h))+$
                            type: string
                          featureGates:
                            additionalProperties:
                              type: boolean
                            description: FeatureGates contains information about enabled
                              feature gates.
                            type: object
                          horizontalPodAutoscaler:
                            description: HorizontalPodAutoscalerConfig contains horizontal
                              pod autoscaler configuration settings for the kube-controller-manager.
                            properties:
                              cpuInitializationPeriod:
                                description: The period after which a ready pod transition
                                  is considered to be the first.
                                type: string
                              downscaleStabilization:
                                description: The configurable window at which the
                                  controller will choose the highest recommendation
                                  for autoscaling.
                                type: string
                              initialReadinessDelay:
                                description: The configurable period at which the
                                  horizontal pod autoscaler considers a Pod “not yet
                                  ready” given that it’s unready and it has transitioned
                                  to unready during that time.
                                type: string
                              syncPeriod:
                                description: The period for syncing the number of
                                  pods in horizontal pod autoscaler.
                                type: string
                              tolerance:
                                description: The minimum change (from 1.0) in the
                                  desired-to-actual metrics ratio for the horizontal
                                  pod autoscaler to consider scaling.
                                type: number
                            type: object
                          nodeCIDRMaskSize:
                            description: NodeCIDRMaskSize defines the mask size for
                              node cidr in cluster (default is 24). This field is
                              immutable.
                            format: int32
                            type: integer
                          nodeMonitorGracePeriod:
                            description: NodeMonitorGracePeriod defines the grace
                              period before an unresponsive node is marked unhealthy.
                            type: string
                          podEvictionTimeout:
                            description: "PodEvictionTimeout defines the grace period
                              for deleting pods on failed nodes. Defaults to 2m. \n
                              Deprecated: The corresponding kube-controller-manager
                              flag `--pod-eviction-timeout` is deprecated in favor
                              of the kube-apiserver flags `--default-not-ready-toleration-seconds`
                              and `--default-unreachable-toleration-seconds`. The
                              `--pod-eviction-timeout` flag does not have effect when
                              the taint besed eviction is enabled. The taint based
                              eviction is beta (enabled by default) since Kubernetes
                              1.13 and GA since Kubernetes 1.18. Hence, instead of
                              setting this field, set the `spec.kubernetes.kubeAPIServer.defaultNotReadyTolerationSeconds`
                              and `spec.kubernetes.kubeAPIServer.defaultUnreachableTolerationSeconds`."
                            type: string
                        type: object
                      version:
                        description: Version is the semantic Kubernetes version to
                          use for the virtual garden cluster.
1 change: 1 addition & 0 deletions charts/gardener/operator/templates/role.yaml
@@ -128,6 +128,7 @@ rules:
- virtual-garden-etcd-events
- virtual-garden-etcd-main
- virtual-garden-kube-apiserver
- virtual-garden-kube-controller-manager
verbs:
- delete
- patch
63 changes: 63 additions & 0 deletions docs/api-reference/operator.md
@@ -943,6 +943,55 @@ SNI
</tr>
</tbody>
</table>
<h3 id="operator.gardener.cloud/v1alpha1.KubeControllerManagerConfig">KubeControllerManagerConfig
</h3>
<p>
(<em>Appears on:</em>
<a href="#operator.gardener.cloud/v1alpha1.Kubernetes">Kubernetes</a>)
</p>
<p>
<p>KubeControllerManagerConfig contains configuration settings for the kube-controller-manager.</p>
</p>
<table>
<thead>
<tr>
<th>Field</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<code>KubeControllerManagerConfig</code></br>
<em>
github.com/gardener/gardener/pkg/apis/core/v1beta1.KubeControllerManagerConfig
</em>
</td>
<td>
<p>
(Members of <code>KubeControllerManagerConfig</code> are embedded into this type.)
</p>
<em>(Optional)</em>
<p>KubeControllerManagerConfig contains all configuration values not specific to the virtual garden cluster.</p>
</td>
</tr>
<tr>
<td>
<code>certificateSigningDuration</code></br>
<em>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.25/#duration-v1-meta">
Kubernetes meta/v1.Duration
</a>
</em>
</td>
<td>
<em>(Optional)</em>
<p>CertificateSigningDuration is the maximum length of duration signed certificates will be given. Individual CSRs
may request shorter certs by setting <code>spec.expirationSeconds</code>.</p>
</td>
</tr>
</tbody>
</table>
<h3 id="operator.gardener.cloud/v1alpha1.Kubernetes">Kubernetes
</h3>
<p>
@@ -977,6 +1026,20 @@ KubeAPIServerConfig
</tr>
<tr>
<td>
<code>kubeControllerManager</code></br>
<em>
<a href="#operator.gardener.cloud/v1alpha1.KubeControllerManagerConfig">
KubeControllerManagerConfig
</a>
</em>
</td>
<td>
<em>(Optional)</em>
<p>KubeControllerManager contains configuration settings for the kube-controller-manager.</p>
</td>
</tr>
<tr>
<td>
<code>version</code></br>
<em>
string
25 changes: 15 additions & 10 deletions docs/concepts/operator.md
@@ -89,11 +89,7 @@ Please refer to [this document](../usage/shoot_credentials_rotation.md#gardener-
 - ETCD encryption key
 - `ServiceAccount` token signing key

-⚠️ Since `kube-controller-manager` is not yet deployed by `gardener-operator`, rotation of static `ServiceAccount` secrets is not supported and must be performed manually after the `Garden` has reached `Prepared` phase before completing the rotation.
-
-⚠️ Rotation of the static kubeconfig (which is enabled unconditionally) is not support for now.
-The reason is that it such static kubeconfig will be disabled without configuration option in the near future.
-Instead, we'll implement an approach similar to the [`adminkubeconfig` subresource on `Shoot`s](../usage/shoot_access.md#shootsadminkubeconfig-subresource) which can be used to retrieve a temporary kubeconfig for the virtual garden cluster.
+⚠️ Rotation of static `ServiceAccount` secrets is not supported since the `kube-controller-manager` does not enable the `serviceaccount-token` controller.

 ## Local Development

@@ -110,8 +106,7 @@ This command sets up a new KinD cluster named `gardener-local` and stores the ku
 > It might be helpful to copy this file to `$HOME/.kube/config`, since you will need to target this KinD cluster multiple times.
 Alternatively, make sure to set your `KUBECONFIG` environment variable to `./example/gardener-local/kind/operator/kubeconfig` for all future steps via `export KUBECONFIG=example/gardener-local/kind/operator/kubeconfig`.

-All of the following steps assume that you are using this kubeconfig.
-
+All the following steps assume that you are using this kubeconfig.

 ### Setting Up Gardener Operator

@@ -177,10 +172,12 @@ EOF
 To access the virtual garden, you can acquire a `kubeconfig` by

 ```shell
-kubectl -n garden get secret -l name=user-kubeconfig -o jsonpath={..data.kubeconfig} | base64 -d > /tmp/virtual-garden-kubeconfig
+kubectl -n garden get secret gardener -o jsonpath={.data.kubeconfig} | base64 -d > /tmp/virtual-garden-kubeconfig
 kubectl --kubeconfig /tmp/virtual-garden-kubeconfig get namespaces
 ```

+Note that this kubeconfig uses a token that has validity of `12h` only, hence it might expire and causing you to re-download the kubeconfig.
+
 ### Deleting the `Garden`

 ```shell
@@ -237,6 +234,8 @@ The virtual garden control plane components are:
 - `virtual-garden-etcd-main`
 - `virtual-garden-etcd-events`
 - `virtual-garden-kube-apiserver`
+- `virtual-garden-kube-controller-manager`
+- `virtual-garden-gardener-resource-manager`

 If the `.spec.virtualCluster.controlPlane.highAvailability={}` is set then these components will be deployed in a "highly available" mode.
 For ETCD, this means that there will be 3 replicas each.
@@ -246,10 +245,16 @@ The `gardener-resource-manager`'s [HighAvailabilityConfig webhook](resource-mana
 > If once set, removing `.spec.virtualCluster.controlPlane.highAvailability` again is not supported.
 The `virtual-garden-kube-apiserver` `Deployment` is exposed via a `Service` of type `LoadBalancer` with the same name.
-In the future, we might switch to exposing it via Istio, similar to how the `kube-apiservers` of shoot clusters are exposed.
+In the future, we will switch to exposing it via Istio, similar to how the `kube-apiservers` of shoot clusters are exposed.

 Similar to the `Shoot` API, the version of the virtual garden cluster is controlled via `.spec.virtualCluster.kubernetes.version`.
-Likewise, specific configuration for the control plane components can be provided in the same section, e.g. via `.spec.virtualCluster.kubernetes.kubeAPIServer` for the `kube-apiserver`.
+Likewise, specific configuration for the control plane components can be provided in the same section, e.g. via `.spec.virtualCluster.kubernetes.kubeAPIServer` for the `kube-apiserver` or `.spec.virtualCluster.kubernetes.kubeControllerManager` for the `kube-controller-manager`.
+
+The `kube-controller-manager` only runs a very few controllers that are necessary in the scenario of the virtual garden.
+Most prominently, **the `serviceaccount-token` controller is unconditionally disabled**.
+Hence, the usage of static `ServiceAccount` secrets is not supported generally.
+Instead, the [`TokenRequest` API](https://kubernetes.io/docs/reference/kubernetes-api/authentication-resources/token-request-v1/) should be used.
+Third-party components that need to communicate with the virtual cluster can leverage the [`gardener-resource-manager`'s `TokenRequestor` controller](resource-manager.md#tokenrequestor-controller) and the generic kubeconfig, just like it works for `Shoot`s.

 For the virtual cluster, it is essential to provide a DNS domain via `.spec.virtualCluster.dns.domain`.
 **The respective DNS record is not managed by `gardener-operator` and should be manually created and pointed to the load balancer IP of the `virtual-garden-kube-apiserver` `Service`.**
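
Editorial note, not part of the commit: the documentation change above recommends the `TokenRequest` API instead of static `ServiceAccount` secrets. As an illustration only, the sketch below builds the request path and JSON body for such a call with the Go standard library. The namespace `garden`, the `ServiceAccount` name `my-component`, and all helper and type names are assumptions. A real client would more likely use client-go and would have to authenticate against the virtual garden `kube-apiserver`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tokenRequest mirrors only the small part of the authentication.k8s.io/v1
// TokenRequest object that this sketch needs (hypothetical local type).
type tokenRequest struct {
	APIVersion string           `json:"apiVersion"`
	Kind       string           `json:"kind"`
	Spec       tokenRequestSpec `json:"spec"`
}

type tokenRequestSpec struct {
	Audiences         []string `json:"audiences,omitempty"`
	ExpirationSeconds int64    `json:"expirationSeconds"`
}

// tokenRequestPath returns the path of the "token" subresource of a
// ServiceAccount; a TokenRequest is created by POSTing to this path.
func tokenRequestPath(namespace, serviceAccount string) string {
	return fmt.Sprintf("/api/v1/namespaces/%s/serviceaccounts/%s/token", namespace, serviceAccount)
}

// newTokenRequestBody serializes a TokenRequest asking for a token that is
// valid for the given number of seconds.
func newTokenRequestBody(expirationSeconds int64) ([]byte, error) {
	return json.Marshal(tokenRequest{
		APIVersion: "authentication.k8s.io/v1",
		Kind:       "TokenRequest",
		Spec:       tokenRequestSpec{ExpirationSeconds: expirationSeconds},
	})
}

func main() {
	body, err := newTokenRequestBody(3600)
	if err != nil {
		panic(err)
	}
	// "garden" and "my-component" are placeholder names.
	fmt.Println("POST", tokenRequestPath("garden", "my-component"))
	fmt.Println(string(body))
}
```

The response to such a request carries the short-lived token in `status.token`, which is why no static secret has to exist for the `ServiceAccount`.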
4 changes: 2 additions & 2 deletions docs/development/priority-classes.md
@@ -23,8 +23,8 @@ When using the `gardener-operator` for managing the garden runtime and virtual c
 |---------------------------------- |-----------|-------------------------------------------------------------------------------------------|
 | `gardener-garden-system-critical` | 999999550 | `gardener-operator`, `gardener-resource-manager`, `istio` |
 | `gardener-garden-system-500` | 999999500 | `virtual-garden-etcd-events`, `virtual-garden-etcd-main`, `virtual-garden-kube-apiserver` |
-| `gardener-garden-system-400` | 999999400 | |
-| `gardener-garden-system-300` | 999999300 | `vpa-admission-controller`, `etcd-druid` |
+| `gardener-garden-system-400` | 999999400 | `virtual-garden-gardener-resource-manager` |
+| `gardener-garden-system-300` | 999999300 | `virtual-garden-kube-controller-manager`, `vpa-admission-controller`, `etcd-druid` |
 | `gardener-garden-system-200` | 999999200 | `vpa-recommender`, `vpa-updater`, `hvpa-controller` |
 | `gardener-garden-system-100` | 999999100 | `kube-state-metrics` |

72 changes: 72 additions & 0 deletions example/operator/10-crd-operator.gardener.cloud_gardens.yaml
@@ -765,6 +765,78 @@ spec:
                                type: array
                            type: object
                        type: object
                      kubeControllerManager:
                        description: KubeControllerManager contains configuration
                          settings for the kube-controller-manager.
                        properties:
                          certificateSigningDuration:
                            default: 48h
                            description: CertificateSigningDuration is the maximum
                              length of duration signed certificates will be given.
                              Individual CSRs may request shorter certs by setting
                              `spec.expirationSeconds`.
                            pattern: ^([0-9]+(\.[0-9]+)?(ns|us|µs|ms|s|m|h))+$
                            type: string
                          featureGates:
                            additionalProperties:
                              type: boolean
                            description: FeatureGates contains information about enabled
                              feature gates.
                            type: object
                          horizontalPodAutoscaler:
                            description: HorizontalPodAutoscalerConfig contains horizontal
                              pod autoscaler configuration settings for the kube-controller-manager.
                            properties:
                              cpuInitializationPeriod:
                                description: The period after which a ready pod transition
                                  is considered to be the first.
                                type: string
                              downscaleStabilization:
                                description: The configurable window at which the
                                  controller will choose the highest recommendation
                                  for autoscaling.
                                type: string
                              initialReadinessDelay:
                                description: The configurable period at which the
                                  horizontal pod autoscaler considers a Pod “not yet
                                  ready” given that it’s unready and it has transitioned
                                  to unready during that time.
                                type: string
                              syncPeriod:
                                description: The period for syncing the number of
                                  pods in horizontal pod autoscaler.
                                type: string
                              tolerance:
                                description: The minimum change (from 1.0) in the
                                  desired-to-actual metrics ratio for the horizontal
                                  pod autoscaler to consider scaling.
                                type: number
                            type: object
                          nodeCIDRMaskSize:
                            description: NodeCIDRMaskSize defines the mask size for
                              node cidr in cluster (default is 24). This field is
                              immutable.
                            format: int32
                            type: integer
                          nodeMonitorGracePeriod:
                            description: NodeMonitorGracePeriod defines the grace
                              period before an unresponsive node is marked unhealthy.
                            type: string
                          podEvictionTimeout:
                            description: "PodEvictionTimeout defines the grace period
                              for deleting pods on failed nodes. Defaults to 2m. \n
                              Deprecated: The corresponding kube-controller-manager
                              flag `--pod-eviction-timeout` is deprecated in favor
                              of the kube-apiserver flags `--default-not-ready-toleration-seconds`
                              and `--default-unreachable-toleration-seconds`. The
                              `--pod-eviction-timeout` flag does not have effect when
                              the taint besed eviction is enabled. The taint based
                              eviction is beta (enabled by default) since Kubernetes
                              1.13 and GA since Kubernetes 1.18. Hence, instead of
                              setting this field, set the `spec.kubernetes.kubeAPIServer.defaultNotReadyTolerationSeconds`
                              and `spec.kubernetes.kubeAPIServer.defaultUnreachableTolerationSeconds`."
                            type: string
                        type: object
                      version:
                        description: Version is the semantic Kubernetes version to
                          use for the virtual garden cluster.
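
Editorial note, not part of the commit: the `tolerance` field in the schema above is described as the minimum change from 1.0 in the metrics ratio before the horizontal pod autoscaler scales. A minimal sketch of how such a tolerance is applied, following the scaling rule in the Kubernetes HorizontalPodAutoscaler documentation (desired replicas = ceil(current replicas × current metric / target metric), skipped when the ratio is within the tolerance of 1.0). The function `desiredReplicas` and its parameters are hypothetical. This is not the kube-controller-manager implementation.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas (hypothetical helper) applies the documented HPA rule:
// keep the current replica count while the metric ratio stays within the
// tolerance band around 1.0, otherwise scale proportionally and round up.
func desiredReplicas(current int32, currentMetric, targetMetric, tolerance float64) int32 {
	ratio := currentMetric / targetMetric
	if math.Abs(ratio-1.0) <= tolerance {
		return current
	}
	return int32(math.Ceil(float64(current) * ratio))
}

func main() {
	// Ratio 1.05 is inside a tolerance of 0.1, so nothing changes.
	fmt.Println(desiredReplicas(4, 105, 100, 0.1)) // 4
	// Ratio 1.5 is outside the tolerance, so 4 replicas become 6.
	fmt.Println(desiredReplicas(4, 150, 100, 0.1)) // 6
}
```

A larger tolerance makes the autoscaler less sensitive to small metric fluctuations, at the cost of reacting later to real load changes.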
4 changes: 4 additions & 0 deletions example/operator/20-garden.yaml
@@ -135,6 +135,10 @@ spec:
    #   resourcesToStoreInETCDEvents:
    #   - group: networking.k8s.io
    #     resources: networkpolicies
    # kubeControllerManager:
    #   featureGates:
    #     SomeKubernetesFeature: true
    #   certificateSigningDuration: 48h
    maintenance:
      timeWindow:
        begin: 220000+0100
2 changes: 1 addition & 1 deletion example/operator/doc.go
@@ -12,7 +12,7 @@
 // See the License for the specific language governing permissions and
 // limitations under the License.

-//go:generate ../../hack/generate-crds.sh 10-crd- operator.gardener.cloud
+//go:generate ../../hack/generate-crds.sh 10-crd- -allow-dangerous-types operator.gardener.cloud
 //go:generate cp 10-crd-operator.gardener.cloud_gardens.yaml ../../charts/gardener/operator/templates/customresouredefintion.yaml

 // Package operator contains example manifests for working on operator.