HA cluster will not join second control-plane node when aws-encryption-provider is running #3019

@scottdhowell3

/kind bug

What steps did you take and what happened:

  1. Add the aws-encryption-config.yaml static pod manifest as a base64-encoded file to KubeadmConfigSpec
  2. Add encryption-config.yaml as a base64-encoded file to KubeadmConfigSpec
  3. Create a KMS key in AWS
  4. Spin up a cluster with 3 control-plane nodes and 1 worker node using cluster-api-provider-aws
  5. The first control-plane node comes up correctly and is visible with kubectl
  6. The worker node joins the initial control-plane node in the cluster
  7. The second control-plane node fails to join; its kubelet logs this error:
May 05 15:34:12 ip-10-90-20-215.ec2.internal systemd[1]: Starting kubelet: The Kubernetes Node Agent...
May 05 15:34:12 ip-10-90-20-215.ec2.internal kubelet[4583]: F0505 15:34:12.917439    4583 server.go:198] failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file "/var/lib/kubele
May 05 15:34:12 ip-10-90-20-215.ec2.internal systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
May 05 15:34:12 ip-10-90-20-215.ec2.internal systemd[1]: Unit kubelet.service entered failed state.
May 05 15:34:12 ip-10-90-20-215.ec2.internal systemd[1]: kubelet.service failed.
May 05 15:34:23 ip-10-90-20-215.ec2.internal systemd[1]: kubelet.service holdoff time over, scheduling restart.
May 05 15:34:23 ip-10-90-20-215.ec2.internal systemd[1]: Started kubelet: The Kubernetes Node Agent.
May 05 15:34:23 ip-10-90-20-215.ec2.internal systemd[1]: Starting kubelet: The Kubernetes Node Agent...
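
For reference, steps 1 and 2 (embedding a file as base64 in KubeadmConfigSpec) can be sketched roughly as follows. This is a minimal illustration, not the exact tooling used for this cluster: the helper name `kubeadm_file_entry`, the target path `/etc/kubernetes/encryption-config.yaml`, and the owner and permissions values are all assumptions. The dict keys mirror the `files` entries of KubeadmConfigSpec (`path`, `owner`, `permissions`, `encoding`, `content`).

```python
import base64

# The encryption configuration from this issue, kept as plain text.
ENCRYPTION_CONFIG = """\
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
    - secrets
    providers:
    - kms:
        name: aws-encryption-provider
        endpoint: unix:///var/run/kmsplugin/socket.sock
        cachesize: 1000
        timeout: 3s
    - identity: {}
"""


def kubeadm_file_entry(path, content, owner="root:root", permissions="0600"):
    """Build a dict shaped like one item of KubeadmConfigSpec.files.

    Hypothetical helper: owner and permissions defaults are assumptions.
    """
    encoded = base64.b64encode(content.encode("utf-8")).decode("ascii")
    return {
        "path": path,
        "owner": owner,
        "permissions": permissions,
        "encoding": "base64",  # tells the bootstrap provider to decode content
        "content": encoded,
    }


# Assumed destination path on the node; adjust to wherever the API server
# is configured to read its encryption provider config.
entry = kubeadm_file_entry("/etc/kubernetes/encryption-config.yaml", ENCRYPTION_CONFIG)
print(entry["content"])
```

The printed string is what would go into the `content` field of the corresponding `files` entry.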

What did you expect to happen:
We expected the second control-plane node to join the cluster along with the third one.

Anything else you would like to add:
encryption-config.yaml for api-server

apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
    - secrets
    providers:
    - kms:
        name: aws-encryption-provider
        endpoint: unix:///var/run/kmsplugin/socket.sock
        cachesize: 1000
        timeout: 3s
    - identity: {}

aws-encryption-provider.yaml

apiVersion: v1
kind: Pod
metadata:
  name: aws-encryption-provider
  namespace: kube-system
spec:
  containers:
  - image: payitadmin/aws-encryption-provider:latest
    name: aws-encryption-provider
    command:
    - /aws-encryption-provider
    - --key=arn:aws:kms:<account_specific_arn>
    - --region=us-east-1
    - --listen=/var/run/kmsplugin/socket.sock
    - --health-port=:8083
    ports:
    - containerPort: 8083
      protocol: TCP
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8083
    volumeMounts:
    - mountPath: /var/run/kmsplugin
      name: var-run-kmsplugin
  hostNetwork: true
  volumes:
  - name: var-run-kmsplugin
    hostPath:
      path: /var/run/kmsplugin
      type: DirectoryOrCreate
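
One check that may help when debugging on a control-plane node: the `endpoint` in encryption-config.yaml and the `--listen` path of the static pod must point at the same socket, and that socket has to exist before the API server can reach the KMS plugin. The sketch below is only a diagnostic aid under that assumption, not a confirmed cause of this bug; the function names are hypothetical.

```python
import os
import stat
from urllib.parse import urlparse

# Values copied from the two manifests in this issue.
ENDPOINT = "unix:///var/run/kmsplugin/socket.sock"
LISTEN = "/var/run/kmsplugin/socket.sock"


def socket_path_from_endpoint(endpoint):
    """Turn a unix:// endpoint URI into a filesystem path."""
    parsed = urlparse(endpoint)
    if parsed.scheme != "unix":
        raise ValueError("expected a unix:// endpoint, got %r" % endpoint)
    return parsed.path


def is_unix_socket(path):
    """Return True if path exists and is a Unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing path or insufficient permissions: treat as not present.
        return False


path = socket_path_from_endpoint(ENDPOINT)
print("endpoint and --listen agree:", path == LISTEN)
print("socket present on this host:", is_unix_socket(path))
```

Run it on the node that fails to join; on a machine without the plugin running, the second line reports that the socket is not present.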

Environment:

  • Cluster-api-provider-aws version: 0.3.3
  • Kubernetes version (use kubectl version): 1.17.5
  • OS (e.g. from /etc/os-release): Amazon Linux 2

Metadata

Labels

  • area/bootstrap: Issues or PRs related to bootstrap providers
  • kind/bug: Categorizes issue or PR as related to a bug.
  • lifecycle/active: Indicates that an issue or PR is actively being worked on by a contributor.
  • priority/awaiting-more-evidence: Lowest priority. Possibly useful, but not yet enough support to actually get it done.
  • priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
