Optional Create before destroy. Add Launch Template and related features. #31

Nuru · 2020-09-04T06:29:53Z

what

Implement "create before destroy" for zero downtime node group updates. This is optional and off by default, because on first use it will cause any existing node groups created with this module to be destroyed and then replaced, causing the same kind of outage this feature will prevent after it is activated.

Because node groups must have unique names within a cluster, creating a new node group before destroying the old one requires node groups to have random names. This is implemented by adding a 1-word random pet name to the end of the static node group name. Turning this on (or turning it off after it has been on) will cause previously created node groups to be replaced because of the change in name.

Add features previously missing here but present in terraform-aws-eks-workers, to the extent supported by AWS, such as

Set nodes to launch with Kubernetes taints
Specify launch template (not all features supported by AWS, see "Launch template - Prohibited" in AWS documentation)
Specify AMI for nodes
Arbitrary bootstrap.sh options
Arbitrary kubelet options
"Before join" and "after join" scripting

why

Many kinds of node group changes require Terraform to replace the existing node group with a new one. The default Terraform behavior is to delete the old resource before creating the new one, since many resources (such as node group) require unique names, so you cannot create the new resource while the old one exists. However, this results in the node group being completely destroyed, and therefore offline for several minutes, which is usually an unacceptable outage. Now you can avoid this by setting create_before_destroy = true.
Useful features previously unavailable, bring closer to feature parity with terraform-aws-eks-workers.

caveats

When using create before destroy

We cannot automatically detect when the node_group will be destroyed and generate a new name for it. Instead, we have tried to cause a new name to be generated when anything changes that would cause a the node group to be destroyed. This may not be perfect. If the name changes unnecessarily, it will trigger a node group replacement, which should be tolerable. If the name fails to change when it needs to, the Terraform apply will fail with an error about the resource already existing. Please let us know what change we missed so we can update the module. Meanwhile, you can get around this by manually "tainting" the random_pet as explained below.

For a short period of time you will be running 2 node groups.

There may still be service outages related to pods and EBS volumes transferring from the old node group to the new one, though this should generally behave like the cluster rapidly scaling up and rapidly scaling back down. If you have issues with autoscaling, such as running single replicas with minAvailable: 25% (which rounds up to minAvailable: 1), preventing the pod from being drained from a node, you may have issues with node groups being replaced.
Your AWS service quotas need to be large enough to run 2 sets of node groups at the same time. If you do not have enough quota for that, launching the new node group will fail. If the new node group launch fails, you will need to manually taint the random_pet resource because while Terraform tries to replace the tainted new node group, it will try to do so with the same name (and fail) unless you also taint random_pet. Assuming you invoked the module as module "eks_node_group", you would taint random_pet with

terraform taint 'module.eks_node_group.random_pet.cbd[0]'

Using new features

Many of the new features of this module rely on new AWS features, and it is unclear to what extent they actually work.

It appears that it is still not possible to tag the Auto Scaling Group or Launch Template with extra tags for the Kubernetes Cluster Autoscaler.
It appears that it is still not possible to propagate tags to elastic GPUs or spot instance requests.
There may be other issues similarly beyond our control.
There are many new features in this module and it has not be comprehensively tested, so be cautious and test your use cases out on non-critical clusters before moving this into production.

Most of the new features require this module to create a Launch Template, and of course you can now supply your own launch template (referenced by name). There is some overlap between settings that can be made directly on an EKS managed node group and some that can be made in a launch template. This results in settings being allowed in one place and not in the other: these limitations and prohibitions are detailed in the AWS documentation. This module attempts to resolve these differences in many cases, but some limitations remain:

Support for remote access using SSH is not supported when using a launch template created by this module. Correctly configuring the launch template for remote access is tricky because it interferes with automatic configuration of access by the Kubernetes control plane. We do not need it and cannot test it at this time, so we do not support it, but if you need it, you can create your own launch template that has the desired configuration and leave the ec2_ssh_key setting null.
If you supply the Launch Template, this module requires that the Launch Template specify the AMI Image ID to use. This requirement could be relaxed in the future if we find demand for it.
In general, this module assumes you are using an Amazon Linux 2 AMI, and supports selecting the AMI by Kubernetes version or AMI release version. If you are using some other AMI that does not support Amazon's bootstrap.sh, most of the new features will not work. You will need to implement them yourself on your AMI. You can provide arbitrary (Base64 encoded) User Data to your AMI via userdata_override.
No support for spot instances specified by launch template (EKS limitation).
No support for shutdown behavior or "Stop - Hibernate" behavior in launch template (EKS limitation).
No support for IAM instance profile or Subnets via Launch Template (EKS limitation). You can still supply subnets via subnet_ids and the module will apply the via the node group configuration.

references

Many of the new features are made possible by EKS adding support for launch templates.

New AWS support for Launch Templates and custom AMIs in EKS managed node groups: blog post
AWS Docs: EKS Launch template support
Request for Node Groups to support ASG tagging: [EKS] [request]: Nodegroup should support tagging ASGs aws/containers-roadmap#608
Special note from Kubernetes Cluster Autoscaler on labeling GPU instances

Nuru · 2020-09-04T18:50:29Z

/test all

Nuru · 2020-09-04T19:42:52Z

/terraform-fmt

Nuru · 2020-09-04T22:24:49Z

/test all

Nuru · 2020-09-05T00:58:44Z

/test all

Nuru · 2020-09-05T01:29:11Z

/test all

Nuru · 2020-09-05T07:26:06Z

/test all

Nuru · 2020-09-05T08:11:25Z

/test all

osterman · 2020-09-05T17:52:32Z

userdata.tf

+
+  need_userdata = (var.userdata_override == null) && (length(local.userdata_vars.before_cluster_joining_userdata) > 0) || local.need_bootstrap
+
+  userdata = local.need_userdata ? base64encode(templatefile("${path.module}/userdata.tpl", merge(local.userdata_vars, local.cluster_data))) : var.userdata_override


I think we should instead use https://registry.terraform.io/providers/hashicorp/cloudinit/latest/docs/data-sources/cloudinit_config that is designed for things like user_data.

danjbh · 2020-09-07T02:03:40Z

Definitely lots of complexity, but understandably so -- I was able to successfully test and confirm some of the functionality. Looks good to me!

aknysh · 2020-09-07T16:44:54Z

ami.tf

+  // 3. data.eks_cluster.this.kubernetes_version
+  need_cluster_kubernetes_version = local.need_ami_id && length(compact([var.ami_release_version, var.kubernetes_version])) == 0
+
+  ami_kubernetes_version = local.need_ami_id ? (local.need_cluster_kubernetes_version ? data.aws_eks_cluster.this[0].version :


try(data.aws_eks_cluster.this[0].version, "")

otherwise it will fail when enabled=false

aknysh · 2020-09-07T16:48:10Z

main.tf

+  launch_template_version = local.use_launch_template ? (
+    length(local.configured_launch_template_version) > 0 ? local.configured_launch_template_version :
+    (
+      length(local.configured_launch_template_name) > 0 ? data.aws_launch_template.this[0].latest_version : aws_launch_template.default[0].latest_version


length(local.configured_launch_template_name) > 0 ? try(data.aws_launch_template.this[0].latest_version, "") : try(aws_launch_template.default[0].latest_version, "")

aknysh · 2020-09-07T16:49:16Z

main.tf

-    id             = coalesce(var.launch_template_id, aws_launch_template.default[0].id)
-    latest_version = coalesce(var.launch_template_version, aws_launch_template.default[0].latest_version)
-  }
+data "aws_eks_cluster" "this" {


before, the pattern was to name a single resource default.
this would be good as well, we just have two naming conventions now.

I would prefer to use "this" in new modules rather than default, and for backward compatibility and consistency mostly continued to use "default", but here referring to the "default cluster" just seemed confusing and wrong.

aknysh · 2020-09-07T16:53:00Z

userdata.tf

+  }
+
+  cluster_data = {
+    cluster_endpoint           = local.get_cluster_data ? data.aws_eks_cluster.this[0].endpoint : null


try(data.aws_eks_cluster.this[0].endpoint, "")

aknysh · 2020-09-07T17:20:16Z

outputs.tf

 }

 output "eks_node_group_arn" {
  description = "Amazon Resource Name (ARN) of the EKS Node Group"
-  value       = join("", aws_eks_node_group.default.*.arn)
+  value       = join("", aws_eks_node_group.default.*.arn, aws_eks_node_group.cbd.*.arn)


does it work with two lists?
according to the docs, join takes a delimiter and a list

https://www.terraform.io/docs/configuration/functions/join.html

Yes, not documented properly, but in TF 0.13 it is join(string, list...)

> join(", ", ["one"], ["two", "three"], ["blue"]) one, two, three, blue

aknysh · 2020-09-07T17:20:37Z

outputs.tf

 }

 output "eks_node_group_status" {
  description = "Status of the EKS Node Group"
-  value       = join("", aws_eks_node_group.default.*.status)
+  value       = join("", aws_eks_node_group.default.*.status, aws_eks_node_group.cbd.*.status)


does it work with two lists?
according to the docs, join takes a delimiter and a list

https://www.terraform.io/docs/configuration/functions/join.html

Yes, it works

Nuru added 2 commits September 3, 2020 21:23

First do no harm

28c550b

create before delete

9649f0a

Nuru requested review from danjbh, aknysh, goruha, htplbc, johncblandii, maximmi and osterman September 4, 2020 06:30

Nuru changed the title ~~Feature parity with terraform-aws-eks-workers~~ Breaking Change: Create before delete. Add Launch Template and related features. Sep 4, 2020

Nuru changed the title ~~Breaking Change: Create before delete. Add Launch Template and related features.~~ Breaking Change: Create before destroy. Add Launch Template and related features. Sep 4, 2020

Make "create before destroy" optional and off for backward compatability

8336564

Nuru marked this pull request as ready for review September 4, 2020 20:14

Nuru requested review from a team as code owners September 4, 2020 20:14

Nuru changed the title ~~Breaking Change: Create before destroy. Add Launch Template and related features.~~ Optional Create before destroy. Add Launch Template and related features. Sep 4, 2020

Nuru added the terraform/0.13 Module requires Terraform 0.13 or later label Sep 4, 2020

Make "create before destroy" optional and off for backward compatability

8e25e5c

Nuru added 2 commits September 4, 2020 17:01

Pile of fixes

cc9a9a1

EKS requires before-joining userdata to be in MIME format

ad1527c

Clarify that userdata vars are bash commands, not scripts

6b71ff4

Set http_put_response_hop_limit = 2 instead of default 1

93563d6

Nuru added the enhancement New feature or request label Sep 5, 2020

http_endpoint is required when http_put_response_hop_limit is set

2198dcf

osterman reviewed Sep 5, 2020

View reviewed changes

osterman approved these changes Sep 5, 2020

View reviewed changes

danjbh approved these changes Sep 7, 2020

View reviewed changes

Nuru merged commit 7a1248f into master Sep 7, 2020

Nuru deleted the bootstrap branch September 7, 2020 02:15

aknysh reviewed Sep 7, 2020

View reviewed changes

Nuru mentioned this pull request Sep 18, 2020

Remove autoscaler permissions from worker role #34

Merged


		need_userdata = (var.userdata_override == null) && (length(local.userdata_vars.before_cluster_joining_userdata) > 0) \|\| local.need_bootstrap

		userdata = local.need_userdata ? base64encode(templatefile("${path.module}/userdata.tpl", merge(local.userdata_vars, local.cluster_data))) : var.userdata_override

Uh oh!

Optional Create before destroy. Add Launch Template and related features. #31

Optional Create before destroy. Add Launch Template and related features. #31

Uh oh!

Conversation

Nuru commented Sep 4, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

what

why

caveats

When using create before destroy

Using new features

references

Uh oh!

Nuru commented Sep 4, 2020

Uh oh!

Nuru commented Sep 4, 2020

Uh oh!

Nuru commented Sep 4, 2020

Uh oh!

Nuru commented Sep 5, 2020

Uh oh!

Nuru commented Sep 5, 2020

Uh oh!

Nuru commented Sep 5, 2020

Uh oh!

Nuru commented Sep 5, 2020

Uh oh!

Choose a reason for hiding this comment

Uh oh!

danjbh commented Sep 7, 2020

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Nuru commented Sep 4, 2020 •

edited

Loading