Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for custom AMIs and node taints and other kubelet args #29

Closed
wants to merge 1 commit into from

Conversation

xario
Copy link

@xario xario commented Aug 31, 2020

What

These changes are in relation to the PR about launch configurations (#27). This PR is circumventing the problem with not being able to specify the KUBELET_EXTRA_ARGS in user_data when not specifying ami_id, as noted in the PR which rolled back some of the changes (#28).
As AWS appends a call to the /etc/eks/bootstrap.sh to the user_data, it is necessary to modify the bootstrap file to include the changes to the config. This is done by adding a hook right before the services are started, which allows for updates to the /etc/systemd/system/kubelet.service.d/30-kubelet-extra-args.conf file.

Here is an outline of what is happening the hook script:

  • Read out the existing configuration from 30-kubelet-extra-args.conf
  • Parse the arguments in KUBELET_EXTRA_ARGS into an associative array
  • Parse the arguments from the user supplied KUBELET_NEW_ARGS into the array
  • Add taints from the user supplied NEW_TAINTS to the entry --register-with-taints in the array
  • Add labels from the user supplied NEW_LABELS to the entry --node-labels in the array
  • Combine the entries from the array into a string, and write it back to 30-kubelet-extra-args.conf

This preserve the arguments that the bootstrap.sh adds to KUBELET_EXTRA_ARGS, while still being able to add user specified arguments.

A minor change is that I had to make to the way the security groups are added to a node group, as I got an error Remote access configuration cannot be specified with a launch template. They are now added to the launch configuration by using vpc_security_group_ids. By specifying this, it overwrites the security group AWS adds normally, that allows for communication with the control plane, which is why I had to add a data source for the aws_eks_cluster in order to retrieve the security group id. This change can be omitted in this PR and added as a separate PR if you like.

Why

  • This is needed in order to use managed node groups with node taints, without having to specifying an ami_id.

References

PS: I have not updated tests and documentation yet, as I wanted to get your reaction to the changes before continuing.
The changes have been tested with and without specifying an ami_id, and with and without specifying an ec2_ssh_key by creating clusters using Terraform in AWS.

@xario xario requested review from a team as code owners August 31, 2020 20:18
@xario xario requested review from jhosteny and Gowiem and removed request for a team August 31, 2020 20:18
Copy link
Contributor

@Nuru Nuru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this review requests changes, I am not actually requesting that you make these changes. I am just posting this to explain why we are not going to accept this PR. We are already working on supporting these features and will have an update soon.

Comment on lines +25 to +27
kubelet_extra_args = replace(join(" ", var.kubelet_extra_args), "'", ""),
node_taints = replace(join(",", var.node_taints), "'", ""),
node_labels = replace(join(",", local.node_labels),"'", "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not like the idea of blindly stripping quotes out of values. What if they have backslash escapes, for example?

Comment on lines +162 to +166
vpc_security_group_ids = concat(
[data.aws_eks_cluster.eks_cluster.vpc_config[0].cluster_security_group_id],
var.source_security_group_ids,
aws_security_group.remote_access_security_group.*.id
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed?

Comment on lines +171 to +173
format("%s%s",
templatefile("${path.module}/userdata.tpl", local.userdata_vars),
(var.ami_id != null ? templatefile("${path.module}/userdata_ami.tpl", local.userdata_ami_vars) : "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this 2 files concatenated rather than 1 file?

Comment on lines +182 to 194
resource "aws_security_group" "remote_access_security_group" {
count = var.ec2_ssh_key != null && length(var.source_security_group_ids) == 0 ? 1 : 0
name = "eks-${var.cluster_name}-${module.label.id}-remote-access"
description = "Security group for all nodes in the nodeGroup to allow SSH access"
vpc_id = data.aws_eks_cluster.eks_cluster.vpc_config[0].vpc_id

ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the advantage of this over the previous dynamic remote access block in aws_eks_node_group?

# Get the last line and cut out the part between the single quotes
# Example output:
# KUBELET_EXTRA_ARGS=--register-with-taints="dedicated=test:NoSchedule" --node-labels=eks.amazonaws.com/nodegroup=test-mng,other.label=other
ENVIRONMENT=$(tail -n1 /etc/systemd/system/kubelet.service.d/30-kubelet-extra-args.conf | cut -d\' -f2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is brittle, in that it requires that the last line of the file be desired line and will break if/when the default changes.


# Execute the hook script before starting the services
sed -i "/^systemctl daemon-reload\$/i . /etc/eks/extra_args.sh" /etc/eks/bootstrap.sh
--//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I applaud the effort to be comprehensive, this still seems to me a lot more complicated and brittle than it needs to be. I do not want to be supporting this a year from now.

Comment on lines -5 to +6
Content-Type: text/x-shellscript; charset="us-ascii"
#!/bin/bash
Content-Type: text/x-shellscript
Content-Type: charset="us-ascii"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK you are not allowed to have multiple Content-Type headers. For sure I see no advantage to it.


${before_cluster_joining_userdata}

--//--
cat << 'EOF' > /etc/eks/extra_args.sh
#!/usr/bin/env bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While arguably #!/usr/bin/env bash is more portable than #!/bin/bash it is less secure and if the AMI you are using does not have bash installed as /bin/bash then what are the chances that is it also using /etc/systemd/system/kubelet.service.d/30-kubelet-extra-args.conf? Not great. So I would leave this as #!/bin/bash

@Nuru
Copy link
Contributor

Nuru commented Aug 31, 2020

@xario Thank you for your contribution. We already have in the works something similar, but more along the lines of our other modules. I think your version is more complex and sophisticated than we need or want to support long term, so I am going to close this PR. Expect support for labels and taints in less than a week from now.

@Nuru Nuru closed this Aug 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants