Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for remote access and lifecycle issues #30

Merged
merged 18 commits into from
Sep 12, 2020
Merged

Fix for remote access and lifecycle issues #30

merged 18 commits into from
Sep 12, 2020

Conversation

woz5999
Copy link
Contributor

@woz5999 woz5999 commented Sep 3, 2020

what

  • Create remote access security group for launch template when necessary to enable SSH access
  • Resolve some lifecycle dependency issues caused by create-before-destroy

why

  • Solves the issue where remote_access is not valid for node groups that specify launch templates. If you specify an ssh key or source security group ids with the current state of the module, it will throw an error and prevent node group creation.
  • Fixes some situations where Terraform would fail to apply a plan due to a dependency cycle

@woz5999 woz5999 requested review from a team as code owners September 3, 2020 18:18
@woz5999 woz5999 requested review from Makeshift and adamcrews and removed request for a team September 3, 2020 18:18
@woz5999
Copy link
Contributor Author

woz5999 commented Sep 3, 2020

@Nuru @danjbh This PR has fixes some module issues that make it unusable for users trying to specify an ssh key and additional remote access security groups and also fixes a regression that breaks govcloud support.

i went with a preference for maintaining existing behavior and backwards compatibility and not modifying module usage, assuming that would be preferable by the maintainers, but I'm not married to anything specific except that the module should work again.

main.tf Outdated
)

security_group_ids = concat(
list(data.aws_eks_cluster.eks_cluster.vpc_config[0].cluster_security_group_id),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needed to enable communication with the EKS control plane

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this always enough? Could there be more than one or some other security group that needs to be added?

main.tf Outdated
@@ -25,6 +25,20 @@ locals {
id = coalesce(var.launch_template_id, aws_launch_template.default[0].id)
latest_version = coalesce(var.launch_template_version, aws_launch_template.default[0].latest_version)
}

eks_node_group_resources = merge(aws_eks_node_group.default.*.resources,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is some formatting to maintain backward compatibility with outputs.

main.tf Outdated
@@ -57,6 +71,14 @@ data "aws_iam_policy_document" "assume_role" {
}
}

data "aws_subnet" "default" {
id = var.subnet_ids[0]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is to prevent introducing an additional required variable. happy to modify if that's a more preferred route.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not reference this if it is not needed.

@@ -170,14 +194,6 @@ resource "aws_eks_node_group" "default" {
version = local.launch_template.latest_version
}

dynamic "remote_access" {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this block is not supported when using launch templates. currently breaks the entire module if ec2_ssh_key or source_security_group_ids is specified

main.tf Outdated
type = "ingress"

cidr_blocks = [
"0.0.0.0/0"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this emulates the default behavior of non-launch template node groups that specify a remote_access ssh key but no source security group.

@woz5999 woz5999 marked this pull request as draft September 3, 2020 18:30
@woz5999 woz5999 marked this pull request as ready for review September 3, 2020 20:51
Copy link
Contributor

@Nuru Nuru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this submission.

We added a lot in v0.11.0. In particular, it should solve your tag problem and restore backward compatibility with v0.8.0. v0.9.0 and v0.10.0 should have been marked pre-releases and giving corresponding semver numbers, but we are using a new automated release system that does not support pre-releases, for which I apologize.

We did not address SSH access when creating a launch template in v0.11.0 because a launch template is not needed for v0.8.0 functionality and because it seems like there are a lot of ways for that to go wrong, plus we did not have a use case or a test for it. If you do not care about propagating tags to instances and EBS volumes, then you do not need a launch template and v0.11.0 should do what you need. Otherwise, please review 0.11 and the comments here, rebase and revise the PR, and point us to any documentation you can to assure us that the security group configuration for the launch template is correct for all the various use cases.

main.tf Outdated
)

# Tagging an elastic gpu on create is not yet supported in govcloud
tag_spec = compact(["instance", "volume", join("", data.aws_partition.current.*.partition) == "aws-us-gov" ? "" : "elastic-gpu"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not want to hard code exceptions like this. In v0.11.0 the default is no tagging and users can specify which resources to tag, so this should not longer be an issue for you.

main.tf Outdated
@@ -25,6 +25,22 @@ locals {
id = coalesce(var.launch_template_id, aws_launch_template.default[0].id)
latest_version = coalesce(var.launch_template_version, aws_launch_template.default[0].latest_version)
}

eks_node_group_resources = [for resource in aws_eks_node_group.default.*.resources :
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than change the structure of the resources output, add a new output for the new info, if it is really needed, which I am not sure about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the sg id is needed as an output for users that might want to add rules to it outside the module. i was appending to resources to match prior behavior. happy to put it as its own output.
it's important to have as an output for users that might want to add additional rules outside the module. if you're ok breaking backward compatibility, this can all be updated and users can just provide an sg id to the module to attach to the launch template and bypass most of the logic in this PR

main.tf Outdated
)

security_group_ids = concat(
list(data.aws_eks_cluster.eks_cluster.vpc_config[0].cluster_security_group_id),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this always enough? Could there be more than one or some other security group that needs to be added?

main.tf Outdated
@@ -57,6 +71,14 @@ data "aws_iam_policy_document" "assume_role" {
}
}

data "aws_subnet" "default" {
id = var.subnet_ids[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not reference this if it is not needed.

main.tf Outdated
Comment on lines 80 to 83
data "aws_eks_cluster" "eks_cluster" {
name = var.cluster_name
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already in v0.11.0 with appropriate conditions, although now you need to add another condition.

main.tf Outdated Show resolved Hide resolved
main.tf Outdated
Comment on lines 161 to 162
key_name = var.ec2_ssh_key
vpc_security_group_ids = local.security_group_ids
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case we get the security group list wrong, I do not want to specify it if we do not have to.

Suggested change
key_name = var.ec2_ssh_key
vpc_security_group_ids = local.security_group_ids
key_name = local.ec2_ssh_key
vpc_security_group_ids = local.ec2_ssh_key != null ? local.security_group_ids : null

main.tf Outdated

dynamic "tag_specifications" {
for_each = ["instance", "volume", "elastic-gpu"]
for_each = local.tag_spec
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is handled in v0.11.0

Co-authored-by: Nuru <Nuru@users.noreply.github.com>
@Nuru
Copy link
Contributor

Nuru commented Sep 8, 2020

@woz5999 Note that #32 is open and likely to get merged very soon, so maybe wait until that is released to rebase your PR. Please close your PR if you are satisfied with using v0.11.1 when it comes out.

@woz5999 woz5999 marked this pull request as draft September 8, 2020 18:11
@woz5999 woz5999 marked this pull request as ready for review September 9, 2020 19:44
@woz5999
Copy link
Contributor Author

woz5999 commented Sep 9, 2020

@Nuru modified the PR such that it only adds the ability for the users to attach additional security groups to the generated launch template. it also fixes a bug wherein remote_access and launch_template are mutually exclusive on the eks_node_group resource.

@Nuru
Copy link
Contributor

Nuru commented Sep 11, 2020

/test all

@Nuru Nuru added bugfix Change that restores intended behavior terraform/0.13 Module requires Terraform 0.13 or later labels Sep 11, 2020
@Nuru
Copy link
Contributor

Nuru commented Sep 11, 2020

/test all

@Nuru Nuru changed the title Fix for remote access and govcloud Fix for remote access and lifecycle issues Sep 12, 2020
@Nuru Nuru merged commit 863d4ab into cloudposse:master Sep 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix Change that restores intended behavior terraform/0.13 Module requires Terraform 0.13 or later
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants