Description
I encountered issues with node group creation and upgrades using the terraform-aws-eks module version v20.31.6. Below are the details:
- During the initial creation of a node group with the `labels` configuration, the process took 20 minutes and failed with the error: `NodeCreationFailure: Instances failed to join the Kubernetes cluster`. Removing the `labels` configuration allowed the node group to be created successfully. I then re-enabled the labels and ran `terraform apply`, which updated the labels successfully.
- While upgrading a node group with `use_latest_ami_release_version = true`, the process took a long time and failed with the error: `NodeCreationFailure: Couldn't proceed with upgrade process as new nodes are not joining the node group`. I suspect this issue might be related to the `labels` configuration, similar to issue 1.
- ✋ I have searched the open/closed issues and my issue is not listed.
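For quick reference, these are the two node group settings involved in both failures, excerpted from the full reproduction code below:

```hcl
eks_managed_node_groups = {
  default_al2023_ng = {
    # Issue 2: tracks the newest AMI release on every apply
    use_latest_ami_release_version = true
    # Issue 1: initial creation fails while this label is set
    labels = {
      "karpenter.sh/controller" = "true"
    }
  }
}
```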
Versions
- Module version: v20.31.6
- Terraform version: Terraform v1.5.7 on darwin_arm64
- Provider version(s): hashicorp/aws v5.84.0
Reproduction Code [Required]
module "eks" {
source = "terraform-aws-modules/eks/aws"
#Develop on git Tags v20.31.6
version = "~> 20.31"
cluster_name = "eks-cluster-test"
cluster_version = 1.31
vpc_id = data.aws_vpc.main.id
subnet_ids = data.aws_subnet.private_subnets[*].id
control_plane_subnet_ids = data.aws_subnet.private_eks_subnets[*].id
cluster_endpoint_public_access = true
cluster_endpoint_public_access_cidrs = var.eks_public_access_cidrs
cluster_service_ipv4_cidr = var.cluster_service_ipv4_cidr
enable_irsa = true
cluster_addons = {
aws-ebs-csi-driver = {
most_recent = true
}
coredns = {
most_recent = true
resolve_conflicts = "OVERWRITE"
timeouts = {
create = "25m"
delete = "10m"
}
}
kube-proxy = {
most_recent = true
resolve_conflicts = "OVERWRITE"
}
vpc-cni = {
most_recent = true
before_compute = true
resolve_conflicts = "OVERWRITE"
configuration_values = jsonencode({
env = {
AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG = "true"
ENI_CONFIG_LABEL_DEF = "topology.kubernetes.io/zone"
ENABLE_PREFIX_DELEGATION = "true"
WARM_PREFIX_TARGET = "1"
}
})
}
}
cluster_upgrade_policy = {
support_type = "STANDARD"
}
eks_managed_node_groups = {
default_al2023_ng = {
name = "al2023-df-ng"
use_latest_ami_release_version = true
ami_type = "AL2023_x86_64_STANDARD"
instance_types = ["c5.large"]
capacity_type = "ON_DEMAND"
subnet_ids = data.aws_subnet.private_subnets[*].id
disk_size = 100
min_size = 1
max_size = 3
desired_size = 2
block_device_mappings = {
xvda = {
device_name = "/dev/xvda"
ebs = {
volume_size = 100
volume_type = "gp3"
iops = 3000
throughput = 150
delete_on_termination = true
}
}
}
cloudinit_pre_nodeadm = [
{
content_type = "application/node.eks.aws"
content = <<-EOT
---
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
kubelet:
config:
shutdownGracePeriod: 30s
featureGates:
DisableKubeletCloudCredentialProviders: true
EOT
}
]
labels = {
"karpenter.sh/controller" = "true"
}
}
}
enable_cluster_creator_admin_permissions = true
authentication_mode = "API_AND_CONFIG_MAP"
access_entries = {
adminteam = {
principal_arn = iam_role_arn_xxxxx
policy_associations = {
admin_policy = {
policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"
access_scope = {
namespaces = []
type = "cluster"
}
}
}
}
}
node_security_group_tags = merge(var.tags, {
"karpenter.sh/discovery" = "eks-cluster-test"
})
tags = var.tags
}
Steps to reproduce the behavior:
- Create a node group with the provided configuration.
- Observe failure during the creation process (`NodeCreationFailure` error).
- Remove the `labels` configuration and reapply Terraform.
- Re-enable `labels` and apply Terraform again (successfully updates the labels); a sketch of this two-step toggle follows this list.
- Enable `use_latest_ami_release_version = true` and attempt to upgrade.
- Observe the upgrade failure with the error: `NodeCreationFailure: Couldn't proceed with upgrade process as new nodes are not joining the node group.`
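A minimal sketch of the remove/re-enable workaround from steps 3–4, assuming a hypothetical `enable_ng_labels` variable to toggle the labels between applies (neither the variable nor the local is part of the module):

```hcl
# Hypothetical toggle, used only to illustrate the two-step workaround
variable "enable_ng_labels" {
  type    = bool
  default = false # first apply: create the node group without labels
}

locals {
  # Empty map on the first apply; the real labels on the second
  ng_labels = var.enable_ng_labels ? { "karpenter.sh/controller" = "true" } : {}
}
```

With `labels = local.ng_labels` in the node group definition, the first `terraform apply` creates the node group label-free, and a second apply with `-var="enable_ng_labels=true"` then adds the labels in place, matching the behavior observed above.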
Expected behavior
- Node group creation should succeed with `labels` configured.
- Node group upgrade should proceed successfully when `use_latest_ami_release_version = true` is set with labels configured.
Actual behavior
- Node group creation fails when `labels` are configured initially.
- Node group upgrade fails with the error: `NodeCreationFailure: Couldn't proceed with upgrade process as new nodes are not joining the node group.`
Additional context
- The issue seems to be resolved temporarily by removing `labels` during the initial creation and reapplying them later. However, the upgrade process remains problematic when `use_latest_ami_release_version = true`.
- I suspect the issue might be related to AMI updates or the interaction between `use_latest_ami_release_version` and `labels`; a pinning sketch follows below.
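If the AMI rollover is the trigger, pinning the release rather than tracking the latest may help isolate it. A minimal sketch, assuming the module's `ami_release_version` input; the version string is a placeholder, not a verified release:

```hcl
eks_managed_node_groups = {
  default_al2023_ng = {
    ami_type = "AL2023_x86_64_STANDARD"
    # Pin a known-good release instead of use_latest_ami_release_version = true;
    # "1.31.4-20250101" is a placeholder, not a verified AMI release version
    ami_release_version = "1.31.4-20250101"
    labels = {
      "karpenter.sh/controller" = "true"
    }
  }
}
```

If the upgrade succeeds with a pinned release but fails when tracking the latest, that would point at the AMI update path rather than the `labels` configuration.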