fixed recommendations
realvz committed Jun 18, 2020
1 parent f6a5b7f commit 122fcb9
Showing 2 changed files with 29 additions and 21 deletions.
5 changes: 2 additions & 3 deletions content/reliability/docs/controlplane.md
@@ -94,11 +94,11 @@ New Kubernetes versions introduce significant changes and you cannot downgrade a
- EKS control plane upgrade doesn’t include upgrading worker nodes. You are responsible for updating EKS worker nodes. Consider using [EKS managed node groups](https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html) to automate the process of upgrading worker nodes.
- If required, you can use `kubectl convert` to [convert Kubernetes manifests files between different API versions](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#convert).
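For example, a minimal sketch, assuming a local manifest named `deployment.yaml` that still uses a deprecated API version:

```bash
# Rewrite the manifest to the apps/v1 API version
kubectl convert -f ./deployment.yaml --output-version apps/v1
```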

## Control Plane Scaling
## Running large clusters

EKS clusters by default are sized to handle up to 200 nodes and 30 pods per node. If your cluster exceeds this size, you can request a scale up through a support ticket. The EKS team is working on automatically scaling the control plane, at which point this will not be required.

## Limits and service quotas
## Know limits and service quotas

AWS sets service limits (an upper limit on the number of each resource your team can request) to protect you from accidentally over-provisioning resources. [Amazon EKS Service Quotas](https://docs.aws.amazon.com/eks/latest/userguide/service-quotas.html) lists the service limits. There are two types of limits: soft limits, which can be raised with proper justification via a support ticket, and hard limits, which cannot be changed. You should consider these values when architecting your applications, and review them periodically to incorporate them into your application design.
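As a starting point, you can list the quotas that apply to your account with the Service Quotas API. This is a minimal sketch, assuming the AWS CLI is configured and that `eks` is the Service Quotas service code for Amazon EKS:

```bash
# List the EKS quotas currently applied to this account
aws service-quotas list-service-quotas --service-code eks

# List the AWS default quotas for EKS, for comparison
aws service-quotas list-aws-default-service-quotas --service-code eks
```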

@@ -110,5 +110,4 @@ AWS sets service limits (an upper limit on the number of each resource your team

- [De-mystifying cluster networking for Amazon EKS worker nodes](https://aws.amazon.com/blogs/containers/de-mystifying-cluster-networking-for-amazon-eks-worker-nodes/)
- [Amazon EKS cluster endpoint access control](https://docs.aws.amazon.com/eks/latest/userguide/cluster-endpoint.html)
- [AWS re:Invent 2019: Amazon EKS under the hood (CON421-R1)](https://www.youtube.com/watch?v=7vxDWDD2YnM)
45 changes: 27 additions & 18 deletions content/reliability/docs/networkmanagement.md
@@ -11,8 +11,9 @@ Refer to [Cluster VPC considerations](https://docs.aws.amazon.com/eks/latest/use
If you deploy worker nodes in private subnets then these subnets should have a default route to a [NAT Gateway](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat-gateway.html).

## Recommendations
- A VPC with public and private subnets is recommended so that Kubernetes can create load balancers in public subnets and the worker nodes can run in private subnets.
- If you deploy worker nodes in private subnets then consider creating a NAT Gateway in each Availability Zone to ensure zone-independent architecture. Each NAT gateway is created in a specific Availability Zone and implemented with redundancy in that zone.

### Deploy NAT Gateways in each Availability Zone
If you deploy worker nodes in private subnets, consider creating a NAT Gateway in each Availability Zone to ensure a zone-independent architecture. Each NAT Gateway is created in a specific Availability Zone and implemented with redundancy in that zone.
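A minimal AWS CLI sketch for one Availability Zone is shown below; the subnet, allocation, and route table IDs are placeholders, and you would repeat these steps for each Availability Zone:

```bash
# Allocate an Elastic IP and create a NAT Gateway in the public subnet of this AZ
aws ec2 allocate-address --domain vpc
aws ec2 create-nat-gateway \
  --subnet-id subnet-0aaa1111bbb2222cc \
  --allocation-id eipalloc-0dd3333ee4444ff55

# Route Internet-bound traffic from the private subnet in the same AZ through the NAT Gateway
aws ec2 create-route \
  --route-table-id rtb-0123456789abcdef0 \
  --destination-cidr-block 0.0.0.0/0 \
  --nat-gateway-id nat-0123456789abcdef0
```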


## Amazon VPC CNI
@@ -32,15 +33,27 @@ The CNI plugin has two components:
The details can be found in [Proposal: CNI plugin for Kubernetes networking over AWS VPC](https://github.com/aws/amazon-vpc-cni-k8s/blob/master/docs/cni-proposal.md).

## Recommendations
- Size the subnets you will use for Pod networking for growth. If you have insufficient IP addresses available in the subnet that the CNI uses, your pods will not get an IP address. And the pods will remain in pending state until an IP address becomes available.
- Consider using [CNI Metrics Helper](https://docs.aws.amazon.com/eks/latest/userguide/cni-metrics-helper.html) to monitor IP addresses inventory.
- If you use public subnets, then they must have the automatic public IP address assignment setting enabled otherwise worker nodes will not be able to communicate with the cluster.
- Consider creating [separate subnets for Pod networking](https://docs.aws.amazon.com/eks/latest/userguide/cni-custom-network.html) (also called CNI custom networking) to avoid IP address allocation conflicts between Pods and other resources in the VPC.
- If your Pods with private IP address need to communicate with other private IP address spaces (for example, Direct Connect, VPC Peering or Transit VPC), then you need to [enable external SNAT](https://docs.aws.amazon.com/eks/latest/userguide/external-snat.html) in the CNI:

```
kubectl set env daemonset -n kube-system aws-node AWS_VPC_K8S_CNI_EXTERNALSNAT=true
```
### Plan for growth

Size the subnets you will use for Pod networking with growth in mind. If the subnet that the CNI uses has insufficient IP addresses available, your Pods will not get an IP address and will remain in the Pending state until one becomes available. This may impact application autoscaling and compromise availability.
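One way to keep an eye on this is to check the free IP addresses in the subnets used for Pod networking. A sketch, with placeholder subnet IDs:

```bash
# Show remaining IP addresses per subnet used for Pod networking
aws ec2 describe-subnets \
  --subnet-ids subnet-0aaa1111bbb2222cc subnet-0dd3333ee4444ff55 \
  --query 'Subnets[].{SubnetId:SubnetId,AvailableIPs:AvailableIpAddressCount}' \
  --output table
```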

### Monitor IP address inventory

You can monitor the IP address inventory of subnets using the [CNI Metrics Helper](https://docs.aws.amazon.com/eks/latest/userguide/cni-metrics-helper.html). You can also set [CloudWatch alarms](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html) to get notified if a subnet is running out of IP addresses.
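A sketch of such an alarm is shown below; the namespace, metric name, dimension, and threshold are assumptions, so verify them against the metrics the CNI Metrics Helper actually publishes in your account:

```bash
# Alarm when the number of assigned Pod IP addresses crosses a threshold
# (namespace, metric name, and dimension are assumed values)
aws cloudwatch put-metric-alarm \
  --alarm-name eks-cni-ip-utilization \
  --namespace Kubernetes \
  --metric-name assignIPAddresses \
  --dimensions Name=CLUSTER_ID,Value=my-cluster \
  --statistic Maximum \
  --period 300 \
  --evaluation-periods 1 \
  --threshold 900 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:111122223333:my-topic
```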

### Using public subnets for worker nodes
If you use public subnets, they must have the automatic public IP address assignment setting enabled; otherwise, worker nodes will not be able to communicate with the cluster.
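You can enable this setting on an existing subnet with the AWS CLI; the subnet ID below is a placeholder:

```bash
# Enable automatic public IP assignment on a public subnet used by worker nodes
aws ec2 modify-subnet-attribute \
  --subnet-id subnet-0aaa1111bbb2222cc \
  --map-public-ip-on-launch
```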

### Run worker nodes and pods in different subnets
Consider creating [separate subnets for Pod networking](https://docs.aws.amazon.com/eks/latest/userguide/cni-custom-network.html) (also called **CNI custom networking**) to avoid IP address allocation conflicts between Pods and other resources in the VPC.
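Enabling it starts with a CNI environment variable, sketched below; you also need an `ENIConfig` per Availability Zone, as covered in the CNI custom networking section later in this guide:

```bash
# Tell the CNI to use custom networking (ENIConfig resources) for Pod ENIs
kubectl set env daemonset -n kube-system aws-node AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true
```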

### SNAT
If your Pods with private IP addresses need to communicate with other private IP address spaces (for example, over Direct Connect, VPC Peering, or a Transit VPC), then you need to [enable external SNAT](https://docs.aws.amazon.com/eks/latest/userguide/external-snat.html) in the CNI:

```bash
kubectl set env daemonset -n kube-system aws-node AWS_VPC_K8S_CNI_EXTERNALSNAT=true
```

### Limit IP address pre-allocation

@@ -55,9 +68,9 @@ If you need to constrain the IP addresses the CNI caches then you can use these

To configure these options, you can download the aws-k8s-cni.yaml manifest compatible with your cluster version and set the environment variables in it. At the time of writing, the latest release is located [here](https://github.com/aws/amazon-vpc-cni-k8s/blob/master/config/v1.6/aws-k8s-cni.yaml).

## Recommendations
- Configure the value of `MINIMUM_IP_TARGET` to closely match the number of Pods you expect to run on your nodes. This will ensure that as Pods get created the CNI can assign IP addresses from the warm pool without calling the EC2 API.
- Avoid setting the value of `WARM_IP_TARGET` too low as it will cause additional calls to the EC2 API and that might cause throttling of the requests.
!!! info
    Configure the value of `MINIMUM_IP_TARGET` to closely match the number of Pods you expect to run on your nodes. This will ensure that as Pods get created, the CNI can assign IP addresses from the warm pool without calling the EC2 API.

!!! warning
    Avoid setting the value of `WARM_IP_TARGET` too low, as it will cause additional calls to the EC2 API, which might result in throttling of the requests.
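For example, to apply both settings on a cluster where you expect roughly 20 Pods per node (the values are illustrative and should be tuned for your workloads):

```bash
# Pre-allocate roughly one node's worth of Pod IPs and keep a small warm buffer
kubectl set env daemonset -n kube-system aws-node \
  MINIMUM_IP_TARGET=20 \
  WARM_IP_TARGET=5
```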

## CNI custom networking

@@ -120,15 +133,11 @@ You can then pass the `max-pods` value in the worker nodes’ user-data script:
Since the node’s primary ENI is no longer used to assign Pod IP addresses, there is a decline in the number of Pods you can run on a given EC2 instance type.
## Alternate CNI plugins
## Using alternate CNI plugins
The AWS VPC CNI plugin is the only officially supported [network plugin](https://kubernetes.io/docs/concepts/cluster-administration/networking/) on EKS. However, since EKS runs upstream Kubernetes and is certified Kubernetes conformant, you can use alternate [CNI plugins](https://github.com/containernetworking/cni).
A compelling reason to opt for an alternate CNI plugin is the ability to run Pods without using a VPC IP address per Pod. However, using an alternate CNI plugin can come at the expense of network performance.
Refer to the EKS documentation for the list of [alternate compatible CNI plugins](https://docs.aws.amazon.com/eks/latest/userguide/alternate-cni-plugins.html). Consider obtaining the CNI vendor’s commercial support if you plan on using an alternate CNI in production.
---
Things to add:
* Amazon EKS does not automatically upgrade the CNI plugin on your cluster when new versions are released. To get a newer version of the CNI plugin on existing clusters, you must manually upgrade the plugin.
