This project is an example of how you can combine the AWS Cloud Development Kit (CDK) and the AWS Elastic Kubernetes Service (EKS) to quickly deploy a more complete and "production ready" Kubernetes environment on AWS.
- An appropriate VPC (a /22 CIDR with 1024 IPs by default - though you can edit this in eks_cluster.py) with public and private subnets across three Availability Zones.
- A new EKS cluster with:
- A dedicated new IAM role to create it with. The role that creates the cluster is a permanent, and rather hidden, full-admin role that doesn't appear in, nor is subject to, the aws-auth ConfigMap. So you want a dedicated role explicitly for that purpose, like the CDK creates for you here, that you can then restrict access to assume unless you need it (e.g. if you lock yourself out of the cluster by making a mistake in the aws-auth ConfigMap).
- A new Managed Node Group with 3 x m5.large instances spread across 3 Availability Zones.
- The AWS Load Balancer Controller (https://kubernetes-sigs.github.io/aws-load-balancer-controller) to allow you to seamlessly use ALBs for Ingresses and NLBs for Services.
- External DNS (https://github.com/kubernetes-sigs/external-dns) to allow you to automatically create/update Route53 entries to point your 'real' names at your Ingresses and Services.
- A new managed Amazon Elasticsearch Domain and an aws-for-fluent-bit DaemonSet (https://github.com/aws/aws-for-fluent-bit) to ship all your container logs there - including enriching them with the Kubernetes metadata using the kubernetes fluent-bit filter.
- (Temporarily until the AWS Managed Prometheus/Grafana are available) The kube-prometheus Operator (https://github.com/prometheus-operator/kube-prometheus) which gives you a Prometheus that will collect all your cluster metrics as well as a Grafana to visualise them.
- TODO: Add some initial alerts for sensible common items in the cluster via Prometheus/Alertmanager
- The AWS EBS CSI Driver (https://github.com/kubernetes-sigs/aws-ebs-csi-driver)
- The AWS EFS CSI Driver (https://docs.aws.amazon.com/eks/latest/userguide/efs-csi.html)
- OPA Gatekeeper to enforce preventative security and operational policies (https://github.com/open-policy-agent/gatekeeper)
- TODO: Add some sensible initial policies to make our cluster 'secure by default'
- The cluster autoscaler (CA) (https://github.com/kubernetes/autoscaler)
- The metrics-server (required for the Horizontal Pod Autoscaler (HPA)) (https://github.com/kubernetes-sigs/metrics-server)
- TODO: A GitOps pipeline based on CodeBuild doing another `cdk deploy` whenever `eks_cluster.py` changes
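As a quick sanity check of the default VPC sizing mentioned above, Python's standard `ipaddress` module confirms what a /22 provides. The CIDR and subnet split below are illustrative only - the actual values live in `eks_cluster.py`:

```python
import ipaddress

# Illustrative CIDR - the actual value is set in eks_cluster.py
vpc = ipaddress.ip_network("10.0.0.0/22")
print(vpc.num_addresses)  # 1024 addresses, as noted above

# Carving a /22 into /25s yields 8 subnets - enough for one public and
# one private subnet in each of three Availability Zones, with two to
# spare (the real subnetting the CDK performs may differ)
subnets = list(vpc.subnets(new_prefix=25))
print(len(subnets))  # 8
print(subnets[0])    # 10.0.0.0/25
```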
TODO: Explain the benefits of the CDK
There are some prerequisites you will likely need to install on the machine doing your environment bootstrapping, including Node, Python, the AWS CLI, the CDK, fluxctl and Helm.
Run `sudo ./ubuntu-prepreqs.sh`
TODO: Make an equivalent bootstrap script for a Mac
TODO: Make an equivalent bootstrap script for Amazon Linux 2, including Cloud9
- Make sure that you have your AWS CLI configured with administrative access to the AWS account in question (e.g. an `aws s3 ls` works)
  - This can be via setting your access key and secret in your .aws folder via `aws configure`, or in your environment variables by copying and pasting from AWS SSO, etc.
- Run `cd eks-quickstart/cluster-bootstrap`
- Run `pip install -r requirements.txt` to install the required Python bits of the CDK
- Run `export CDK_DEPLOY_REGION=ap-southeast-2`, replacing ap-southeast-2 with your region of choice
- Run `export CDK_DEPLOY_ACCOUNT=123456789123`, replacing 123456789123 with your AWS account number
- (Only required the first time you use the CDK in this account) Run `cdk bootstrap` to create the S3 bucket where the CDK puts its artifacts
- (Only required the first time ES in VPC mode is used in this account) Run `aws iam create-service-linked-role --aws-service-name es.amazonaws.com`
- Run `cdk deploy --require-approval never`
- (Temporary until it is added to our Helm chart - PR open) Run `kubectl edit configmap fluentbit-0-1-6-aws-for-fluent-bit --namespace=cluster-addons` and add `Replace_Dots On` to the bottom
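For context, `Replace_Dots` is an option of fluent-bit's Elasticsearch output plugin (it rewrites dots in field names so Elasticsearch doesn't interpret them as nested objects). The section you are editing looks roughly like this - only the `Replace_Dots` line is the addition, and the other values shown here are illustrative, not what your generated ConfigMap will actually contain:

```ini
[OUTPUT]
    Name            es
    Match           *
    Host            <your-elasticsearch-endpoint>
    Port            443
    TLS             On
    Replace_Dots    On
```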
- Create the necessary keys and upload them to ACM as per https://docs.aws.amazon.com/vpn/latest/clientvpn-admin/client-authentication.html#mutual
- Run `cd client-vpn`
- Edit `client_vpn.py` and put in the ARNs for your client and server certs, as well as the `client_cidr_block` and `target_network_cidr` if required
- Run `pip install -r requirements.txt`
- Run `cdk deploy --require-approval never`
- Go to the Client VPN Endpoints Service in the AWS Console
- Go to the Associations Tab and click Associate
- Pick the EKSClusterStack/VPC for the VPC
- Pick any subnet in the Choose a subnet to associate dropdown box
- Click the Associate button
- Go to the Security Groups tab
- Click the Apply Security Groups button
- Tick the box next to the security group whose Group Name starts with `eks-cluster-sg-clusterXXXXXXX...` (this is the SG that has access to the EKS Control Plane private endpoints)
- Click Apply Security Groups
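When choosing the `client_cidr_block`, note that AWS requires it not to overlap the VPC you are connecting to, or traffic cannot be routed over the tunnel. You can sanity-check your choice with Python's `ipaddress` module - the two CIDRs below are hypothetical; substitute whatever you configured in `client_vpn.py`:

```python
import ipaddress

# Hypothetical values - substitute your own from client_vpn.py
client_cidr_block = ipaddress.ip_network("10.1.0.0/22")
target_network_cidr = ipaddress.ip_network("10.0.0.0/22")

# The client CIDR must not overlap the target VPC's CIDR
print(client_cidr_block.overlaps(target_network_cidr))  # False - no overlap, safe to use
```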
TODO: Complete/improve the VPN instructions including how to set up the client
We put the Elasticsearch both in the VPC (i.e. not on the Internet) and in the same Security Group we use for controlling access to our EKS Control Plane.
We did this so that, if we put the Client VPN in that security group as well, it has network access both to manage EKS and to reach Elasticsearch/Kibana.
Since this Elasticsearch can only be reached from a network perspective if you are running within this VPC, or have private access to it via a VPN or Direct Connect, it is not that risky to allow 'open access' to it - especially in a Proof of Concept (POC) environment.
In order to do this:
- Go to the Amazon Elasticsearch Service within the AWS Console
- Click on the Domain that starts with eksclus-
- Click on the Actions button on top and choose Modify Access Policy
- In the Domain access policy dropdown choose "Allow open access to the domain" and click Submit
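For reference, the "open access" resource policy that the console applies has roughly this shape. The region, account ID, and domain name below are placeholders - the console generates the real ARN for you:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "*" },
      "Action": "es:*",
      "Resource": "arn:aws:es:ap-southeast-2:123456789123:domain/eksclus-example/*"
    }
  ]
}
```

Remember that "open" here still means reachable only from within the VPC or over your VPN, as described above.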
TODO: Add instructions for the first-time Kibana Index setup
TODO: Walk through how to do a few basic things in Kibana with your cluster logs
TODO: Walk through how to get to the out-of-the-box metrics dashboards in Grafana
TODO: Walk through deploying some apps that show off some of the cluster add-ons we've installed
TODO: Walk through how to upgrade an EKS cluster to a new Kubernetes version and/or the Managed Node Group to the latest AMI via the CDK
TODO: Walk through how to upgrade an individual add-on manifest/chart via CDK