Commit

Cache pause, vpc-cni, and kube-proxy images during build (#938)
bwagner5 authored Nov 18, 2022
1 parent 670b3f2 commit 057f3e4
Showing 8 changed files with 338 additions and 125 deletions.
2 changes: 1 addition & 1 deletion Makefile
@@ -1,5 +1,5 @@
PACKER_BINARY ?= packer
-PACKER_VARIABLES := aws_region ami_name binary_bucket_name binary_bucket_region kubernetes_version kubernetes_build_date kernel_version docker_version containerd_version runc_version cni_plugin_version source_ami_id source_ami_owners source_ami_filter_name arch instance_type security_group_id additional_yum_repos pull_cni_from_github sonobuoy_e2e_registry ami_regions volume_type
+PACKER_VARIABLES := $(shell $(PACKER_BINARY) inspect -machine-readable eks-worker-al2.json | grep 'template-variable' | awk -F ',' '{print $$4}')

K8S_VERSION_PARTS := $(subst ., ,$(kubernetes_version))
K8S_VERSION_MINOR := $(word 1,${K8S_VERSION_PARTS}).$(word 2,${K8S_VERSION_PARTS})
98 changes: 98 additions & 0 deletions README.md
@@ -91,6 +91,104 @@ Provisioner](https://www.packer.io/docs/provisioners/shell.html) runs the
necessary configuration tasks. Then, Packer creates an AMI from the instance
and terminates the instance after the AMI is created.

### Container Image Caching

Optionally, some container images can be cached during the AMI build to reduce the time a node takes to reach the `Ready` state after launch.

To turn on container image caching:

```
cache_container_images=true make 1.23
```

When container image caching is enabled, the following images are cached:
- 602401143452.dkr.ecr.<AWS_REGION>.amazonaws.com/eks/kube-proxy:<default and latest>-eksbuild.<BUILD_VERSION>
- 602401143452.dkr.ecr.<AWS_REGION>.amazonaws.com/eks/kube-proxy:<default and latest>-minimal-eksbuild.<BUILD_VERSION>
- 602401143452.dkr.ecr.<AWS_REGION>.amazonaws.com/eks/pause:3.5
- 602401143452.dkr.ecr.<AWS_REGION>.amazonaws.com/amazon-k8s-cni-init:<default and latest>
- 602401143452.dkr.ecr.<AWS_REGION>.amazonaws.com/amazon-k8s-cni:<default and latest>

The account ID depends on the region and partition in which you build the AMI. See the [EKS documentation](https://docs.aws.amazon.com/eks/latest/userguide/add-ons-images.html) for the mapping.

Because the VPC CNI is not versioned with Kubernetes itself, both its latest version and its default version are cached, as reported by the EKS `DescribeAddonVersions` API at the time of the AMI build.

The images listed above are also tagged with each region in the partition the AMI is built in, since AMIs are often built in one region and copied to others within the same partition. Images that are available to pull from an ECR FIPS endpoint are also tagged as such (e.g. `602401143452.dkr.ecr-fips.us-east-1.amazonaws.com/eks/pause:3.5`).

When listing images on a node, you'll notice a long list of images. However, most of these entries are the same image tagged in different ways, which adds no storage overhead. Images cached in the AMI total around 1.0 GiB. For comparison, a node that uses the VPC CNI and has no cached images holds around 500 MiB of images once it is `Ready` with no other pods running.
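
The per-region tagging described above can be pictured with a small sketch. This is not the build script from the repository: `region_aliases` is a hypothetical helper, and it assumes every listed region uses the same account ID, which does not hold for all regions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper, not part of the repository: print the name an image
# would carry in each listed region. Assumes all regions share one account ID.
# Arguments: account ID, AWS services domain, repository:tag, then regions.
region_aliases() {
  local acct="$1" domain="$2" image="$3"
  shift 3
  local region
  for region in "$@"; do
    echo "${acct}.dkr.ecr.${region}.${domain}/${image}"
  done
}

aliases="$(region_aliases 602401143452 amazonaws.com eks/pause:3.5 us-east-1 us-west-2)"
echo "${aliases}"
```

Each printed name refers to the same image content, which is why the extra tags add no storage overhead.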

### IAM Permissions

To build the EKS Optimized AMI, you will need the following permissions:

```
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ec2:AttachVolume",
"ec2:AuthorizeSecurityGroupIngress",
"ec2:CopyImage",
"ec2:CreateImage",
"ec2:CreateKeypair",
"ec2:CreateSecurityGroup",
"ec2:CreateSnapshot",
"ec2:CreateTags",
"ec2:CreateVolume",
"ec2:DeleteKeyPair",
"ec2:DeleteSecurityGroup",
"ec2:DeleteSnapshot",
"ec2:DeleteVolume",
"ec2:DeregisterImage",
"ec2:DescribeImageAttribute",
"ec2:DescribeImages",
"ec2:DescribeInstances",
"ec2:DescribeInstanceStatus",
"ec2:DescribeRegions",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSnapshots",
"ec2:DescribeSubnets",
"ec2:DescribeTags",
"ec2:DescribeVolumes",
"ec2:DetachVolume",
"ec2:GetPasswordData",
"ec2:ModifyImageAttribute",
"ec2:ModifyInstanceAttribute",
"ec2:ModifySnapshotAttribute",
"ec2:RegisterImage",
"ec2:RunInstances",
"ec2:StopInstances",
"ec2:TerminateInstances",
"eks:DescribeAddonVersions",
"ecr:GetAuthorizationToken"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"ecr:BatchGetImage",
"ecr:BatchCheckLayerAvailability",
"ecr:GetDownloadUrlForLayer"
],
"Resource": "arn:aws:ecr:us-west-2:602401143452:repository/*"
},
{
"Effect": "Allow",
"Action": [
"s3:GetObject"
],
"Resource": "arn:aws:s3:::amazon-eks/*"
}
]
}
```

In the second IAM statement, set the ECR repository resource to the region in which you are building the AMI. You may also need to change the account ID if you are building the AMI in a different partition or special region. You can see a mapping of regions to account IDs [here](https://docs.aws.amazon.com/eks/latest/userguide/add-ons-images.html).
If you're using a custom S3 bucket to vend different K8s binaries, change the resource in the third IAM statement above to reference your custom bucket.
For more information about the permissions Packer requires under different configurations, see the [Packer docs](https://www.packer.io/plugins/builders/amazon#iam-task-or-instance-role).
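
As a rough sketch of the substitution described above, the ECR resource ARN can be assembled from the partition, region and account ID. The helper name `ecr_repository_arn` is hypothetical; the account IDs are taken from the region mapping in `files/get-ecr-uri.sh`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: build the Resource value for the second IAM statement.
# Arguments: partition (e.g. aws, aws-us-gov), region, account ID.
ecr_repository_arn() {
  local partition="$1" region="$2" acct="$3"
  echo "arn:${partition}:ecr:${region}:${acct}:repository/*"
}

default_arn="$(ecr_repository_arn aws us-west-2 602401143452)"
gov_arn="$(ecr_repository_arn aws-us-gov us-gov-west-1 013241004608)"
echo "${default_arn}"
echo "${gov_arn}"
```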

## Using the AMI

If you are just getting started with Amazon EKS, we recommend that you follow
9 changes: 7 additions & 2 deletions eks-worker-al2.json
@@ -13,6 +13,7 @@
"aws_session_token": "{{env `AWS_SESSION_TOKEN`}}",
"binary_bucket_name": "amazon-eks",
"binary_bucket_region": "us-west-2",
+"cache_container_images": "false",
"cni_plugin_version": "v0.8.6",
"containerd_version": "1.6.6-1.amzn2.0.2",
"creator": "{{env `USER`}}",
@@ -23,7 +24,8 @@
"kms_key_id": "",
"kubernetes_build_date": null,
"kubernetes_version": null,
-"launch_block_device_mappings_volume_size": "4",
+"launch_block_device_mappings_volume_size": "8",
+"pause_container_version": "3.5",
"pull_cni_from_github": "true",
"remote_folder": "",
"runc_version": "1.1.3-1.amzn2.0.2",
@@ -161,7 +163,10 @@
"AWS_ACCESS_KEY_ID={{user `aws_access_key_id`}}",
"AWS_SECRET_ACCESS_KEY={{user `aws_secret_access_key`}}",
"AWS_SESSION_TOKEN={{user `aws_session_token`}}",
-"SONOBUOY_E2E_REGISTRY={{user `sonobuoy_e2e_registry`}}"
+"SONOBUOY_E2E_REGISTRY={{user `sonobuoy_e2e_registry`}}",
+"PAUSE_CONTAINER_VERSION={{user `pause_container_version`}}",
+"KUBE_PROXY_VERSION_SUFFIX={{user `kube_proxy_version_suffix`}}",
+"CACHE_CONTAINER_IMAGES={{user `cache_container_images`}}"
]
},
{
74 changes: 13 additions & 61 deletions files/bootstrap.sh
@@ -178,51 +178,6 @@ SERVICE_IPV6_CIDR="${SERVICE_IPV6_CIDR:-}"
ENABLE_LOCAL_OUTPOST="${ENABLE_LOCAL_OUTPOST:-}"
CLUSTER_ID="${CLUSTER_ID:-}"

function get_pause_container_account_for_region() {
local region="$1"
case "${region}" in
ap-east-1)
echo "${PAUSE_CONTAINER_ACCOUNT:-800184023465}"
;;
me-south-1)
echo "${PAUSE_CONTAINER_ACCOUNT:-558608220178}"
;;
cn-north-1)
echo "${PAUSE_CONTAINER_ACCOUNT:-918309763551}"
;;
cn-northwest-1)
echo "${PAUSE_CONTAINER_ACCOUNT:-961992271922}"
;;
us-gov-west-1)
echo "${PAUSE_CONTAINER_ACCOUNT:-013241004608}"
;;
us-gov-east-1)
echo "${PAUSE_CONTAINER_ACCOUNT:-151742754352}"
;;
us-iso-east-1)
echo "${PAUSE_CONTAINER_ACCOUNT:-725322719131}"
;;
us-isob-east-1)
echo "${PAUSE_CONTAINER_ACCOUNT:-187977181151}"
;;
af-south-1)
echo "${PAUSE_CONTAINER_ACCOUNT:-877085696533}"
;;
eu-south-1)
echo "${PAUSE_CONTAINER_ACCOUNT:-590381155156}"
;;
ap-southeast-3)
echo "${PAUSE_CONTAINER_ACCOUNT:-296578399912}"
;;
me-central-1)
echo "${PAUSE_CONTAINER_ACCOUNT:-759879836304}"
;;
*)
echo "${PAUSE_CONTAINER_ACCOUNT:-602401143452}"
;;
esac
}

# Helper function which calculates the amount of the given resource (either CPU or memory)
# to reserve in a given resource range, specified by a start and end of the range and a percentage
# of the resource to reserve. Note that we return zero if the start of the resource range is
@@ -314,8 +269,8 @@ if [[ "$MACHINE" != "x86_64" && "$MACHINE" != "aarch64" ]]; then
exit 1
fi

PAUSE_CONTAINER_ACCOUNT=$(get_pause_container_account_for_region "${AWS_DEFAULT_REGION}")
PAUSE_CONTAINER_IMAGE=${PAUSE_CONTAINER_IMAGE:-$PAUSE_CONTAINER_ACCOUNT.dkr.ecr.$AWS_DEFAULT_REGION.$AWS_SERVICES_DOMAIN/eks/pause}
ECR_URI=$(/etc/eks/get-ecr-uri.sh "${AWS_DEFAULT_REGION}" "${AWS_SERVICES_DOMAIN}" "${PAUSE_CONTAINER_ACCOUNT:-}")
PAUSE_CONTAINER_IMAGE=${PAUSE_CONTAINER_IMAGE:-$ECR_URI/eks/pause}
PAUSE_CONTAINER="$PAUSE_CONTAINER_IMAGE:$PAUSE_CONTAINER_VERSION"

### kubelet kubeconfig
@@ -525,29 +480,26 @@ if [[ "$CONTAINER_RUNTIME" = "containerd" ]]; then

sudo mkdir -p /etc/containerd
sudo mkdir -p /etc/cni/net.d
mkdir -p /etc/systemd/system/containerd.service.d
cat << EOF > /etc/systemd/system/containerd.service.d/10-compat-symlink.conf
[Service]
ExecStartPre=/bin/ln -sf /run/containerd/containerd.sock /run/dockershim.sock
EOF
if [[ -n "$CONTAINERD_CONFIG_FILE" ]]; then
sudo cp -v $CONTAINERD_CONFIG_FILE /etc/eks/containerd/containerd-config.toml
fi
echo "$(jq '.cgroupDriver="systemd"' $KUBELET_CONFIG)" > $KUBELET_CONFIG
sudo sed -i s,SANDBOX_IMAGE,$PAUSE_CONTAINER,g /etc/eks/containerd/containerd-config.toml
sudo cp -v /etc/eks/containerd/containerd-config.toml /etc/containerd/config.toml
sudo cp -v /etc/eks/containerd/sandbox-image.service /etc/systemd/system/sandbox-image.service

# Check if the containerd config file is the same as the one used in the image build.
# If different, then restart containerd w/ proper config
if ! cmp -s /etc/eks/containerd/containerd-config.toml /etc/containerd/config.toml; then
sudo cp -v /etc/eks/containerd/containerd-config.toml /etc/containerd/config.toml
sudo cp -v /etc/eks/containerd/sandbox-image.service /etc/systemd/system/sandbox-image.service
sudo chown root:root /etc/systemd/system/sandbox-image.service
systemctl daemon-reload
systemctl enable containerd sandbox-image
systemctl restart sandbox-image containerd
fi
sudo cp -v /etc/eks/containerd/kubelet-containerd.service /etc/systemd/system/kubelet.service
sudo chown root:root /etc/systemd/system/kubelet.service
sudo chown root:root /etc/systemd/system/sandbox-image.service
# Validate containerd config
sudo containerd config dump > /dev/null
systemctl daemon-reload
systemctl enable containerd
systemctl restart containerd
systemctl enable sandbox-image
systemctl start sandbox-image

elif [[ "$CONTAINER_RUNTIME" = "dockerd" ]]; then
mkdir -p /etc/docker
bash -c "/sbin/iptables-save > /etc/sysconfig/iptables"
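
The `cmp -s` guard in the hunk above copies the config and restarts services only when the rendered file differs from the installed one. A minimal sketch of that pattern follows, using temporary files in place of the real paths and a counter (`restarts`, a stand-in introduced here) in place of the `systemctl` calls:

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir="$(mktemp -d)"
trap 'rm -rf "${workdir}"' EXIT

# Stand-ins for /etc/eks/containerd/containerd-config.toml and
# /etc/containerd/config.toml; the file contents are illustrative only.
src="${workdir}/containerd-config.toml"
dst="${workdir}/config.toml"
printf 'sandbox_image = "example/pause:3.5"\n' > "${src}"
printf 'sandbox_image = "SANDBOX_IMAGE"\n' > "${dst}"

restarts=0
sync_config() {
  # cmp -s is silent and returns non-zero when the files differ.
  if ! cmp -s "${src}" "${dst}"; then
    cp "${src}" "${dst}"
    restarts=$((restarts + 1))   # the real script restarts containerd here
  fi
}

sync_config   # files differ: copy and "restart"
sync_config   # files identical: nothing to do
echo "restarts: ${restarts}"
```

The second call is a no-op, which is the point of the guard: when the installed config already matches, containerd is not restarted.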
54 changes: 54 additions & 0 deletions files/get-ecr-uri.sh
@@ -0,0 +1,54 @@
#!/usr/bin/env bash
set -euo pipefail

# More details about the mappings in this file can be found here https://docs.aws.amazon.com/eks/latest/userguide/add-ons-images.html

region=$1
aws_domain=$2
if [[ $# -eq 3 ]] && [[ ! -z $3 ]]; then
acct=$3
else
case "${region}" in
ap-east-1)
acct="800184023465"
;;
me-south-1)
acct="558608220178"
;;
cn-north-1)
acct="918309763551"
;;
cn-northwest-1)
acct="961992271922"
;;
us-gov-west-1)
acct="013241004608"
;;
us-gov-east-1)
acct="151742754352"
;;
us-iso-east-1)
acct="725322719131"
;;
us-isob-east-1)
acct="187977181151"
;;
af-south-1)
acct="877085696533"
;;
eu-south-1)
acct="590381155156"
;;
ap-southeast-3)
acct="296578399912"
;;
me-central-1)
acct="759879836304"
;;
*)
acct="602401143452"
;;
esac
fi

echo "${acct}.dkr.ecr.${region}.${aws_domain}"
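
To illustrate how the script resolves a registry URI, here is a condensed stand-in that keeps only two of the mappings. The function name `ecr_uri` is an assumption for this sketch, and `111122223333` is a placeholder account ID, not a real registry.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Condensed stand-in for files/get-ecr-uri.sh with only two mappings kept.
# Arguments: region, AWS services domain, optional account ID override.
ecr_uri() {
  local region="$1" aws_domain="$2" acct="${3:-}"
  if [[ -z "${acct}" ]]; then
    case "${region}" in
      us-gov-west-1) acct="013241004608" ;;
      *) acct="602401143452" ;;
    esac
  fi
  echo "${acct}.dkr.ecr.${region}.${aws_domain}"
}

default_uri="$(ecr_uri us-west-2 amazonaws.com)"
gov_uri="$(ecr_uri us-gov-west-1 amazonaws.com)"
override_uri="$(ecr_uri us-west-2 amazonaws.com 111122223333)"
printf '%s\n' "${default_uri}" "${gov_uri}" "${override_uri}"
```

An empty third argument behaves like a missing one, which matches how `bootstrap.sh` passes `"${PAUSE_CONTAINER_ACCOUNT:-}"`.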
27 changes: 27 additions & 0 deletions files/pull-image.sh
@@ -0,0 +1,27 @@
#!/usr/bin/env bash

img=$1
region=$(echo "${img}" | cut -f4 -d ".")
MAX_RETRIES=3

function retry() {
local rc=0
for attempt in $(seq 0 $MAX_RETRIES); do
rc=0
[[ $attempt -gt 0 ]] && echo "Attempt $attempt of $MAX_RETRIES" 1>&2
"$@"
rc=$?
[[ $rc -eq 0 ]] && break
[[ $attempt -eq $MAX_RETRIES ]] && exit $rc
local jitter=$((1 + RANDOM % 10))
local sleep_sec="$(($((5 << $((1 + $attempt)))) + $jitter))"
sleep $sleep_sec
done
}

ecr_password=$(retry aws ecr get-login-password --region $region)
if [[ -z ${ecr_password} ]]; then
echo >&2 "Unable to retrieve the ECR password."
exit 1
fi
retry sudo ctr --namespace k8s.io image pull "${img}" --user AWS:${ecr_password}
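
The backoff in `retry` doubles on each failed attempt. The sketch below prints only the base delays, using the same shift expression as the script; the 1 to 10 seconds of random jitter is left out so the output is deterministic. `base_delay` is a name introduced for this example.

```shell
#!/usr/bin/env bash
set -euo pipefail

MAX_RETRIES=3

# Base delay in seconds after a failed attempt, before jitter is added.
# Same expression as in pull-image.sh: 5 << (1 + attempt).
base_delay() {
  local attempt="$1"
  echo $((5 << (1 + attempt)))
}

# Attempts 0 .. MAX_RETRIES-1 are followed by a sleep; the final attempt
# exits with the command's return code instead of sleeping.
schedule=""
for attempt in $(seq 0 $((MAX_RETRIES - 1))); do
  schedule+="$(base_delay "${attempt}") "
done
echo "base delays: ${schedule}"
```

With `MAX_RETRIES=3` the command runs up to four times, and the base waits between attempts are 10, 20 and 40 seconds.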
28 changes: 3 additions & 25 deletions files/pull-sandbox-image.sh
@@ -1,27 +1,5 @@
#!/usr/bin/env bash
set -euo pipefail

### fetching sandbox image from /etc/containerd/config.toml
sandbox_image=$(awk -F'[ ="]+' '$1 == "sandbox_image" { print $2 }' /etc/containerd/config.toml)
region=$(echo "$sandbox_image" | cut -f4 -d ".")
ecr_password=$(aws ecr get-login-password --region $region)
API_RETRY_ATTEMPTS=5

for attempt in $(seq 0 $API_RETRY_ATTEMPTS); do
rc=0
if [[ $attempt -gt 0 ]]; then
echo "Attempt $attempt of $API_RETRY_ATTEMPTS"
fi
### pull sandbox image from ecr
### username will always be constant i.e; AWS
sudo ctr --namespace k8s.io image pull $sandbox_image --user AWS:$ecr_password
rc=$?
if [[ $rc -eq 0 ]]; then
break
fi
if [[ $attempt -eq $API_RETRY_ATTEMPTS ]]; then
exit $rc
fi
jitter=$((1 + RANDOM % 10))
sleep_sec="$(($((5 << $((1 + $attempt)))) + $jitter))"
sleep $sleep_sec
done
sandbox_image="$(awk -F'[ ="]+' '$1 == "sandbox_image" { print $2 }' /etc/containerd/config.toml)"
/etc/eks/containerd/pull-image.sh "${sandbox_image}"
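
The two parsing steps involved here, the `awk` lookup of `sandbox_image` and the `cut` call that `pull-image.sh` uses to derive the region, can be tried against a throwaway config file. The file contents below are an illustrative assumption, not the repository's actual `config.toml`.

```shell
#!/usr/bin/env bash
set -euo pipefail

config="$(mktemp)"
trap 'rm -f "${config}"' EXIT

# Illustrative config only; the real file is /etc/containerd/config.toml.
cat > "${config}" <<'EOF'
version = 2
sandbox_image = "602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/pause:3.5"
EOF

# Split on runs of spaces, '=' and '"'; the value is then the second field.
# Note: this matches only when the key starts at the beginning of the line.
sandbox_image="$(awk -F'[ ="]+' '$1 == "sandbox_image" { print $2 }' "${config}")"

# The region is the fourth dot-separated component of the registry host.
region="$(echo "${sandbox_image}" | cut -f4 -d ".")"

echo "image:  ${sandbox_image}"
echo "region: ${region}"
```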