Upgrade Base Image to Amazon Linux 2023 #1122
Python 2 is long deprecated and is not in the Amazon Linux 2023 repos
1fc70e1
to
1e431db
Compare
aws-cli v2 automatically uses signature v4: https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingAWSSDK.html#UsingAWSSDK-move-to-Sig4
Some tabs have to be kept because <<-HEREDOCs ignore tabs but not spaces. Also standardise some spacing.
The http calls are nicer because we don't have to format the output with `cut`
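On the `<<-` heredoc point above, a minimal sketch of the behaviour, in case it helps reviewers: `<<-` strips leading tabs from the body and from the closing delimiter, but leaves leading spaces alone. The script text is built with `printf` so the tabs are unambiguous; `heredoc_demo` is a hypothetical name used only for this illustration.

```shell
#!/usr/bin/env bash
# Build a tiny script with printf so the leading tabs (\t) are explicit,
# then run it in a child bash. heredoc_demo is a hypothetical helper name.
heredoc_demo() {
  local script
  script=$(printf 'cat <<-EOF\n\ttab-indented\n    space-indented\n\tEOF\necho done\n')
  bash -c "$script"
}

# The tab before "tab-indented" and before the closing EOF is stripped;
# the four spaces before "space-indented" are kept.
heredoc_demo
```

If the closing `EOF` were indented with spaces instead of a tab, it would not be recognised as the delimiter, which is why some tabs have to stay.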
1e431db
to
8752ac0
Compare
},
{
"type": "shell",
"script": "scripts/upgrade-kernel.sh"
Removed
@@ -1,2 +1,2 @@
buildkite-agent:1001:1
buildkite-agent:993:1
The docker rpm installed from the repos automatically creates a docker
group with this gid.
My spidey senses tingle a bit with this change.
I vaguely remember some fragile things hanging off these subgid
mappings.
But, it's probably fine.
As long as we test it with a rootless container? Maybe via Buildkite's docker plugin or something?
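As a starting point for that kind of check, here is a small sketch that reads the subordinate ID range for a user from a `name:start:count` file such as `/etc/subgid`. It only inspects the mapping; it does not exercise a rootless container. `subid_range` is a hypothetical helper name, and the demo uses a temporary file with made-up contents rather than a real instance.

```shell
#!/usr/bin/env bash
# subid_range USER [FILE]: print "start count" for each of USER's entries.
# Hypothetical helper; FILE defaults to /etc/subgid (format name:start:count).
subid_range() {
  local user=$1 file=${2:-/etc/subgid}
  awk -F: -v u="$user" '$1 == u { print $2, $3 }' "$file"
}

# Demo against a temporary file (illustrative contents only).
demo_file=$(mktemp)
printf 'someone-else:100000:65536\nbuildkite-agent:993:1\n' >"$demo_file"
subid_range buildkite-agent "$demo_file"
rm -f "$demo_file"
```

On a built instance the same function could be pointed at `/etc/subgid` or `/etc/subuid` to confirm the mapping is what the docker group expects.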
@@ -24,4 +24,3 @@ KillMode=process

[Install]
WantedBy=multi-user.target
DefaultInstance=1
This is not needed as this unit is not used as a template. Leaving it in spits out a warning in the journal
cat <<< "$(jq '."data-root"="/mnt/ephemeral/docker"' /etc/docker/daemon.json)" > /etc/docker/daemon.json
fi

# Customise address pools
cat <<<"$(jq '."default-address-pools"=[{"base":"172.17.0.0/12","size":20},{"base":"192.168.0.0/16","size":24}]' /etc/docker/daemon.json)" >/etc/docker/daemon.json

systemctl restart docker
Need to restart docker to get the new settings.
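For readers wondering why `cat <<<"$(jq ... file)" >file` does not clobber the file before it is read: as far as I understand it, bash processes redirections left to right, so the here-string (and the command substitution inside it) is expanded before `>` truncates the file. A minimal sketch of the same idiom follows, with `tr` standing in for `jq` so it runs without extra tools; `rewrite_in_place` is a hypothetical name.

```shell
#!/usr/bin/env bash
# rewrite_in_place FILE: upper-case FILE's contents using the
# read-then-overwrite idiom. Hypothetical helper; tr stands in for jq.
rewrite_in_place() {
  local file=$1
  # The here-string is expanded (running tr on the old contents) before
  # the later `>` redirection truncates the file.
  cat <<<"$(tr 'a-z' 'A-Z' <"$file")" >"$file"
}

tmp=$(mktemp)
echo 'data-root' >"$tmp"
rewrite_in_place "$tmp"
cat "$tmp"
rm -f "$tmp"
```

Swapping the order (`cat >file <<<"$(... file)"`) would truncate first and lose the contents, so the order of the two redirections matters.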
It's meant to be installed locally, not on EC2 instances
These are now systemd timers
Moved to
elastic-ci-stack-for-aws/packer/linux/scripts/install-utils.sh
Lines 30 to 35 in 00e7a8b
GIT_LFS_VERSION=3.3.0
echo "Installing git lfs ${GIT_LFS_VERSION}..."
pushd "$(mktemp -d)"
curl -sSL https://github.com/git-lfs/git-lfs/releases/download/v${GIT_LFS_VERSION}/git-lfs-linux-${ARCH}-v${GIT_LFS_VERSION}.tar.gz | tar xz
sudo git-lfs-${GIT_LFS_VERSION}/install.sh
popd
Moved to
nvme-cli \
Moved to
S3_SECRETS_HELPER_VERSION=2.1.6
echo "Downloading s3-secrets-helper ${S3_SECRETS_HELPER_VERSION}..."
sudo curl -Lsf -o /usr/local/bin/s3secrets-helper \
"https://github.com/buildkite/elastic-ci-stack-s3-secrets-hooks/releases/download/v${S3_SECRETS_HELPER_VERSION}/s3secrets-helper-linux-${ARCH}"
sudo chmod +x /usr/local/bin/s3secrets-helper
This isn't necessary: the Session Manager plugin is used to allow logging into managed nodes with the aws-cli. A buildkite-agent rarely (arguably never) needs to do that. It DOES need to be logged into by SSM, but the ssm-agent is there for that. We install and enable the ssm agent in these lines:
amazon-ssm-agent \
sudo systemctl enable --now amazon-ssm-agent
if [ "${MACHINE}" == "x86_64" ]; then
  echo "Downloading docker-compose..."
  sudo curl -Lsf -o /usr/bin/docker-compose https://github.com/docker/compose/releases/download/${DOCKER_COMPOSE_VERSION}/docker-compose-Linux-x86_64
  sudo chmod +x /usr/bin/docker-compose
  docker-compose --version
elif [[ "${MACHINE}" == "aarch64" ]]; then
  sudo yum install -y gcc-c++ libffi-devel openssl11 openssl11-devel python3-devel

  # docker-compose depends on the cryptography package, v3.4 of which
  # introduces a build dependency on rust; let's avoid that for now.
  # https://github.com/pyca/cryptography/blob/master/CHANGELOG.rst#34---2021-02-07
  # This should be unpinned ASAP; hopefully docker-compose will offer binary
  # download for arm64 at some point:
  # https://github.com/docker/compose/issues/7472
  CONSTRAINT_FILE="/tmp/docker-compose-pip-constraint"
  echo 'cryptography<3.4' >"$CONSTRAINT_FILE"
  echo 'urllib3<2' >"$CONSTRAINT_FILE"
  sudo pip3 install --constraint "$CONSTRAINT_FILE" "docker-compose==${DOCKER_COMPOSE_VERSION}"

  docker-compose version
else
  echo "No docker compose option configured for arch ${MACHINE}"
  exit 1
fi
I haven't read this thoroughly enough to add a green tick, but it's looking most excellent 👍🏼
},
{
"type": "shell",
"script": "scripts/install-nvme-cli.sh"
Most of these have been folded into other scripts.
if [[ "$(uname -m)" == "aarch64" ]]; then
  AGENT_ARCH="arm64"
else
  AGENT_ARCH="amd64"
fi
Elsewhere you've got a nice case statement, how about doing that consistently here?
-if [[ "$(uname -m)" == "aarch64" ]]; then
-  AGENT_ARCH="arm64"
-else
-  AGENT_ARCH="amd64"
-fi
+case $(uname -m) in
+  x86_64) ARCH=amd64;;
+  aarch64) ARCH=arm64;;
+  *) ARCH=unknown;;
+esac
(although this is very nitpicky, feel free to ignore me 😅)
There are many more examples of this sprinkled throughout the code base. I can't accept this as is though: the variable was $AGENT_ARCH, but the suggestion has it as $ARCH. It's probably nicer to standardise on $ARCH if it does not exist in this script, though.
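For reference, a sketch of what the case-statement form could look like while keeping the existing variable name. This is only an illustration, not what was merged: `agent_arch` is a hypothetical function name, and failing on an unrecognised machine type (rather than setting `unknown`) is an assumption on my part.

```shell
#!/usr/bin/env bash
# agent_arch [MACHINE]: map a `uname -m` value to the agent's arch name.
# Hypothetical helper; MACHINE defaults to the current machine type.
agent_arch() {
  local machine=${1:-$(uname -m)}
  case "$machine" in
    x86_64) echo amd64 ;;
    aarch64) echo arm64 ;;
    *)
      echo "unsupported architecture: ${machine}" >&2
      return 1
      ;;
  esac
}

# Example calls with explicit machine types.
agent_arch x86_64
agent_arch aarch64
```

In a script this would be used as `AGENT_ARCH=$(agent_arch) || exit 1`, so an unexpected architecture stops the build instead of producing a bad download URL.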
Ah whoops, i missed that $AGENT_ARCH! Glad someone is paying attention 😅
Moved to
LIFECYCLED_VERSION=v3.3.0
echo "Installing lifecycled ${LIFECYCLED_VERSION}..."
sudo touch /etc/lifecycled
sudo curl -Lf -o /usr/bin/lifecycled \
https://github.com/buildkite/lifecycled/releases/download/${LIFECYCLED_VERSION}/lifecycled-linux-${ARCH}
sudo chmod +x /usr/bin/lifecycled
sudo curl -Lf -o /etc/systemd/system/lifecycled.service \
https://raw.githubusercontent.com/buildkite/lifecycled/${LIFECYCLED_VERSION}/init/systemd/lifecycled.unit
…t instead of "success" Co-authored-by: paul david <423357+toothbrush@users.noreply.github.com>
Ah interesting, and yes i imagine that would be clearest - allowing a wholesale copy-pastable command to be easy to see. Thanks for indulging my nits, i am learning things about goss too 👍
On 7 Jun 2023, at 10:03, Narthana Epa wrote:
@triarius commented on this pull request.
In goss.yaml:
+ # Check docker userns is enabled
+ docker info --format='{{ `{{range .SecurityOptions}}{{if eq . "name=userns"}}{{"success"}}{{end}}{{end}}` }}':
Yeah, and mine has the opposite issue.
I could put the whole line in a nested template, that might be the best of both worlds.
They need to have some level of nested templates because otherwise goss will attempt to evaluate the templates that are intended to be evaluated by docker. Also, the elements of `stdout` are regexes, so we adapt it to test for each expected element of the list. Goss will report which regex did not match the output, so we can use this to determine which plugins are missing. It won't be able to tell if there are other plugins, but that is very much a feature.
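To make the per-regex behaviour concrete, here is a rough shell emulation of the idea (it is not goss itself): each expected pattern is checked against the command output independently, and only the patterns that fail to match are reported. `missing_patterns` and the plugin names in the demo are hypothetical.

```shell
#!/usr/bin/env bash
# missing_patterns OUTPUT PATTERN...: print each extended regex that does
# not match any line of OUTPUT. Hypothetical helper emulating the
# "one regex per expected element" check described above.
missing_patterns() {
  local output=$1
  shift
  local pattern
  for pattern in "$@"; do
    grep -Eq -- "$pattern" <<<"$output" || echo "$pattern"
  done
}

# Demo: made-up plugin list where one expected entry is absent.
plugins=$(printf 'buildx\ncompose\n')
missing_patterns "$plugins" '^buildx$' '^compose$' '^example-plugin$'
```

Extra lines in the output are ignored, which mirrors the point that unexpected additional plugins do not fail the check.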
…all-elastic-stack.sh
db766d3
to
3c7c3a5
Compare
This was removed in the migration to Amazon Linux 2023 in the belief that there was no use case for the Session Manager Plugin being present on the agent [1]. It has since been realised that it is quite useful. For example, the agent may be used to start ECS Tasks which perform work in other environments/network configurations (such as database migrations during deployments).
[1] #1122 (comment)
Thanks for bearing with us everyone, we have finally gotten around to baking Amazon Linux 2023 base images and getting them to run builds!
Although @toothbrush made a valiant effort in #1103, both their attempt and our previous attempts were DoA, as the user data startup scripts were failing for various reasons. Thus, instances would be marked as unhealthy soon after they booted, and the ASG would continuously boot new instances to replace them.
While fixing this, I took the opportunity to upgrade a few packages, and I've preferred to install these through the AL2023 repos as much as possible. Some older packages have been replaced where their functionality was provided by other packages that are available by default.
Consider adding these to the changelog at the time of release.
Added
Changed
Upgraded
Removed
CI