Contributor

@KCSesh KCSesh commented Apr 26, 2025

Issue number:

Closes #256

Description of changes:
Add containerd 2.0 as a package

Major Diffs:

Testing done:

containerd 2.0 testing

0. Version running:

bash-5.1# ctr --version
ctr github.com/containerd/containerd/v2 2.0.5+bottlerocket
bash-5.1# containerd --version
containerd github.com/containerd/containerd/v2 2.0.5+bottlerocket fb4c30d4ede3531652d86197bf3fc9515e5276d9

Also from a cluster:

➜  git:(master) ✗ kubectl get nodes -o wide
NAME                                           STATUS        OS-IMAGE                                       KERNEL-VERSION   CONTAINER-RUNTIME
ip-<>.us-west-2.compute.internal.              Ready        Bottlerocket OS 1.38.0 (aws-k8s-1.32)          6.1.132          containerd://2.0.5+bottlerocket

1. Conformance test results on different Kernels:

Variant Name                  Test Status  Passed  Failed  Skipped
aarch64-aws-k8s-126           passed       371     0       6701
aarch64-aws-k8s-126-nvidia    passed       371     0       6701
aarch64-aws-k8s-132           passed       415     0       6213
aarch64-aws-k8s-132-fips      passed       415     0       6213
aarch64-aws-k8s-132-nvidia    passed       415     0       6213
x86-64-aws-k8s-126            passed       371     0       6701
x86-64-aws-k8s-126-nvidia     passed       371     0       6701
x86-64-aws-k8s-132            passed       415     0       6213
x86-64-aws-k8s-132-fips       passed       415     0       6213
x86-64-aws-k8s-132-nvidia     passed       415     0       6213
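
For anyone re-running the conformance suite, here is a minimal Python sketch of how the summary above could be checked automatically. It is not part of this PR; the `parse_results` helper and the `RESULTS` string are illustrative names, and the rows are copied from the results above.

```python
# Rows copied from the conformance summary in this PR description.
RESULTS = """\
aarch64-aws-k8s-126 passed 371 0 6701
aarch64-aws-k8s-126-nvidia passed 371 0 6701
aarch64-aws-k8s-132 passed 415 0 6213
aarch64-aws-k8s-132-fips passed 415 0 6213
aarch64-aws-k8s-132-nvidia passed 415 0 6213
x86-64-aws-k8s-126 passed 371 0 6701
x86-64-aws-k8s-126-nvidia passed 371 0 6701
x86-64-aws-k8s-132 passed 415 0 6213
x86-64-aws-k8s-132-fips passed 415 0 6213
x86-64-aws-k8s-132-nvidia passed 415 0 6213
"""


def parse_results(text):
    """Turn 'variant status passed failed skipped' rows into dicts (hypothetical helper)."""
    rows = []
    for line in text.strip().splitlines():
        variant, status, passed, failed, skipped = line.split()
        rows.append({
            "variant": variant,
            "status": status,
            "passed": int(passed),
            "failed": int(failed),
            "skipped": int(skipped),
        })
    return rows


rows = parse_results(RESULTS)
# A variant is a problem if its status is not "passed" or it reports any failure.
problems = [r["variant"] for r in rows if r["status"] != "passed" or r["failed"] != 0]
print(f"{len(rows)} variants checked, {len(problems)} with problems")
```

The check only looks at the reported status and failure count; it does not say anything about which tests were skipped.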

2. Load test

Ran an internal load test on the k8s 1.32 AMI for 72 hours (so far; will update). Did not see any issues.
All metrics were stable during the testing:

No node failure.
No pod failure.
Network throughput is within the normal band.

3. NRI

Tested NRI Plugin: https://containers.github.io/nri-plugins/stable/docs/resource-policy/policy/topology-aware.html
Deployed 3 containers with separate resource requests.

Before enabling the NRI plugin, on x1e.32xlarge, inspected /proc/self/status in the containers:

* resources for multicontainer:c0-burstable
    Cpus_allowed_list:  0-127
    Mems_allowed_list:  0-3
* resources for multicontainer:c1-guaranteed
    Cpus_allowed_list:  0-127
    Mems_allowed_list:  0-3
* resources for multicontainer:c2-besteffort
    Cpus_allowed_list:  0-127
    Mems_allowed_list:  0-3

After enabling the NRI plugin, on x1e.32xlarge, inspected /proc/self/status in the containers:

* resources for multicontainer:c0-burstable
    Cpus_allowed_list:  16-31,80-95
    Mems_allowed_list:  1
* resources for multicontainer:c1-guaranteed
    Cpus_allowed_list:  32-47,96-111
    Mems_allowed_list:  2
* resources for multicontainer:c2-besteffort
    Cpus_allowed_list:  1-127
    Mems_allowed_list:  0-3
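
To make the before/after comparison easier to reason about, here is a minimal Python sketch that expands the `Cpus_allowed_list` values shown above into CPU sets and checks that the burstable and guaranteed containers no longer share CPUs. It is not part of this PR; `expand_cpu_list` is a hypothetical helper, and the input strings are copied from the "after" output.

```python
def expand_cpu_list(spec):
    """Expand a kernel-style CPU list such as '16-31,80-95' into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


# Values copied from the "after" /proc/self/status output above.
c0_burstable = expand_cpu_list("16-31,80-95")
c1_guaranteed = expand_cpu_list("32-47,96-111")
c2_besteffort = expand_cpu_list("1-127")

print(len(c0_burstable), len(c1_guaranteed), len(c2_besteffort))
print("c0 and c1 disjoint:", c0_burstable.isdisjoint(c1_guaranteed))
```

With the "before" value of `0-127` for every container, the same disjointness check would fail, which is the difference the topology-aware policy is expected to produce.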

On m5.large and g5.large, there was no change before and after applying the topology-aware policy.

On m6g.xlarge, verified a simple self-created hello-world NRI plugin.

Recursive Read-only (RRO) mounts:

Verified new RRO feature:

/ # mount | grep mnt
/dev/nvme1n1p1 on /mnt-rro type xfs (ro,seclabel,nosuid,nodev,noatime,attr2,inode64,logbufs=8,logbsize=32k,sunit=8,swidth=8,noquota)
none on /mnt-rro/tmpfs type tmpfs (ro,seclabel,relatime)
/dev/nvme1n1p1 on /mnt-ro type xfs (ro,seclabel,noatime,attr2,inode64,logbufs=8,logbsize=32k,sunit=8,swidth=8,noquota)
none on /mnt-ro/tmpfs type tmpfs (rw,seclabel,relatime)
/dev/nvme1n1p1 on /mnt-rw type xfs (rw,seclabel,noatime,attr2,inode64,logbufs=8,logbsize=32k,sunit=8,swidth=8,noquota)
none on /mnt-rw/tmpfs type tmpfs (rw,seclabel,relatime)
/ # touch /mnt-rro/tmpfs/test1
touch: /mnt-rro/tmpfs/test1: Read-only file system
/ # touch /mnt-ro/tmpfs/test1
/ # touch /mnt-rw/tmpfs/test2
/ # ls /mnt-rro/tmpfs/
test1  test2
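
The same verification can be expressed as a check over the `mount` output. The sketch below is a minimal Python example, not part of this PR: it assumes the `source on target type fstype (options)` line format shown above, and `parse_mounts` and `is_recursively_read_only` are hypothetical helper names. A mount point counts as recursively read-only only if it and every mount nested under it carry the `ro` option.

```python
import re

# Matches lines such as: none on /mnt-rro/tmpfs type tmpfs (ro,seclabel,relatime)
MOUNT_LINE = re.compile(r"^(\S+) on (\S+) type (\S+) \(([^)]*)\)$")

# Copied from the `mount | grep mnt` output above.
MOUNT_OUTPUT = """\
/dev/nvme1n1p1 on /mnt-rro type xfs (ro,seclabel,nosuid,nodev,noatime,attr2,inode64,logbufs=8,logbsize=32k,sunit=8,swidth=8,noquota)
none on /mnt-rro/tmpfs type tmpfs (ro,seclabel,relatime)
/dev/nvme1n1p1 on /mnt-ro type xfs (ro,seclabel,noatime,attr2,inode64,logbufs=8,logbsize=32k,sunit=8,swidth=8,noquota)
none on /mnt-ro/tmpfs type tmpfs (rw,seclabel,relatime)
/dev/nvme1n1p1 on /mnt-rw type xfs (rw,seclabel,noatime,attr2,inode64,logbufs=8,logbsize=32k,sunit=8,swidth=8,noquota)
none on /mnt-rw/tmpfs type tmpfs (rw,seclabel,relatime)
"""


def parse_mounts(text):
    """Return (target, fstype, options) tuples for each recognizable mount line."""
    mounts = []
    for line in text.strip().splitlines():
        match = MOUNT_LINE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a mount line
        _source, target, fstype, options = match.groups()
        mounts.append((target, fstype, set(options.split(","))))
    return mounts


def is_recursively_read_only(mounts, root):
    """True if root and every mount nested below it are mounted with 'ro'."""
    subtree = [m for m in mounts if m[0] == root or m[0].startswith(root + "/")]
    return bool(subtree) and all("ro" in options for _, _, options in subtree)


mounts = parse_mounts(MOUNT_OUTPUT)
for root in ("/mnt-rro", "/mnt-ro", "/mnt-rw"):
    print(root, is_recursively_read_only(mounts, root))
```

This only inspects the mount options as seen from inside the container; the `touch` attempts above remain the direct proof that writes are rejected.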

Variant Build test:

  • Variant included packages test: containerd-2.0 on x86
bash-5.1# ls /usr/lib/systemd/system/containerd.service.d/
005-disable-pigz.conf
bash-5.1# ctr --version
ctr github.com/containerd/containerd/v2 2.0.5+bottlerocket

  • Variant included packages test: containerd-2.0(optimized-gunzip) on aarch64
bash-5.1# ls /usr/lib/systemd/system/containerd.service.d/
005-disable-igzip.conf
bash-5.1# ctr --version
ctr github.com/containerd/containerd/v2 2.0.5+bottlerocket

  • Variant included packages test: containerd-2.0(optimized-gunzip) on x86
bash-5.1# ls /usr/lib/systemd/system/containerd.service.d/
005-disable-pigz.conf
bash-5.1# ctr --version
ctr github.com/containerd/containerd/v2 2.0.5+bottlerocket

  • Variant included packages test: containerd-pigz
bash-5.1# ls /usr/lib/systemd/system/containerd.service.d/
005-disable-igzip.conf
bash-5.1# ctr --version
ctr github.com/containerd/containerd 1.7.27+bottlerocket

  • Variant included packages test: containerd-2.0 and containerd-pigz
bash-5.1# ls /usr/lib/systemd/system/containerd.service.d/
005-disable-igzip.conf
bash-5.1#  ctr --version
ctr github.com/containerd/containerd/v2 2.0.5+bottlerocket
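
The combinations above can be summarized as a small matrix check. This is a minimal Python sketch, not part of this PR: the `OBSERVED` rows are copied from the transcripts above, and `expected_major` encodes my assumption that any variant including containerd-2.0 should report a 2.x `ctr`, while one without it should report 1.x. Where the description does not state the architecture, the entry is left as `None`.

```python
V2_LINE = "ctr github.com/containerd/containerd/v2 2.0.5+bottlerocket"
V1_LINE = "ctr github.com/containerd/containerd 1.7.27+bottlerocket"

# (included packages, architecture, drop-in found, `ctr --version` output)
OBSERVED = [
    ("containerd-2.0", "x86", "005-disable-pigz.conf", V2_LINE),
    ("containerd-2.0(optimized-gunzip)", "aarch64", "005-disable-igzip.conf", V2_LINE),
    ("containerd-2.0(optimized-gunzip)", "x86", "005-disable-pigz.conf", V2_LINE),
    ("containerd-pigz", None, "005-disable-igzip.conf", V1_LINE),
    ("containerd-2.0 and containerd-pigz", None, "005-disable-igzip.conf", V2_LINE),
]


def ctr_major(version_line):
    """Major version from a `ctr --version` line (third whitespace-separated field)."""
    return int(version_line.split()[2].split(".")[0])


def expected_major(packages):
    """Assumption: containerd-2.0 in the package list implies a 2.x ctr binary."""
    return 2 if "containerd-2.0" in packages else 1


mismatches = [row for row in OBSERVED if ctr_major(row[3]) != expected_major(row[0])]
print(f"{len(OBSERVED)} combinations checked, {len(mismatches)} mismatches")
```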

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@KCSesh KCSesh changed the title Add containerd 2.0 as a package Add containerd 2.0 Apr 26, 2025
@KCSesh KCSesh mentioned this pull request Apr 28, 2025
2 tasks
@KCSesh KCSesh marked this pull request as ready for review April 29, 2025 22:11
@KCSesh KCSesh force-pushed the ctr-2 branch 2 times, most recently from 76dbdc6 to be95490 Compare April 30, 2025 22:08
Contributor Author

KCSesh commented May 1, 2025

^ Pushed the suggested changes. Still open to doing further pause image verification.

Contributor

@bcressey bcressey left a comment

It's hard to comment on containerd-1.7.spec, but most of these fixes are needed there also.

KCSesh added 2 commits May 2, 2025 03:55
Signed-off-by: Kyle Sessions <kssessio@amazon.com>
Signed-off-by: Kyle Sessions <kssessio@amazon.com>
Contributor Author

KCSesh commented May 2, 2025

^ Force-pushed the suggested updates and added a "Variant Build test:" section to the PR description to verify the different combinations of containerd and decompressors.

Also replied on the pause container threads.

LimitCORE=infinity
LimitNOFILE=infinity
TasksMax=infinity
OOMScoreAdjust=-999
Contributor

nit: this would be good to apply to containerd 1.7 also

Since we've been burned by similar changes in the past, where service changes "leak" across to containers, it would be good to double check that OOM scores for pod cgroups aren't affected.

The output of this should not look very different before and after:

head /proc/*/oom_score_adj

Contributor Author

Before: launched a simple pod with containerd 1.7:

sh-5.2# head /proc/*/oom_score_adj
==> /proc/1/oom_score_adj <==
981

==> /proc/23/oom_score_adj <==
981

==> /proc/24/oom_score_adj <==
1000

==> /proc/25/oom_score_adj <==
981

==> /proc/self/oom_score_adj <==
981

==> /proc/thread-self/oom_score_adj <==
981

After: same pod on containerd 1.7 with OOMScoreAdjust=-999:

sh-5.2# head /proc/*/oom_score_adj
==> /proc/1/oom_score_adj <==
981

==> /proc/23/oom_score_adj <==
981

==> /proc/24/oom_score_adj <==
1000

==> /proc/25/oom_score_adj <==
981

==> /proc/self/oom_score_adj <==
981

==> /proc/thread-self/oom_score_adj <==
981
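
For larger pods, eyeballing the two listings gets tedious, so here is a minimal Python sketch of how the before/after `head /proc/*/oom_score_adj` outputs could be compared. It is not part of this PR; `parse_head_output` is a hypothetical helper, and the sample text is copied from the outputs above. It assumes the PIDs are the same in both runs, as they are here.

```python
import re

# `head` prints a "==> path <==" header before the contents of each file.
HEADER = re.compile(r"^==> (\S+) <==$")

# Copied from the containerd 1.7 output above; the run with
# OOMScoreAdjust=-999 printed the same values.
BEFORE = """\
==> /proc/1/oom_score_adj <==
981

==> /proc/23/oom_score_adj <==
981

==> /proc/24/oom_score_adj <==
1000

==> /proc/25/oom_score_adj <==
981

==> /proc/self/oom_score_adj <==
981

==> /proc/thread-self/oom_score_adj <==
981
"""
AFTER = BEFORE


def parse_head_output(text):
    """Map each file path in `head` output to the integer on its first content line."""
    scores = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            current = match.group(1)
        elif line and current is not None:
            scores[current] = int(line)
            current = None
    return scores


before = parse_head_output(BEFORE)
after = parse_head_output(AFTER)
# Collect every path whose score differs or that exists in only one listing.
changed = {
    path: (before.get(path), after.get(path))
    for path in before.keys() | after.keys()
    if before.get(path) != after.get(path)
}
print("changed entries:", changed)
```

An empty `changed` mapping corresponds to the result reported here: the service-level OOMScoreAdjust did not leak into the pod's processes.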

"packages/conntrack-tools",
"packages/containerd",
"packages/containerd-1.7",
"packages/containerd-2.0",

Is the plan to add another package for containerd-2.1, and all the subsequent 2.x minor versions too?

Contributor Author

If we have multiple versions of containerd in the core kit, yes, each will get its own versioned package.
I think I'd like us to work towards getting back to one containerd package, eventually.

Signed-off-by: Kyle Sessions <kssessio@amazon.com>
Contributor Author

KCSesh commented May 2, 2025

^ Added OOMScoreAdjust to containerd 1.7 and fixed the spec comment.

Signed-off-by: Kyle Sessions <kssessio@amazon.com>
Contributor Author

KCSesh commented May 3, 2025

^ Added a missing comment and put together a gist of the overall diff.


@henry118 henry118 left a comment

LGTM

@KCSesh KCSesh merged commit 09e7e0a into bottlerocket-os:develop May 5, 2025
2 checks passed

Development

Successfully merging this pull request may close these issues.

Upgrade containerd to v2

4 participants