Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

registry.k8s.io/liveness image in liveness example only works on AMD64 CPUs #45809

Closed
spurin opened this issue Apr 9, 2024 · 17 comments
Closed
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@spurin
Copy link
Contributor

spurin commented Apr 9, 2024

This is a Bug Report

Problem:

The liveness example as per https://github.com/kubernetes/website/blob/main/content/en/examples/pods/probe/http-liveness.yaml -

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - name: liveness
    image: registry.k8s.io/liveness
    args:
    - /server
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
        httpHeaders:
        - name: Custom-Header
          value: Awesome
      initialDelaySeconds: 3
      periodSeconds: 3

Makes use of image registry.k8s.io/liveness. This fails when running on arm based architectures like apple silicon.

Logs as follows when attempting to use this example -

# kubectl logs pod/liveness-http
runtime: failed to create new OS thread (have 2 already; errno=22)
fatal error: runtime.newosproc

goroutine 16 [running]:
runtime.throw(0x7de6c6)
        /usr/lib/google-golang/src/pkg/runtime/panic.c:520 +0x69 fp=0x2aaaab45fec0 sp=0x2aaaab45fea8
runtime.newosproc(0x4c208016000, 0x2aaaab47dfa8)
        /usr/lib/google-golang/src/pkg/runtime/os_linux.c:150 +0xfa fp=0x2aaaab45ff10 sp=0x2aaaab45fec0
newm(0x4158e0, 0x0)
        /usr/lib/google-golang/src/pkg/runtime/proc.c:933 +0xc9 fp=0x2aaaab45ff50 sp=0x2aaaab45ff10
runtime.main()
        /usr/lib/google-golang/src/pkg/runtime/proc.c:219 +0x3c fp=0x2aaaab45ffa8 sp=0x2aaaab45ff50
runtime.goexit()
        /usr/lib/google-golang/src/pkg/runtime/proc.c:1445 fp=0x2aaaab45ffb0 sp=0x2aaaab45ffa8
created by _rt0_go
        /usr/lib/google-golang/src/pkg/runtime/asm_amd64.s:97 +0x12

Proposed Solution:

Ideally, this image should be updated to support multiple architectures. Sadly I've not been able to find either the source code for this image or the owner. It has been in use as a standard for many years.

Alternatively, the creation of a new multi-architecture image that can be used as a drop in replacement may be beneficial.

Page to Update:

https://github.com/kubernetes/website/blob/main/content/en/examples/pods/probe/http-liveness.yaml

@spurin spurin added the kind/bug Categorizes issue or PR as related to a bug. label Apr 9, 2024
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

SIG Docs takes a lead on issue triage for this website, but any Kubernetes member can accept issues by applying the triage/accepted label.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Apr 9, 2024
@spurin
Copy link
Contributor Author

spurin commented Apr 9, 2024

@stmcginnis
Copy link
Contributor

Just noting - we have a lot of recurring issues pop up for ARM (e.g. #45420) due to tutorial images not being multi-arch.

Not sure of the best way to address this holistically, but it does seem there needs to be some larger effort to either replace these tutorial instructions to only use multi-arch images, or have some process in place that maintains and keep these tutorial images up to date so they are more likely to be usable for most people.

@spurin
Copy link
Contributor Author

spurin commented Apr 9, 2024

Thank you @stmcginnis

Agreed, it would be great to have multi-architecture images as standard.

I think this example also highlights another issue and that is the lack of supply chain transparency with some of the images that we readily rely on in the documentation, especially in light of recent events like xz.

Prior to raising this issue, I spent considerable time unsuccessfully looking for the source code for the image in question.

This example is one which I believe would be used by many, given that it's a common example for learning/using liveness/readiness probes, not to mention it being a popular topic for those with the likes of the CKAD examination.

It may be worthwhile us expanding these discussions to include security provisions for efforts/improvements in this space.

@sftim
Copy link
Contributor

sftim commented Apr 10, 2024

Would someone like to request that the tutorials aim to use multi-arch container images? If that's you, you can file a feature request.

@sftim
Copy link
Contributor

sftim commented Apr 10, 2024

The fix for this report is to say that the example only works on AMD64 CPUs.

@spurin
Copy link
Contributor Author

spurin commented Apr 10, 2024

Thanks for your feedback @sftim

I'll update the issue as per your feedback and I would also like to pick up the feature request for this.

@spurin spurin changed the title registry.k8s.io/liveness image in liveness example fails on arm based architectures registry.k8s.io/liveness image in liveness example only works on AMD64 CPUs Apr 10, 2024
@spurin
Copy link
Contributor Author

spurin commented Apr 10, 2024

@sftim - For this issue where registry.k8s.io/liveness is not available as a multi-arch image, I was able to find the source code for the container image in question but I was not able to build an image from it.

It looks as if historically the file referenced for this was part of a bigger code base/toolset and even though it's referenced in documentation, it doesn't build as is without modifications.

This is the referenced source code for the liveness image

I simplified this and created an equivalent version which satisfies the test and at the same time, have built this as a multi-arch container image. Details as follows -

Source Code: https://github.com/spurin/liveness
Docker Hub: https://hub.docker.com/r/spurin/liveness

Would it be acceptable for me raise a PR changing the current documentation image from registry.k8s.io/liveness to spurin/liveness or is there a specific requirement to have referenced images like this on registry.k8s.io

Thanks

@sftim
Copy link
Contributor

sftim commented Apr 10, 2024

Would it be acceptable for me raise a PR changing the current documentation image from registry.k8s.io/liveness to spurin/liveness or is there a specific requirement to have referenced images like this on registry.k8s.io

I don't think we'd merge that. We prefer to reference images from a supply chain with obvious grounds that we can trust it, and in general one contributor's personal set of images doesn't pass the quality bar.

How about opening a PR against https://github.com/kubernetes/kubernetes/ instead?

Also see https://github.com/kubernetes/kubernetes/blob/master/test/images/agnhost/README.md

@spurin
Copy link
Contributor Author

spurin commented Apr 10, 2024

Thanks for your candour @sftim, I thought this was the case but was also interested in this from a self learning viewpoint for some exam study so not all lost, my own personal testing image for arm in the interim :-)

I found the source code for the liveness image in that subset that you've referenced but I wasn't able to find anything relating to the build of the registry.k8s.io/liveness image. In it's current form it also seems to be a module (package liveness) rather than a direct executable and I'm guessing there was some kind of additional wrapper code to get it to where it is today.

It may be best, starting with the ownership/management of that registry.k8s.io/liveness image and working backwards. If you have any recommendations on who to reach out to on this I'll follow up accordingly.

@sftim
Copy link
Contributor

sftim commented Apr 10, 2024

Start with #sig-release on the Kubernetes Slack (need an invitation? https://slack.k8s.io/)

@dims
Copy link
Member

dims commented Apr 11, 2024

@spurin
Copy link
Contributor Author

spurin commented Apr 11, 2024

Thanks for your feedback @dims

I'm unsure historically, how we got from https://github.com/kubernetes/kubernetes/blob/master/test/images/agnhost/liveness/server.go

to the container image of registry.k8s.io/liveness ?

The agnhost container image appears to be independent of this. It accepts parameters. An example is shown for liveness under the main root of that source tree -

kubectl exec test-agnhost -- /agnhost liveness

The current readiness/liveness example in the documentation doesn't appear to be using the agnhost image in this way, it looks as if it is an independent image -

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - name: liveness
    image: registry.k8s.io/liveness
    args:
    - /server
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
        httpHeaders:
        - name: Custom-Header
          value: Awesome
      initialDelaySeconds: 3
      periodSeconds: 3

I was able to get the agnhost container image working with the liveness/readiness example with relative ease and this image is already multi-arch 👍 Here's the yaml tested successfully on arm64 -

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - name: liveness
    image: registry.k8s.io/e2e-test-images/agnhost:2.40
    args:
    - liveness
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
        httpHeaders:
        - name: Custom-Header
          value: Awesome
      initialDelaySeconds: 3
      periodSeconds: 3

If this is a positive direction, I can raise a PR to change the existing liveness example to use the agnhost image. Please share your thoughts.

@sftim
Copy link
Contributor

sftim commented Apr 11, 2024

If this is a positive direction, I can raise a PR to change the existing liveness example to use the agnhost image. Please share your thoughts.

Yep, please go ahead with that.

@dims
Copy link
Member

dims commented Apr 11, 2024

@spurin we went the other way registry.k8s.io/liveness was the really one old one and we consolidated a bunch of images into agnhost. we just forgot to update the documention.

yes, please go ahead as @sftim already mentioned.

@spurin
Copy link
Contributor Author

spurin commented Apr 11, 2024

Thanks @dims - appreciate the context and background on this, makes complete sense now :-)

@spurin
Copy link
Contributor Author

spurin commented Apr 18, 2024

Closing this, moved to agnhost image, multi-arch images as standard being discussed here - #45822

@spurin spurin closed this as completed Apr 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.
Projects
None yet
Development

No branches or pull requests

5 participants