Getting the Id of an image and not just the name:tag #11347

Open
afbjorklund opened this issue May 8, 2021 · 4 comments
Labels
area/image Issues/PRs related to the minikube image subcommand kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence.

Comments

@afbjorklund
Collaborator

afbjorklund commented May 8, 2021

Currently we are using the name:tag of an image, which has some issues:

  • it fails to reload changed images, such as :latest
  • it fails to recognize re-tagged images as duplicates

Previously we have also used digests, which are another can of worms:

  • they are not preserved when saving to an archive
  • they vary depending on the registry, and require network access

So it would be better to add support for the "id" to our image loading code.

This is calculated from the contents of the image itself, and is also available in CRI.

It looks something like this: sha256:c55b0f125dc65ee6a9a78307d9a2dfc446e96af7477ca29ddd4945fd398cc698

Instead of busybox:latest (short name) or docker.io/library/busybox:latest (canonical name)

Then we can compare this with crictl images

(or docker images, when not using CRI)
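As a rough sketch of that comparison, the check could parse the JSON form of the image list and look for the id, ignoring the tags. The field names `images`, `id` and `repoTags` are assumed to match what `crictl images -o json` prints, and `hasImageID` is a hypothetical helper name, not existing minikube code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// criImage holds the fields of interest from one entry of the image list.
// The JSON field names are an assumption about the crictl output.
type criImage struct {
	ID       string   `json:"id"`
	RepoTags []string `json:"repoTags"`
}

type criImageList struct {
	Images []criImage `json:"images"`
}

// hasImageID reports whether the runtime already holds an image with the
// given id, regardless of which tags point at it.
func hasImageID(listJSON []byte, id string) (bool, error) {
	var list criImageList
	if err := json.Unmarshal(listJSON, &list); err != nil {
		return false, err
	}
	for _, img := range list.Images {
		if img.ID == id {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Sample data shaped like the assumed crictl output.
	sample := []byte(`{"images":[{"id":"sha256:c55b0f125dc65ee6a9a78307d9a2dfc446e96af7477ca29ddd4945fd398cc698","repoTags":["docker.io/library/busybox:latest"]}]}`)
	ok, err := hasImageID(sample, "sha256:c55b0f125dc65ee6a9a78307d9a2dfc446e96af7477ca29ddd4945fd398cc698")
	fmt.Println(ok, err)
}
```

Matching on the id means a changed :latest is seen as a new image, and a re-tagged image is seen as already present.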

So we should avoid digests, which are confusing because they look similar to ids:

$ docker pull busybox:latest
latest: Pulling from library/busybox
Digest: sha256:be4684e4004560b2cd1f12148b7120b0ea69c385bcc9b12a637537a2c60f97fb
Status: Image is up to date for busybox:latest
docker.io/library/busybox:latest

And instead use the "image id", for telling two images apart from each other:

$ docker images busybox:latest
REPOSITORY   TAG       IMAGE ID       CREATED      SIZE
busybox      latest    c55b0f125dc6   4 days ago   1.24MB
$ docker image inspect busybox:latest | head
[
    {
        "Id": "sha256:c55b0f125dc65ee6a9a78307d9a2dfc446e96af7477ca29ddd4945fd398cc698",
        "RepoTags": [
            "busybox:latest"
        ],
        "RepoDigests": [
            "busybox@sha256:be4684e4004560b2cd1f12148b7120b0ea69c385bcc9b12a637537a2c60f97fb"
        ],
        "Parent": "",

Note that the image id changes with the architecture, while the repo digest remains the same.

$ docker images busybox:latest
REPOSITORY   TAG       IMAGE ID       CREATED      SIZE
busybox      latest    f6467c4e9e15   4 days ago   1.4MB
$ docker image inspect busybox:latest | head
[
    {
        "Id": "sha256:f6467c4e9e1526a6e856444fde786794014c628ab8d64a2eaca5ca1a95ff13de",
        "RepoTags": [
            "busybox:latest"
        ],
        "RepoDigests": [
            "busybox@sha256:be4684e4004560b2cd1f12148b7120b0ea69c385bcc9b12a637537a2c60f97fb"
        ],
        "Parent": "",
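
Since both values are sha256 strings, it may help to keep them apart in code. The following is only a sketch: it parses `docker image inspect` output of the shape shown above and returns the id separately from the bare repo digests. The function name `idAndDigests` is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// inspected holds the two fields of interest from `docker image inspect`.
type inspected struct {
	ID          string   `json:"Id"`
	RepoDigests []string `json:"RepoDigests"`
}

// idAndDigests returns the image id and the bare digests (the part after "@")
// of the first image in the inspect output.
func idAndDigests(inspectJSON []byte) (string, []string, error) {
	var imgs []inspected
	if err := json.Unmarshal(inspectJSON, &imgs); err != nil {
		return "", nil, err
	}
	if len(imgs) == 0 {
		return "", nil, fmt.Errorf("no image in inspect output")
	}
	var digests []string
	for _, rd := range imgs[0].RepoDigests {
		if i := strings.LastIndex(rd, "@"); i >= 0 {
			digests = append(digests, rd[i+1:])
		}
	}
	return imgs[0].ID, digests, nil
}

func main() {
	// Trimmed copy of the inspect output above.
	out := []byte(`[{"Id":"sha256:c55b0f125dc65ee6a9a78307d9a2dfc446e96af7477ca29ddd4945fd398cc698",
		"RepoTags":["busybox:latest"],
		"RepoDigests":["busybox@sha256:be4684e4004560b2cd1f12148b7120b0ea69c385bcc9b12a637537a2c60f97fb"]}]`)
	id, digests, err := idAndDigests(out)
	fmt.Println(id, digests, err)
}
```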

We can still use digests for the kicbase, of course; this is only about the handling in cache and image.

      --base-image='gcr.io/k8s-minikube/kicbase:v0.0.22@sha256:7cc3a3cb6e51c628d8ede157ad9e1f797e8d22a1b3cedc12d3f1999cb52f962e'

Some pseudo-code

Docker

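import "github.com/docker/docker/client"

	cli, err := client.NewClientWithOpts(client.FromEnv) // check err before use
	cli.NegotiateAPIVersion(ctx)                         // match the daemon's API version
	img, _, err := cli.ImageInspectWithRaw(ctx, ref)     // ref is a name:tag or an id
	id := img.ID                                         // "sha256:<hex>"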

similar to the CLI:

    docker image inspect --format "{{ .Id }}" $ref

Podman

(no daemon, no api - we just call the CLI instead.)

    sudo podman image inspect --format "sha256:{{ .Id }}" $ref

Podman doesn't seem to have the sha256: prefix in the id, so we add it for comparison.

We need to make sure to also add it to the crictl output when using cri-o.
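
A minimal sketch of that normalization, assuming the only difference between the runtimes is the missing "sha256:" prefix. The helpers `normalizeID` and `sameImage` are hypothetical names, not existing minikube functions.

```go
package main

import (
	"fmt"
	"strings"
)

const sha256Prefix = "sha256:"

// normalizeID returns the id with a "sha256:" prefix, adding it only when
// it is missing, so the function is safe to call on any runtime's output.
func normalizeID(id string) string {
	id = strings.TrimSpace(id)
	if strings.HasPrefix(id, sha256Prefix) {
		return id
	}
	return sha256Prefix + id
}

// sameImage compares two ids after normalizing both.
func sameImage(a, b string) bool {
	return normalizeID(a) == normalizeID(b)
}

func main() {
	podmanID := "c55b0f125dc65ee6a9a78307d9a2dfc446e96af7477ca29ddd4945fd398cc698"
	dockerID := "sha256:c55b0f125dc65ee6a9a78307d9a2dfc446e96af7477ca29ddd4945fd398cc698"
	fmt.Println(sameImage(podmanID, dockerID))
}
```

Normalizing on both sides of the comparison avoids having to remember which runtime prints which form.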

Tarball

import "github.com/google/go-containerregistry/pkg/v1/tarball"

	img, err := tarball.ImageFromPath(path, nil) // nil tag: the archive holds a single image
	cn, err := img.ConfigName()                  // digest of the config file
	id := cn.String()                            // "sha256:<hex>"

Implementation details:

  • The Id is actually the checksum of the config file:
$ docker save busybox:latest > busybox_latest.tar
$ tar -tf busybox_latest.tar 
a355ae7461fdd43484ed16e7e48620ff19b187adc03bcd4b5cfd5ba3ce2ee670/
a355ae7461fdd43484ed16e7e48620ff19b187adc03bcd4b5cfd5ba3ce2ee670/VERSION
a355ae7461fdd43484ed16e7e48620ff19b187adc03bcd4b5cfd5ba3ce2ee670/json
a355ae7461fdd43484ed16e7e48620ff19b187adc03bcd4b5cfd5ba3ce2ee670/layer.tar
c55b0f125dc65ee6a9a78307d9a2dfc446e96af7477ca29ddd4945fd398cc698.json
manifest.json
repositories
$ tar -xf busybox_latest.tar --wildcards "*.json"
$ jq . manifest.json 
[
  {
    "Config": "c55b0f125dc65ee6a9a78307d9a2dfc446e96af7477ca29ddd4945fd398cc698.json",
    "RepoTags": [
      "busybox:latest"
    ],
    "Layers": [
      "a355ae7461fdd43484ed16e7e48620ff19b187adc03bcd4b5cfd5ba3ce2ee670/layer.tar"
    ]
  }
]
$ sha256sum c55b0f125dc65ee6a9a78307d9a2dfc446e96af7477ca29ddd4945fd398cc698.json
c55b0f125dc65ee6a9a78307d9a2dfc446e96af7477ca29ddd4945fd398cc698  c55b0f125dc65ee6a9a78307d9a2dfc446e96af7477ca29ddd4945fd398cc698.json
  • It is possible to have more than one image per archive, so we need to be able to give the tag when getting the id (or nil, to get all)
  • Since we only need the config and not the layers, the operation is still fast even for large images (such as python:latest)

Previous issues:

@afbjorklund afbjorklund added the kind/feature Categorizes issue or PR as related to a new feature. label May 8, 2021
@spowelljr spowelljr added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label May 17, 2021
@sharifelgamal sharifelgamal added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Jun 14, 2021
@sharifelgamal
Collaborator

related: #11322

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 12, 2021
@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 12, 2021
@spowelljr spowelljr added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Oct 13, 2021
@spowelljr spowelljr added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Feb 16, 2022
@afbjorklund afbjorklund added the area/image Issues/PRs related to the minikube image subcommand label Mar 12, 2023
@STRRL

STRRL commented Sep 5, 2023

Hi @afbjorklund , I am interested in working on this issue, could you assign it to me?
