[Bug]: Sync download blobs already available in the local repository #2661

Open
AlbanBedel opened this issue Sep 17, 2024 · 5 comments
Labels
bug Something isn't working rm-external Roadmap item submitted by non-maintainers

Comments

@AlbanBedel

zot version

v2.1.1

Describe the bug

The sync code always downloads every blob behind a tag, even if those blobs are already available locally.

To reproduce

  1. Set up zot as a pass-through cache
  2. Inspect an image index, for example using skopeo inspect docker://registry.example.com/repo/image:v1.2.3
  3. Inspect one of the images referenced in the index, for example using skopeo inspect docker://registry.example.com/repo/image:v1.2.3-arm64
  4. Notice how slow it is although all blobs are already in storage, or check the repo/image/.sync/ directory to see that it downloads the existing blobs again.

Expected behavior

Blobs from the same repository are not downloaded again when already available in local storage.

Screenshots

No response

Additional context

No response

@AlbanBedel AlbanBedel added the bug Something isn't working label Sep 17, 2024
@rchincha rchincha added the rm-external Roadmap item submitted by non-maintainers label Sep 17, 2024
@rchincha
Contributor

@AlbanBedel We take into account the case where a tag is reused for a different image: an image with digest1 is tagged "tag1", then (oops!) an image with digest2 is pushed and retagged as the same "tag1".

That said, we do make sure blobs/manifests don't get copied over if they already exist locally.

Is this not what you are seeing?

@AlbanBedel
Author

AlbanBedel commented Sep 18, 2024

The case here is to pull an index tag that references blobs A, B, C and D, then pull an image tag that references only A and B. In this case A and B get downloaded again; with larger blobs this is obvious from the time it takes alone.
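To make the scenario concrete, here is a minimal sketch of the expected behaviour. It is not zot code: the helper name `blobs_to_fetch` and the digests are hypothetical placeholders, and local storage is modelled as a plain set of digests.

```python
def blobs_to_fetch(referenced, local):
    """Return the referenced digests missing from local storage, in order, without repeats."""
    missing = []
    for digest in referenced:
        if digest not in local and digest not in missing:
            missing.append(digest)
    return missing


# Placeholder digests standing in for blobs A, B, C and D (hypothetical values).
index_blobs = ["sha256:aaaa", "sha256:bbbb", "sha256:cccc", "sha256:dddd"]
image_blobs = ["sha256:aaaa", "sha256:bbbb"]

local_store = set()

# First pull: the index tag. Nothing is local yet, so all four blobs are fetched.
first_pull = blobs_to_fetch(index_blobs, local_store)
local_store.update(first_pull)

# Second pull: the image tag. A and B are already local, so nothing should be fetched.
second_pull = blobs_to_fetch(image_blobs, local_store)

print(first_pull)
print(second_pull)
```

With this model the second pull has nothing left to download, which is the behaviour the issue asks for.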

@andaaron
Contributor

andaaron commented Sep 18, 2024

Hi @AlbanBedel, are those docker images or OCI images at docker://registry.example.com/repo/image? Or can you send us a link to the upstream image for which this issue reproduces?

For the docker format we convert the manifest and config blobs locally to OCI. That does not, however, explain why the blobs containing the image file system are not reused.

@AlbanBedel
Author

No, they are OCI artifacts, so they have an OCI media type along with a custom artifactType. The images I'm testing with are in a private registry; I'll see if I can reproduce it with some public images.

@eusebiu-constantin-petu-dbk
Collaborator

Yes, you are right: we just compare the parent digest (the index digest in this case); we don't check each child...
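As a rough illustration of what checking each child could look like, the sketch below walks the descriptors of an image manifest or index and reports those not present on disk. It is not zot's sync implementation. It assumes the repository is stored as an OCI image layout (blobs under `blobs/<algorithm>/<encoded>`), and the function names `blob_path` and `missing_children` as well as the digests are hypothetical.

```python
import os
import tempfile


def blob_path(layout_dir, digest):
    """Map a digest such as 'sha256:abcd' to blobs/sha256/abcd under an OCI image layout."""
    algorithm, _, encoded = digest.partition(":")
    return os.path.join(layout_dir, "blobs", algorithm, encoded)


def missing_children(layout_dir, manifest):
    """Return digests referenced by an image manifest or index that are absent locally."""
    descriptors = list(manifest.get("manifests", []))  # children of an image index
    if "config" in manifest:
        descriptors.append(manifest["config"])  # config blob of an image manifest
    descriptors.extend(manifest.get("layers", []))  # layer blobs of an image manifest
    return [d["digest"] for d in descriptors
            if not os.path.exists(blob_path(layout_dir, d["digest"]))]


# Demo with placeholder digests: the config and one layer are already stored, one layer is not.
manifest = {
    "config": {"digest": "sha256:c0"},
    "layers": [{"digest": "sha256:a1"}, {"digest": "sha256:b2"}],
}
with tempfile.TemporaryDirectory() as layout_dir:
    os.makedirs(os.path.join(layout_dir, "blobs", "sha256"))
    for present in ("sha256:c0", "sha256:a1"):
        with open(blob_path(layout_dir, present), "wb") as fh:
            fh.write(b"placeholder")
    to_fetch = missing_children(layout_dir, manifest)

print(to_fetch)
```

Only the descriptors returned by such a check would need to be fetched from upstream; a real implementation would also have to verify blob contents against their digests rather than trusting file presence alone.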


4 participants