
[GC performance] The performance of v2 manifest deletion is not good in S3 environment #12948

Open
wy65701436 opened this issue Sep 2, 2020 · 45 comments


@wy65701436
Contributor

wy65701436 commented Sep 2, 2020

In an S3 backend environment, we found that it took about 39 seconds to delete a manifest via the v2 API.

[Why we still use the v2 API for manifest deletion]

As Harbor cannot know which tags in the storage belong to the manifest, the GC job has to leverage the v2 API to clean them up. However, the v2 API looks up all of the tags and removes them one by one, which can cause performance issues.

[What we can do next]

1. Investigate how many requests are sent to the S3 storage during a v2 manifest deletion.
2. Investigate the possibility of not storing the first tag in the backend, so that the GC job can skip this step.

Log

Sep 1 12:56:35 192.168.144.1 registry[1146]: time="2020-09-01T12:56:35.750530108Z" level=info msg="authorized request" go.version=go1.13.8 http.request.host="registry:5000" http.request.id=c9a4d5ad-4157-4091-a023-93d8e20a5746 http.request.method=DELETE http.request.remoteaddr="192.168.144.9:44072" http.request.uri="/v2/library/testingg/manifests/sha256:20f39c20df7c5605f77862b711c3d28731e4d569171ec852ce34a06432611faa" http.request.useragent=harbor-registry-client vars.name="library/testingg" 
vars.reference="sha256:20f39c20df7c5605f77862b711c3d28731e4d569171ec852ce34a06432611faa" 
Sep 1 12:57:14 192.168.144.1 registry[1146]: time="2020-09-01T12:57:14.340710966Z" level=info msg="response completed" go.version=go1.13.8 http.request.host="registry:5000" http.request.id=c9a4d5ad-4157-4091-a023-93d8e20a5746 http.request.method=DELETE http.request.remoteaddr="192.168.144.9:44072" http.request.uri="/v2/library/testingg/manifests/sha256:20f39c20df7c5605f77862b711c3d28731e4d569171ec852ce34a06432611faa" http.request.useragent=harbor-registry-client http.response.duration=38.598453034s http.response.status=202 http.response.written=0 
@dkulchinsky
Contributor

Hey @wy65701436, following our chat in Slack I'd like to share a similar performance issue we're experiencing with a GCS storage backend.

We're running Harbor v2.1.1 and we replicated a GCR registry's content to Harbor; however, we forgot to exclude a repo that at the time had >60,000 tags.

After the replication completed, we deleted the repo in Harbor and ran GC, but the job keeps failing due to a timeout while deleting a manifest:

2020-11-03T16:44:21Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:259]: delete the manifest with registry v2 API: fastly/demo-go-app, application/vnd.docker.distribution.manifest.v2+json, sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c

2020-11-03T17:14:21Z [ERROR] [/jobservice/job/impl/gc/garbage_collection.go:262]: failed to delete manifest with v2 API, fastly/demo-go-app, sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c, Delete "http://harbor-registry:5000/v2/fastly/demo-go-app/manifests/sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

2020-11-03T17:14:21Z [ERROR] [/jobservice/job/impl/gc/garbage_collection.go:165]: failed to execute GC job at sweep phase, error: failed to delete manifest with v2 API: fastly/demo-go-app, sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c: Delete "http://harbor-registry:5000/v2/fastly/demo-go-app/manifests/sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

Looking at the registry logs, we see that it takes over an hour to delete a manifest:

[harbor-registry-b4fbbb8df-xcgt4 registry] 127.0.0.1 - - [03/Nov/2020:16:44:21 +0000] "DELETE /v2/fastly/demo-go-app/manifests/sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c HTTP/1.1" 202 0 "" "harbor-registry-client"

[harbor-registry-b4fbbb8df-xcgt4 registry] time="2020-11-03T17:56:14.798548519Z" level=info msg="response completed" go.version=go1.14.7 http.request.host="harbor-registry:5000" http.request.id=3b01c244-6b14-49ba-bfde-1bfd7934c15e http.request.method=DELETE http.request.remoteaddr="127.0.0.1:59854" http.request.uri="/v2/fastly/demo-go-app/manifests/sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c" http.request.useragent=harbor-registry-client http.response.duration=1h11m53.181188629s http.response.status=202 http.response.written=0

We enabled debug logging in the registry and saw that it was spending most of the time iterating through the tags with gcs.GetContent:

❯ grep GetContent harbor-registry-debug-logs.txt|grep demo-go-app|wc -l
   55525

for example:

[harbor-registry-b4fbbb8df-xcgt4 registry] time="2020-11-03T16:44:27.205475289Z" level=debug msg="gcs.GetContent("/docker/registry/v2/repositories/fastly/demo-go-app/_manifests/tags/0023fd10bee5f3cc968e55148169091eb7d1cf795a8780ee7508642ab047042b/current/link")" auth.user.name="harbor_registry_user" go.version=go1.14.7 http.request.host="harbor-registry:5000" http.request.id=3b01c244-6b14-49ba-bfde-1bfd7934c15e http.request.method=DELETE http.request.remoteaddr="127.0.0.1:59854" http.request.uri="/v2/fastly/demo-go-app/manifests/sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c" http.request.useragent=harbor-registry-client trace.duration=110.978462ms trace.file="/go/src/github.com/docker/distribution/registry/storage/driver/base/base.go" trace.func="github.com/docker/distribution/registry/storage/driver/base.(*Base).GetContent" trace.id=6374bc55-1500-4fb9-bcf0-f53da9f5fa16 trace.line=95 vars.name="fastly/demo-go-app" vars.reference="sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c"

I can share the full GC job and registry debug logs if needed, also happy to provide more information.

@guyguy333

Experiencing exactly the same issue as @dkulchinsky. We're unable to finish a GC; it always ends with a context deadline exceeded. We have more than 1 TB to clean (~130k objects), and since we can't resume, we have to start from scratch every time.

@dkulchinsky
Contributor

dkulchinsky commented Oct 21, 2021

Hello friends, I'd like to ask to raise the priority of this issue.

We are running several instances of Harbor (we use GCS backend, but I think the root cause here is the same) and we're rapidly growing our usage.

We are starting to reach capacities that GC simply cannot handle: for repositories with more than a few thousand tags, it takes ~2 minutes to delete a single manifest during GC. GC now takes 10~14 hours, and the problem gets worse every day since we're adding more tags than we are deleting.

On our test/certification Harbor instance we've reached over 20,000 tags on some repositories, and GC just times out on the first manifest since the lookup takes >20 minutes.

We are concerned about rising storage costs, since we can't clean storage up, as well as other potential issues that may arise from all these blobs/manifests lingering with no way to properly remove them.

This issue was tagged as a candidate for v2.2.0, yet we're already seeing v2.4.0 going out the door.

I'm happy to provide additional context, information, and logs, but I hope we can get some attention on this issue, since I think it will impact any user that needs Harbor to work at scale.

/cc @wy65701436 @reasonerjt

@stale

stale bot commented Apr 16, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Stale label Apr 16, 2022
@dkulchinsky
Contributor

I believe this is still an active issue being tracked, so probably shouldn't get closed yet.

@stale stale bot removed the Stale label Apr 17, 2022
@github-actions

github-actions bot commented Jul 7, 2022

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

@github-actions github-actions bot added the Stale label Jul 7, 2022
@sidewinder12s

This is still an issue.

@github-actions github-actions bot removed the Stale label Jul 8, 2022
@github-actions

github-actions bot commented Sep 7, 2022

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

@github-actions github-actions bot added the Stale label Sep 7, 2022
@rcjames

rcjames commented Sep 7, 2022

This is still an issue.

@dkulchinsky
Contributor

@wy65701436 any hints on when the team may find some time to look at this? This seems like an issue that desperately needs attention, yet it hasn't seen any traction in over two years.

@github-actions github-actions bot removed the Stale label Sep 8, 2022
@twhiteman
Contributor

Just to further explain the crux of this issue.

Harbor uses docker distribution for its registry component (harbor-registry).

The Harbor GC calls into the docker registry to delete the manifest, which in turn does a lookup for all tags that reference the deleted manifest:
https://github.com/distribution/distribution/blob/78b9c98c5c31c30d74f9acb7d96f98552f2cf78f/registry/handlers/manifests.go#L536

To find the tag references, the docker registry iterates over every tag in the repository and reads its link file to check whether it matches the deleted manifest (i.e. whether it uses the same sha256 digest):
https://github.com/distribution/distribution/blob/78b9c98c5c31c30d74f9acb7d96f98552f2cf78f/registry/storage/tagstore.go#L160

So the more tags you have in a repository, the worse the performance will be (as there will be more S3 API calls for the tag directory lookups and tag file reads).

@github-actions

github-actions bot commented Jan 2, 2023

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

@github-actions github-actions bot added the Stale label Jan 2, 2023
@captn3m0

captn3m0 commented Jan 2, 2023

not stale.

@Vad1mo
Member

Vad1mo commented Jun 14, 2023

@wy65701436, take a look #12948 (comment)

@sebglon

sebglon commented Nov 14, 2023

Any update?

@jwojnarowicz

Has anyone tested @hemanth132's solution? Or are there any updates from the Harbor team regarding the API or GC? @Vad1mo @chlins Bumping because this is still an important issue for every S3 backend.

@karmicdude

Any update? This is still causing huge pain; GC runs slower than data is added, so we constantly have to extend disks.

microyahoo added a commit to microyahoo/distribution that referenced this issue Apr 18, 2024
Harbor uses distribution for its (harbor-registry) registry component.
The Harbor GC calls into the registry to delete the manifest, which in turn
does a lookup for all tags that reference the deleted manifest.
To find the tag references, the registry iterates over every tag in the repository
and reads its link file to check whether it matches the deleted manifest (i.e. whether
it uses the same sha256 digest). So the more tags in a repository, the worse the
performance will be (as there will be more S3 API calls for the tag
directory lookups and tag file reads).

Therefore, we can use concurrent lookup and untag to optimize performance as described in goharbor/harbor#12948.

This optimization was originally contributed by @Antiarchitect; now I would like to take it over.
Thanks for @Antiarchitect's efforts in PR distribution#3890.

Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
@microyahoo
Contributor

Hi @karmicdude, I have taken over @Antiarchitect's work on concurrent lookup and untag in PR distribution/distribution#4329. You can try it out and check whether it improves things, thanks.

@microyahoo
Contributor

Hi @karmicdude, @sebglon, @jwojnarowicz, @sidewinder12s: distribution/distribution#4329 has already been merged. Would you please help try it out and check whether it meets expectations?

@karmicdude

Nice, I'll definitely check it out

@snowmanstark

@wy65701436 @Vad1mo Can we have this change in v2.11.1, as it will improve GC efficiency?

@kofj
Contributor

kofj commented Aug 1, 2024

Running GC on S3-like storage is very expensive. Many of our customers have more than 2 PiB of storage and 900 million objects, and they want fast, high-performance GC. I am interested in this work.

@Antiarchitect

Antiarchitect commented Aug 1, 2024

@kofj FYI: the concurrent tag lookup feature, kindly picked up by @microyahoo, was merged into distribution (distribution/distribution#4329), but as far as I can tell there is no release tag containing this PR yet.

@twhiteman
Contributor

While concurrent tag lookup is somewhat faster, it does not solve the underlying performance issue (the registry still has to look up and read every tag file on the S3 filesystem for the referenced repository).

For better solutions, I'd like to see one of:

  1. docker distribution uses a reverse tag lookup system (returns all tags referencing the given sha256 in a particular repository)
  2. Harbor performs the deletion of the s3 manifest file itself (and no longer calls the docker delete manifest API)

@wy65701436
Contributor Author

Whilst concurrent tag lookup is slightly better, it does not solve the underlying performance issue (it still has to lookup and read all tag files on the S3 filesystem for the referenced repository).

For better solutions, I'd like to see one of:

1. docker distribution uses a reverse tag lookup system (returns all tags referencing the given sha256 in a particular repository)

2. Harbor performs the deletion of the s3 manifest file itself (and no longer calls the docker delete manifest API)

I prefer option 3: Harbor doesn't perform the tag deletion at all. The reason Harbor still leverages this API is that, on push, the tag still lands on the distribution side. However, that is actually not necessary, since Harbor uses its database for the CRUD of tags.

In summary, we can store all artifacts tag-less on the distribution side. Then the deletion is not needed, but we should consider the existing artifacts when we update the logic.

Another quick solution is to give end users an option to decide whether to remove the tag from the backend. It does no harm, but it generates some garbage on the storage side.
