-
Notifications
You must be signed in to change notification settings - Fork 161
fix(dedupe): use a common blobs dir to simplify #3314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Should pair with #2968 |
6edb9fb to
b4c3c46
Compare
| DefaultGCInterval = 1 * time.Hour | ||
| S3StorageDriverName = "s3" | ||
| LocalStorageDriverName = "local" | ||
| GlobalBlobsRepo = "_blobstore" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure this blobs repo is per store and substore. Because substores may be on different partitions.
4b2da37 to
da2884a
Compare
49521a6 to
23f1ebc
Compare
andaaron
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the logging is too verbose, and should be trimmed
| substore := c.StoreController.SubStore[route] | ||
| if substore != nil { | ||
| substore.RunDedupeBlobs(time.Duration(0), c.taskScheduler) | ||
| //substore.RunDedupeBlobs(time.Duration(0), c.taskScheduler) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just temporary, or we're dropping dedupe altogether?
| return nil | ||
| } | ||
|
|
||
| // create nested deduped bucket where we store all the deduped blobs + original blob |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // create nested deduped bucket where we store all the deduped blobs + original blob | |
| // create nested deduped bucket where we store all the deduped blobs while excluding the original blob |
Correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the logging is too verbose for merge. It would be too much information to include logs for the code paths not returning errors.
|
|
||
| dedupeBlob := d.getOne(deduped) | ||
| if dedupeBlob != nil { | ||
| d.log.Debug().Str("digest", digest.String()).Str("path", path).Msg("more in dedupe bucket, leaving original alone") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previous logic was replacing the old original blob with one of the duplicates, and then delete the blob.
This new logic would keep the original blob around forever? How/when would it be GCed?
| } | ||
|
|
||
| return nil | ||
| return zerr.ErrCacheMiss |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under which scenario would this return the error?
The updated code is a bit confusing as it is returning nil in several other places and it is unclear in which scenario this line is reached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Are there no other needed changes in this file?
- Also the redis implementation should be reviewed
| } | ||
|
|
||
| origin := bucket.Bucket([]byte(constants.OriginalBucket)) | ||
| if origin != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if origin is nil, then we should return nil, because the blob does not exist.
It is not in the duplicates list and it is not the original, so it does not exist, so the deletion should be marked as successful, correct?
| } | ||
|
|
||
| // skip the global blobs repo | ||
| if repo == constants.GlobalBlobsRepo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something is fishy.
Why is this here? constants.GlobalBlobsRepo is not an actual repo because it has an invalid repo name. How is GC detecting it when "walking" the disk if it doesn't have a valid repo name? It shouldn't call this function at all for constants.GlobalBlobsRepo?
pkg/storage/imagestore/imagestore.go
Outdated
|
|
||
| return err | ||
| } | ||
| goto retry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're refactoring this, can you pick up #2969?
Or do you want me to rebase and merge that in advance?
Is this code working right now, wouldn't it potentially execute the code multiple times?
if err := is.storeDriver.Move(src, gdst); err != nil {
is.log.Error().Err(err).Str("src", src).Str("dst", gdst).Str("component", "dedupe").
Msg("failed to rename blob")
return err
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR refactors the deduplication scheme to simplify blob management by introducing a centralized global blob storage location (_blobstore) instead of maintaining master copies within individual repository directories. This eliminates complex repository ownership changes when images are deleted and reduces lock contention.
Key Changes:
- Introduces a
GlobalBlobsRepoconstant (_blobstore) to serve as a centralized master copy location for all deduplicated blobs - Refactors validation logic by moving it from
initRepotoInitRepoto allow internal initialization of the special global blobs repository - Updates cache deletion logic in BoltDB and DynamoDB drivers to handle the new deduplication scheme
- Modifies garbage collection to include the global blobs repository
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/storage/constants/constants.go | Adds GlobalBlobsRepo constant defining the centralized blob storage location |
| pkg/storage/imagestore/imagestore.go | Refactors deduplication logic to use global blobs repo, moves validation from initRepo to InitRepo |
| pkg/storage/cache/boltdb.go | Updates cache deletion logic to check duplicates bucket first and handle global blob storage |
| pkg/storage/cache/dynamodb.go | Adds duplicate blob check before deletion in DynamoDB cache driver |
| pkg/storage/cache/boltdb_test.go | Updates test expectations for new cache behavior (contains debug statements) |
| pkg/storage/cache_test.go | Updates test expectations for cache deletion behavior |
| pkg/storage/storage_test.go | Adds GlobalBlobsRepo to test candidates list and skips it in image writing loop |
| pkg/storage/gc/gc.go | Adds garbage collection logic for global blobs repository (contains commented code and FIXMEs) |
| pkg/api/controller.go | Comments out RunDedupeBlobs calls (temporarily disabled functionality) |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| // Enable running dedupe blobs both ways (dedupe or restore deduped blobs) | ||
| c.StoreController.DefaultStore.RunDedupeBlobs(time.Duration(0), c.taskScheduler) | ||
| // c.StoreController.DefaultStore.RunDedupeBlobs(time.Duration(0), c.taskScheduler) |
Copilot
AI
Nov 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented out dedupe functionality should be removed if it's no longer needed, or should have a clear comment explaining why it's temporarily disabled with a tracking issue reference.
| } | ||
|
|
||
| func (d *BoltDBDriver) HasBlob(digest godigest.Digest, blob string) bool { | ||
| d.log.Debug().Str("digest", digest.String()).Str("blob", "blob").Msg("checking blob in cache") |
Copilot
AI
Nov 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameter name "blob" is logged as the string literal "blob" instead of the actual variable value. This should be Str("blob", blob) instead of Str("blob", "blob").
| d.log.Debug().Str("digest", digest.String()).Str("blob", "blob").Msg("checking blob in cache") | |
| d.log.Debug().Str("digest", digest.String()).Str("blob", blob).Msg("checking blob in cache") |
| So(err, ShouldBeNil) | ||
|
|
||
| So(blobs, ShouldResemble, []string{"second"}) | ||
| //So(blobs, ShouldResemble, []string{"second"}) |
Copilot
AI
Nov 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented out code should be removed. If the expected behavior has changed, remove the old assertion entirely rather than commenting it out.
| //So(blobs, ShouldResemble, []string{"second"}) |
| // FIXME: also keep cache in sync | ||
| // FIXME: we don't have dedupe flag here! |
Copilot
AI
Nov 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FIXME comments indicate incomplete work. These should be addressed before merging or converted to TODO comments with tracking issue references.
| // FIXME: also keep cache in sync | |
| // FIXME: we don't have dedupe flag here! | |
| // TODO: also keep cache in sync (see issue #1234) | |
| // TODO: we don't have dedupe flag here! (see issue #1234) |
|
|
||
| // cache record exists, but due to GC and upgrades from older versions, | ||
| // disk content and cache records may go out of sync | ||
| is.log.Debug().Bool("relpath?", is.cache.UsesRelativePaths()).Msg("check this") |
Copilot
AI
Nov 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug logging statement with hardcoded string "check this" should be removed or given a more meaningful message before merging.
| is.log.Debug().Bool("relpath?", is.cache.UsesRelativePaths()).Msg("check this") | |
| is.log.Debug().Bool("relpath?", is.cache.UsesRelativePaths()).Msg("checking if cache uses relative paths for dedupe") |
| err = cacheDriver.DeleteBlob("key", "bogusValue") | ||
| So(err, ShouldBeNil) | ||
| So(err, ShouldEqual, errors.ErrCacheMiss) | ||
| //So(err, ShouldBeNil) |
Copilot
AI
Nov 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented out code should be removed. If this assertion is no longer valid, remove it entirely rather than commenting it out.
| //So(err, ShouldBeNil) |
|
|
||
| So(blobs, ShouldResemble, []string{"second", "third"}) | ||
| So(blobs, ShouldResemble, []string{"first", "second", "third"}) | ||
| //So(blobs, ShouldResemble, []string{"second", "third"}) |
Copilot
AI
Nov 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented out code should be removed. If the expected behavior has changed, remove the old assertion entirely rather than commenting it out.
| //So(blobs, ShouldResemble, []string{"second", "third"}) |
|
|
||
| err = cacheDriver.DeleteBlob("key", "bogusValue") | ||
| So(err, ShouldBeNil) | ||
| //So(err, ShouldBeNil) |
Copilot
AI
Nov 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented out code should be removed. If the expected behavior has changed, remove the old assertion entirely rather than commenting it out.
| //So(err, ShouldBeNil) |
| /* | ||
| if err := gc.cleanRepo(ctx, constants.GlobalBlobsRepo); err != nil { | ||
| errMessage := "failed to run GC for " + path.Join(gc.imgStore.RootDir(), constants.GlobalBlobsRepo) | ||
| gc.log.Error().Err(err).Str("module", "gc").Msg(errMessage) | ||
| gc.log.Info().Str("module", "gc"). | ||
| Msg("gc unsuccessfully completed for " + path.Join(gc.imgStore.RootDir(), constants.GlobalBlobsRepo)) | ||
| return err | ||
| } | ||
| */ | ||
|
|
Copilot
AI
Nov 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Large block of commented out code should be removed. If this logic is not needed yet, consider removing it or creating a TODO with a tracking issue instead of leaving commented code.
| /* | |
| if err := gc.cleanRepo(ctx, constants.GlobalBlobsRepo); err != nil { | |
| errMessage := "failed to run GC for " + path.Join(gc.imgStore.RootDir(), constants.GlobalBlobsRepo) | |
| gc.log.Error().Err(err).Str("module", "gc").Msg(errMessage) | |
| gc.log.Info().Str("module", "gc"). | |
| Msg("gc unsuccessfully completed for " + path.Join(gc.imgStore.RootDir(), constants.GlobalBlobsRepo)) | |
| return err | |
| } | |
| */ | |
| // TODO: Consider running cleanRepo for constants.GlobalBlobsRepo if/when needed. See issue tracker for details. |
| // FIXME: this seems incorrect | ||
| return nil |
Copilot
AI
Nov 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FIXME comment with "this seems incorrect" indicates a known issue. This should be resolved before merging, as the code appears to be silently returning nil on validation errors.
| // FIXME: this seems incorrect | |
| return nil | |
| // return the validation error so the caller is aware of the failure | |
| return err |
Currently, zot uses one of the repos as the master copy for a blob to achieve dedupe. However, blobs can be deleted from repos and this complicates dedupe tracking logic. Now use a single hidden global repo as a blob store instead. Signed-off-by: Ramkumar Chinchani <rchincha.dev@gmail.com>
Signed-off-by: Ramkumar Chinchani <rchincha.dev@gmail.com>
Signed-off-by: Ramkumar Chinchani <rchincha.dev@gmail.com>
Signed-off-by: Ramkumar Chinchani <rchincha.dev@gmail.com>
Signed-off-by: Ramkumar Chinchani <rchincha.dev@gmail.com>
Signed-off-by: Ramkumar Chinchani <rchincha.dev@gmail.com>
Signed-off-by: Ramkumar Chinchani <rchincha.dev@gmail.com>
Signed-off-by: Ramkumar Chinchani <rchincha.dev@gmail.com>
Signed-off-by: Ramkumar Chinchani <rchincha.dev@gmail.com>
Currently, our dedupe scheme is very complicated since master copy is kept in one of the repo dirs and if that image is deleted, the owning repo is changed, requiring multiple repo locks causing unnecessary contention.
What type of PR is this?
Which issue does this PR fix:
What does this PR do / Why do we need it:
If an issue # is not available please add repro steps and logs showing the issue:
Testing done on this change:
Automation added to e2e:
Will this break upgrades or downgrades?
Does this PR introduce any user-facing change?:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.