-
Notifications
You must be signed in to change notification settings - Fork 543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make the blob storage in pkg/registry pluggable. #1158
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1158 +/- ##
==========================================
- Coverage 75.22% 74.96% -0.26%
==========================================
Files 108 108
Lines 7850 7982 +132
==========================================
+ Hits 5905 5984 +79
- Misses 1379 1424 +45
- Partials 566 574 +8
Continue to review full report at Codecov.
|
6767eef
to
c09136c
Compare
With this change folks can use `registry.WithBlobHandler(...)` to substitute a custom blob handling layer.
c09136c
to
130a80c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, if you're interested we can do manifests next (roughly the same way), and just leave uploads for later when/if we ever need it.
pkg/registry/blobs.go
Outdated
@@ -102,10 +151,20 @@ func (b *blobs) handle(resp http.ResponseWriter, req *http.Request) *regError { | |||
} | |||
} | |||
|
|||
resp.Header().Set("Content-Length", fmt.Sprint(len(b))) | |||
b, ok := b.bh.Get(repo, h) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can wrap the returned ReadCloser in a internal/verify.ReadCloser
to automagically check the digest matches before returning it.
If we do it in here, we'll need to be sure to check the error on defer b.Close()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are a bunch of additional checks we could add, so I'm sorta reluctant to start pulling that particular thread in this PR. Another notable bit is that we don't check the possible err return value on io.Copy()
below, which at one point was the source of a fun bug in the Docker client (around digest verification!).
I think my personal bias would be towards leaning on internal consistency and checking things on their way in (vs. on the way out), but I'd be happy to iterate with you on follow up PRs here.
@imjasonh I'll optimistically start to look at what we might be able to do with manifests tomorrow. |
@jonjohnsonjr PTAL |
"math/rand" | ||
"net/http" | ||
"path" | ||
"strings" | ||
"sync" | ||
|
||
"github.com/google/go-containerregistry/pkg/name" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been trying to keep this from having any dependencies outside the stdlib.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too late:
go-containerregistry/pkg/registry/manifest.go
Lines 31 to 32 in 0dfbb56
v1 "github.com/google/go-containerregistry/pkg/v1" | |
"github.com/google/go-containerregistry/pkg/v1/types" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, I agree with this goal for implementations of the interfaces, but it seems wasteful to reimplement validation that exists in the name
and similar packages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jon beat me to it 🙈
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comment stands. It's foolish to reimplement the validation we have in name
and other places that let us avoid dealing with stringly typed data, and I don't see the risk of a dependency cycle for these superficial things because they should absolutely NOT need a full-blown registry implementation to test interfaces.
I could absolutely get behind a depcheck
style test that this never depend on pkg/v1/remote
in particular.
5e9e18e
to
2a9295a
Compare
Message: err.Error(), | ||
} | ||
} | ||
repo, err := name.NewRepository(req.URL.Host + path.Join(elem[1:len(elem)-2]...)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Resurrecting this because I'm sad we couldn't make it happen)
Aside from the question of whether we need to depend on pkg/name
here, I wonder if req.URL.Host
is adding anything besides letting the string pass through NewRepository
. Really all blob storers want is the subrepo path.
I hear the concern about duplicating validation logic though. Would it be reasonable to export a method from pkg/name
to validate a subrepo path just so pkg/registry
could use it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sad we couldn't make it happen
Me too, but alas I've unblocked myself (and it is glorious 😉 ).
I hear the concern about duplicating validation logic though
We have a bigger issue here. This repo is now apparently governed by unwritten rules in one person's head, which also have had no rationale articulated anywhere that was arrived at by consensus of the stakeholders in this library.
I've clearly demonstrated a willingness to send PRs here (and elsewhere) to fix things, move things forward, and I'm happy to engage in productive discussions about them, but waiting a week to get a Nack based on unwritten rules (that are already violated with what is checked in), is frankly both aggravating and unsustainable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
go-containerregistry/pkg/registry/registry.go
Lines 15 to 23 in a0c4bd2
// Package registry implements a docker V2 registry and the OCI distribution specification. | |
// | |
// It is designed to be used anywhere a low dependency container registry is needed, with an | |
// initial focus on tests. | |
// | |
// Its goal is to be standards compliant and its strictness will increase over time. | |
// | |
// This is currently a low flightmiles system. It's likely quite safe to use in tests; If you're using it | |
// in production, please let us know how and send us CL's for integration tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order for this to disqualify this change, "low dependency" must be interpreted to exclude what it currently imports. 🙃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I'd recommend a README: https://github.com/google/go-containerregistry#philosophy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a bigger issue here. This repo is now apparently governed by unwritten rules in one person's head, which also have had no rationale articulated anywhere that was arrived at by consensus of the stakeholders in this library.
Hoping to address this in #1166
Taking the issue of dependency philosophy aside, I think I'm still convinced that the dependency on pkg/name
is unnecessary, or at least having a full name.Repository
in the interface is. If not for validation of the repository path, you don't actually need a full name.Repository
, and could just get by with Get(string path, h v1.Hash)
, and not add any new dependencies. Path validation itself is fairly simple, and could be moved to internal/something
and shared by pkg/name
and pkg/registry
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are assuming details of the blob layer by stating that it can only care about path. Every single cloud vendor registry uses host to route storage requests, Amazon even gives each customer their own hostname per region. I doubt this is how you wrote kontain.me, but….
Hopefully you get my point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see what you're saying. The same code depending on pkg/registry with some blob storage impl, running at foo.registry.example
and bar.registry.example
, might decide to store those blobs differently based on the host.
Can we document this expectation in the doc comments for BlobHandler
? It wasn't immediately clear to me (because I'm a big dumb dummy) why the host was included, except to make path validation easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we document this expectation in the doc comments for BlobHandler
Absolutely.
|
||
// Get returns true and a reader for consuming the blob specified with the hash, | ||
// if it exists. It now, it returns (nil, error). | ||
Get(repo name.Repository, h v1.Hash) (io.ReadCloser, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something occurred to me about this API, thinking through how I could use it with kontain.me -- this API would require kontain.me to proxy and serve blobs from GCS, instead of HTTP redirecting like I do today, which is a lot faster and cheaper than proxying through Cloud Run.
This might just mean that pkg/registry
isn't going to be a good fit for kontain.me, and that's fine, but I wanted to surface that here since I think it's a pretty nice use case that isn't supported by this proposal.
With this change folks can use
registry.WithBlobHandler(...)
to substitute a custom blob handling layer.