Description
This is kind of a meta-issue about several registry-related concerns. I can file separate issues if that's helpful.
I stumbled across this repo while doing some investigation around deprecating part of the distribution spec, which is primarily why I'm filing this issue, but I figured I'd point out some other things I noticed.
I see that you're using github.com/google/containerregistry for some things and have a custom client for other things. I'm guessing this is because containerregistry fails for large blobs when it buffers everything into memory? Are there any other reasons? If you are intent on continuing to use python, I think what you're doing is reasonable, but if the language is negotiable, there is a maintained successor to containerregistry at github.com/google/go-containerregistry, which would save you some trouble.
Some implementation details that could be improved:
- This `_put_blob_single_post` makes sense as an optimization, but as far as I can tell, you are the only client that uses this method of uploading blobs. I would like to deprecate this, as it's frustrating to support on our end, but I don't want to break anything. Would you be willing to use the two-request form of monolithic uploads? I.e. a POST to initiate the upload, then a single PUT with all the data, as described here.
- I haven't traced through all the code, but I don't see any blob existence checks (i.e. `HEAD /v2/.../blobs/<digest>`) before uploading blobs. Doing that should save a lot of work, most of the time.
- Similarly, you don't seem to take advantage of cross-repo mounting in e.g. `replicate_artifact`.
- Looking at one of these images (expand the `Manifest` dropdown), it seems like the media types are getting mangled. This isn't a huge problem, but I can see it causing issues for lots of clients.
- It doesn't appear that you're setting a user-agent anywhere. It would have been much easier to find this repo if this had a unique user-agent :P and in general it would be nice to be able to identify requests from this client vs other clients written in python, for debugging purposes.
Again, I mostly care about that first point, the rest are mostly just me trying to be helpful 🙂 I can send a PR if you'd like but it might be easier for you to do it yourself? Also, have you looked into how hard it would be to patch containerregistry to support streaming blobs? That might be easier, but probably not.