layout: sharding the blob store #449

Closed
@cyphar

Description

One issue that I'm quite worried about is the performance impact of having too many blobs inside an OCI image. Now, practically speaking I would be surprised if the number of blobs n were greater than 20 in most cases, but some people have expressed that they would like to have the entire universe bottled into an OCI image. I will refrain from commenting on how good an idea I think that is, but if it's going to be a "valid use case" then we should reconsider how we've organised the blob directory.

Namely, the current layout of blobs/<algo>/<digest> will cause problems if the number of digests becomes quite large, due to how filesystems are implemented. Essentially no filesystem is designed to handle accesses to directories containing many files well. If you look at how git, camlistore and many other such projects implement their blob storage, it looks more like blobs/<algo>/<prefix>/<suffix> (or in camlistore's case, three sets of <prefix>/).
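To make the two layouts concrete, here is a minimal sketch of how a digest could map to a path under each scheme. The function names, the two-character prefix length, and the `levels` parameter are assumptions made for illustration only; nothing here is part of the image-spec, and the actual prefix width would be a design decision.

```python
from pathlib import PurePosixPath


def _split_digest(digest: str) -> tuple[str, str]:
    """Split an '<algo>:<encoded>' digest string into its two parts."""
    algo, sep, encoded = digest.partition(":")
    if not sep or not algo or not encoded:
        raise ValueError(f"malformed digest: {digest!r}")
    return algo, encoded


def flat_blob_path(digest: str) -> PurePosixPath:
    """Current layout: blobs/<algo>/<digest>."""
    algo, encoded = _split_digest(digest)
    return PurePosixPath("blobs", algo, encoded)


def sharded_blob_path(
    digest: str, prefix_len: int = 2, levels: int = 1
) -> PurePosixPath:
    """Hypothetical sharded layout: blobs/<algo>/<prefix>/.../<suffix>.

    prefix_len and levels are illustrative knobs (assumptions), with
    levels=1 resembling git's objects/ab/cdef... and levels=3 resembling
    the three-prefix scheme mentioned above.
    """
    algo, encoded = _split_digest(digest)
    if prefix_len < 1 or levels < 1:
        raise ValueError("prefix_len and levels must be at least 1")
    consumed = prefix_len * levels
    if len(encoded) <= consumed:
        raise ValueError("digest too short for the requested sharding")
    # Peel off `levels` fixed-width prefixes, left to right.
    prefixes = [
        encoded[i * prefix_len : (i + 1) * prefix_len] for i in range(levels)
    ]
    return PurePosixPath("blobs", algo, *prefixes, encoded[consumed:])
```

With a two-character hexadecimal prefix, each level fans out into at most 256 subdirectories, so the number of entries per directory drops by roughly that factor per level. Note that the same digest resolves to a different path under each function, which is why supporting both layouts at once would force every reader to check multiple locations.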

Naturally this would be a backwards-incompatible change (you can't really implement this scheme while also retaining the old one, because then you have an exponential number of ways to read the same blob data, almost certainly leading to countless implementation bugs). So we should probably consider this for post-1.0.0.
