[feature request] Extend OCI Artifact Types in Runtime #12013

Closed
gaocegege opened this issue May 23, 2020 · 12 comments
Labels: area/OCI, kind/requirement (New feature or idea on top of harbor)

Comments

@gaocegege

gaocegege commented May 23, 2020

Is your feature request related to a problem? Please describe.

By default, Harbor supports three OCI artifact types: OCI Image, Helm Chart, and CNAB. When users want to use Harbor to store, publish, or share new kinds of artifacts (e.g. machine learning models), they have to fork Harbor and implement the processor logic in goharbor/harbor.

It works, but maintaining a fork has huge operational costs.

Describe the solution you'd like

We would like the processor to be extensible, like the Kubernetes scheduler extender or the Harbor scanner. Users could then implement their own processor outside Harbor core.

Describe the main design/architecture of your solution

Harbor core could communicate with the remote processor via IP:port, a Unix domain socket, or some other transport. We are going to submit a detailed proposal to the Harbor community soon.
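To make the idea of a remote processor more concrete, here is a rough sketch in Python of what such a service might look like if it were reached over HTTP on an IP:port. Everything in it is an assumption for illustration only: the request shape (a JSON body with `manifest` and `config` keys), the media type `application/vnd.example.model.config.v1+json`, the metadata fields, and the names `extract_metadata` and `ProcessorHandler`. None of it is Harbor's actual API or the design from the upcoming proposal.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical config media type for a user-defined ML model artifact.
MODEL_CONFIG_MEDIA_TYPE = "application/vnd.example.model.config.v1+json"


def extract_metadata(manifest: dict, config: dict) -> dict:
    """Return the metadata this illustrative processor exposes for one artifact."""
    media_type = manifest.get("config", {}).get("mediaType")
    if media_type != MODEL_CONFIG_MEDIA_TYPE:
        raise ValueError(f"unsupported config media type: {media_type!r}")
    # Keep only the fields this sketch knows about (assumed field names).
    wanted = ("framework", "format", "created")
    return {key: config[key] for key in wanted if key in config}


class ProcessorHandler(BaseHTTPRequestHandler):
    """Accepts POST requests whose JSON body carries 'manifest' and 'config'."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            body = json.loads(self.rfile.read(length))
            payload = extract_metadata(body["manifest"], body["config"])
            status = 200
        except (ValueError, KeyError, TypeError) as err:
            # Malformed JSON, missing keys, or an artifact type we do not handle.
            status, payload = 400, {"error": str(err)}
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


if __name__ == "__main__":
    # Listen on an IP:port; the address is an arbitrary choice for the sketch.
    HTTPServer(("127.0.0.1", 8080), ProcessorHandler).serve_forever()
```

The extraction logic lives in a plain function so it can be tested without starting the server; the HTTP handler is only a thin transport layer, which is the part that could be swapped for a Unix domain socket or something else.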

Describe the development plan you've considered

We at Caicloud can help implement it, and we are glad to help the community maintain the feature.

/cc @hainingzhang @steven-zou

/assign @gaocegege @hyy0322 @zhujian7

@reasonerjt
Contributor

I personally think it may be too heavy to create a service solely for extracting data; creating a workflow to manage the plugins is more complicated than adding code to run in harbor-core,
and I don't see a lot of use cases in addition to the learning model. But I'm certainly open to more discussion...

@gaocegege
Author

gaocegege commented May 25, 2020

@reasonerjt Thanks for your comment.

@reasonerjt I don't see a lot of use cases in addition to the learning model. But I'm certainly open to more discussion...

We do not have other OCI artifact types now, but I think the feature will be widely adopted in the future. As you know, there are many other artifacts that can be stored in an OCI-based registry.

Caicloud is a small start-up, but we already have brisk demand for this feature. We will use the registry to store not only ML/DL models but also datasets, our proprietary application bundles, and so on. Thus we think that it should be popular in the foreseeable future.

Also, from the perspective of the artifact authors, we can store user-defined artifact types now, but the information about the artifact is not self-contained. We have to fork Harbor core and implement the processor logic in the fork. If we decide to contribute the logic to the Harbor community, we have to commit it into Harbor core and follow Harbor's release process, which is unnecessary overhead for both Harbor and the artifact authors.

Harbor is described as the first OCI-compliant open-source registry, and we say that:

As artifact types will undoubtedly come and go, it’s crucial that Harbor exists outside of any particular container format, and be flexible enough to onboard and discard any artifact type based on community demand and adherence to common standards.

Thus I think extensibility should be provided to the artifact authors.

@reasonerjt I personally think it may be too heavy to create a service solely for extracting data; creating a workflow to manage the plugins is more complicated than adding code to run in harbor-core

In our view, the three types Helm Chart, CNAB, and Image should be kept in Harbor core. When there are new non-standard types such as ML/DL models or some proprietary types, we can provide a mechanism to extend Harbor outside Harbor core.
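As a rough illustration of that split (Python, purely a sketch and not Harbor's implementation), processors could be looked up by the config media type in the artifact's OCI manifest: built-in kinds are fixed in core, and anything else is handled only if an extension has been registered for it. The two media types shown are the standard OCI image and Helm chart config types (CNAB is left out for brevity); the function names, the `EXTENSIONS` registry, and the extracted fields are all hypothetical.

```python
from typing import Callable, Dict

Processor = Callable[[dict], dict]

# Config media types of two artifact kinds that stay in core in this sketch.
OCI_IMAGE = "application/vnd.oci.image.config.v1+json"
HELM_CHART = "application/vnd.cncf.helm.config.v1+json"


def process_image(config: dict) -> dict:
    # Illustrative subset of an image config.
    return {"architecture": config.get("architecture"), "os": config.get("os")}


def process_chart(config: dict) -> dict:
    # Illustrative subset of a chart config.
    return {"name": config.get("name"), "version": config.get("version")}


# Built-in processors cannot be replaced by extensions.
BUILT_IN: Dict[str, Processor] = {OCI_IMAGE: process_image, HELM_CHART: process_chart}

# Hypothetical registry for user-defined artifact kinds living outside core.
EXTENSIONS: Dict[str, Processor] = {}


def register_extension(media_type: str, processor: Processor) -> None:
    if media_type in BUILT_IN:
        raise ValueError(f"{media_type} is handled in core and cannot be overridden")
    EXTENSIONS[media_type] = processor


def process(manifest: dict, config: dict) -> dict:
    """Pick a processor by the manifest's config media type."""
    media_type = manifest["config"]["mediaType"]
    processor = BUILT_IN.get(media_type)
    if processor is None:
        processor = EXTENSIONS.get(media_type)
    if processor is None:
        # Unknown kind: the artifact can still be stored, but no metadata is extracted.
        return {}
    return processor(config)
```

With this shape, supporting a new artifact kind means one `register_extension` call rather than a change to the built-in table, so the default kinds are unaffected.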

As for the detailed design and implementation, I think we can discuss it further once the proposal is submitted. I do agree that we do not want the feature to be too heavy, and we do not want to affect the current workflow.

@bitsf bitsf added the kind/requirement New feature or idea on top of harbor label May 26, 2020
@steven-zou steven-zou self-assigned this May 26, 2020
@steven-zou
Contributor

I think this is a valuable feature request. One more thing we should clarify here: this proposal does not aim to add more artifact metadata extractors to Harbor. It is trying to provide a capability that lets Harbor easily support user-defined artifact kinds with rich metadata formats. It will not negatively affect the artifact kinds Harbor supports by default (image, Helm v3, CNAB, OPA bundle). It only opens the door to giving Harbor a certain degree of extensibility. Adopters can decide whether or not they want to leverage this extensibility to support their own artifact kinds.

@xaleeks
Contributor

xaleeks commented May 26, 2020

@gaocegege Thanks for the idea, and it's great to see Harbor being used for hosting common artifacts used in machine learning projects like Kubeflow. Since there is no dedicated registry geared towards AI/ML on Kubernetes on the market, it's awesome to see that good access control and lifecycle management capabilities, along with OCI support, make Harbor a good candidate.

I think it's a good idea to hand off the ability to capture detailed metadata to the different artifact authors. Right now you can push anything to Harbor, but none of the metadata comes through. Can you have a proposal ready for discussion by the next community meeting?

@gaocegege
Author

gaocegege commented May 26, 2020

it's great to see harbor being used for hosting common artifacts used in machine learning projects like Kubeflow.

Yeah, we are glad to contribute our model specification to Kubeflow when it is mature.

Can you have a proposal ready for discussion by the next community meeting?

Yeah, I will submit the proposal this Friday or next Monday with technical details.

@xaleeks
Contributor

xaleeks commented May 26, 2020

@gaocegege that's great, looking forward to hearing more at the community meeting next Wed :)

@hyy0322
Contributor

hyy0322 commented May 31, 2020

@gaocegege
Author

Ref goharbor/community#143

@gaocegege
Author

After the discussions in the community call yesterday, we will add the technical details about the in-tree implementation in our design proposal. It's WIP.

@xaleeks
Contributor

xaleeks commented Jul 1, 2020

It seems we might be able to deliver this in the v2.1 time frame, so I'm soft-tagging this for 2.1 to keep track. Really appreciate the help here! @gaocegege

@gaocegege
Author

Thanks to the community. Things we need to do next:

  • Support additional layers in custom artifacts.

@steven-zou
Contributor

This feature has been delivered in the Harbor v2.1 release.

Closing this issue.
