@vsoch vsoch commented Jan 15, 2024

This proposal focuses on a simple design for compatibility metadata stored either in an existing manifest (or list) or in a newly created artifact. It describes namespaced attributes (the metadata) maintained by compatibility interest groups, and a plugin framework with plugins for extraction, checking, and creation, each flexible enough to support simple compatibility checks (e.g., key/value pair matching) or more complex graph-based approaches.

@vsoch vsoch changed the title from "add: proposal C for working compatibility group" to "Proposal C: Namespaced Compatibility Metadata Maintained by Compatibility Communities" Jan 15, 2024
This proposal is focused on a simple design for metadata about compatibility
within either an existing manifest (or list) or a newly created artifact.
It describes a plugin architecture and namespaced attributes (the metadata)
that are maintained by compatibility interest groups, and a plugin framework
that includes plugins for extracting, checking, and creation, and within
each flexibility for simple compatibility checks (e.g., key/value pair matching)
or more complex graph-based approaches.

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
- Compatibility interest groups can develop artifacts that point TO the image (there is no required approval / permission to develop some niche compatibility definition).
- **Cons**:
- Updating compatibility means also updating the manifest or list
- Compatibility interest groups need to advocate to get their metadata added (unlikely / challenging)
Collaborator

One more con: the host has to run third-party plugins, which may introduce a security risk to the host, especially in production environments.

Collaborator Author

@ChaoyiHuang if the plugins are libraries integrated into the runtime tools, wouldn't this just be part of the tool (and thus vetted by it)?

Collaborator

If the plugins are libraries integrated into the runtime tools, does that mean I have to recompile the tool each time there is a new plugin I am interested in? Or do you mean a library loaded at runtime?

Considering the first case: if we maintain an official tool that validates compatibility via plugins, do we then force users to fork the tool to add their own plugins?

Collaborator Author

There is probably room for creativity here too - I could imagine several designs. The first is based on common patterns of checking and then feeding in yaml configs. For example, the graph tool that I am prototyping takes in both the structure of the graph and the specific entity you want to see if there is a match for as yaml or json. A simple “match these key value pairs” would do the same. So in this case of “common patterns of checking” the plugin would more so represent a pattern and the user would point at some directory of configs for that. You wouldn't need to rebuild anything.

The more embedded library use case (assuming working in Go) might look like the Kubernetes scheduler, where you have a bunch of named plugins that each expose different interfaces and can change / enable / disable them on the fly via profiles with the config yaml file given to the run or start command. And no rebuild needed. For that case you’d likely have the plugins built either separately as binaries or into the tool. There is also RPC as an idea!

The third case could be like a traditional library in Go, and runtime tools would add the modules and checks they care about, packages with releases (no rebuild). That seems like it would be most appropriate for atomic plugins that are tested / vetted before releases.

Likely the sweet spot is somewhere between those two extremes - having a component or interface built into the tool that is customized with configs and only warrants rebuild or change with new releases of the tool, in the same way we have new releases for new features or bug fixes.
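
To make the config-driven option above concrete, here is a minimal Python sketch. It assumes a hypothetical registry of named check "patterns" built into the tool and a config (which would normally be loaded from yaml or json) that enables or disables them without a rebuild. All names (`CHECKERS`, `match_pairs`, `run_checks`, the config layout, the example keys) are illustrative assumptions and not part of the proposal.

```python
import json

# Hypothetical registry: each named "pattern" is a function that takes the
# required metadata and the host's observed metadata, and returns a bool.
CHECKERS = {}


def checker(name):
    """Register a check pattern under a name that a config file can refer to."""
    def register(func):
        CHECKERS[name] = func
        return func
    return register


@checker("match-pairs")
def match_pairs(required, observed):
    """Simple pattern: every required key must be present with an equal value."""
    return all(observed.get(key) == value for key, value in required.items())


def run_checks(config, observed):
    """Run each enabled pattern from the config; compatible only if all pass."""
    results = {}
    for entry in config["checks"]:
        if not entry.get("enabled", True):
            continue  # disabled via config, no rebuild needed
        func = CHECKERS[entry["pattern"]]
        results[entry["name"]] = func(entry["required"], observed)
    return all(results.values()), results


# Illustrative config; in practice this would be read from a yaml or json file.
CONFIG = json.loads("""
{
  "checks": [
    {"name": "hardware", "pattern": "match-pairs",
     "required": {"arch": "amd64", "gpu": "nvidia"}},
    {"name": "mpi", "pattern": "match-pairs", "enabled": false,
     "required": {"mpi": "openmpi"}}
  ]
}
""")

HOST = {"arch": "amd64", "gpu": "nvidia", "mpi": "mpich"}
compatible, details = run_checks(CONFIG, HOST)
```

In this sketch a new pattern would ship with a new release of the tool, while enabling, disabling, or re-parameterizing an existing pattern only requires editing the config.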

- Empowers compatibility interest groups (even niche) to create artifacts for images
- **Cons**:
- Can easily be separated from the original manifest / image (would require tooling to check for references)
- Requires additional queries
Collaborator

One more con: not all registries support OCI image spec 1.1, so we have to consider a fallback method to check whether a compatibility artifact exists.

Collaborator Author

Can you clarify this point? Which particular aspect of 1.1 is not supported and what would be the fallback? Arguably if we add some other support for annotations directly in a manifest that would also be a new type that maybe is not supported.

Collaborator

For example, Quay does not support the referrers API yet; an issue was opened, but there has been no bandwidth to finish it.

Collaborator Author

Ah, I understand! So any approach that wants to rely on a separate artifact (using the referrers API) would still need a fallback method to check. Likely the simplest thing would be storing the artifact with some known identifier (e.g., :tag -> tag-compat) but that's a bit janky. Worst case, the user / tool is in charge of matching images to artifacts (not linked officially) to check. But the simple approach of storing it in the manifest might work, of course assuming the format of the manifest with said metadata is supported too. It's a hard problem indeed!

Collaborator

> But the simple approach of storing it in the manifest might work

Do you mean storing it in the image manifest?

I have an idea for a fallback method when the registry does not support the referrers API:

  1. The image compatibility tool fetches the image index first.
  2. It iterates over the descriptors in the manifests list and pulls each manifest.
  3. It checks each manifest's artifact type; if it is a compatibility artifact, it then checks whether the subject refers to the image the tool wants to check/validate.

Through the above process, the tool should be able to find the right compatibility artifact.
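
The three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the artifact type string is a made-up placeholder, the registry client is abstracted into a `fetch_manifest` callable, and the manifests are plain dicts using the `artifactType` and `subject` fields from OCI image spec 1.1.

```python
# Assumed artifact type for compatibility artifacts (hypothetical value).
COMPAT_ARTIFACT_TYPE = "application/vnd.example.compatibility.v1+json"


def find_compat_artifacts(index, fetch_manifest, image_digest):
    """Return digests of compatibility artifacts whose subject is image_digest.

    index: parsed image index (dict with a "manifests" list of descriptors)
    fetch_manifest: callable(digest) -> parsed manifest dict; the registry
        client is deliberately left abstract in this sketch
    image_digest: digest of the image the tool wants to check/validate
    """
    found = []
    for descriptor in index.get("manifests", []):
        # Step 2: pull each manifest listed in the index.
        manifest = fetch_manifest(descriptor["digest"])
        # Step 3a: skip anything that is not a compatibility artifact.
        if manifest.get("artifactType") != COMPAT_ARTIFACT_TYPE:
            continue
        # Step 3b: keep it only if its subject points at our image.
        subject = manifest.get("subject") or {}
        if subject.get("digest") == image_digest:
            found.append(descriptor["digest"])
    return found


# In-memory stand-in for a registry, with shortened fake digests.
MANIFESTS = {
    "sha256:aaa": {"config": {}, "layers": []},
    "sha256:bbb": {"artifactType": COMPAT_ARTIFACT_TYPE,
                   "subject": {"digest": "sha256:aaa"}},
    "sha256:ccc": {"artifactType": COMPAT_ARTIFACT_TYPE,
                   "subject": {"digest": "sha256:zzz"}},
}
INDEX = {"manifests": [{"digest": digest} for digest in MANIFESTS]}

matches = find_compat_artifacts(INDEX, MANIFESTS.__getitem__, "sha256:aaa")
```

Note that this costs one manifest pull per descriptor in the index, which is part of the trade-off compared with a single referrers query.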

Collaborator

Docker Hub doesn't support it either.


Depending on the distribution path chosen, additional discussion can be added to this proposal about the following points:

- Artifact discovery: in the case of choosing an artifact, if a registry supports the referrers API it should be used for discovery. Otherwise, artifacts need to be pulled directly via a known unique resource identifier.
Collaborator

> Otherwise, artifacts need to be pulled directly via a known unique resource identifier.

That can be a tag.
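
As a rough sketch of what "referrers first, tag otherwise" could look like, the helper below only builds the request paths a tool might try in order. The `/v2/<name>/referrers/<digest>` and `/v2/<name>/manifests/<reference>` paths follow the OCI distribution spec; the `-compat` tag suffix is purely an assumed convention borrowed from the "tag -> tag-compat" idea in this thread, and the function name is hypothetical.

```python
def discovery_paths(repository, image_digest, image_tag, suffix="-compat"):
    """Return registry paths to try, in order, to find a compatibility artifact.

    1. The referrers API for the image digest (if the registry supports it).
    2. A manifest stored under a well-known tag derived from the image tag
       (assumed convention for this sketch, not something the spec defines).
    """
    return [
        f"/v2/{repository}/referrers/{image_digest}",
        f"/v2/{repository}/manifests/{image_tag}{suffix}",
    ]


paths = discovery_paths("library/app", "sha256:abc123", "1.2.0")
```

A client would request the first path and move on to the second only if the registry reports that the referrers endpoint is unavailable.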


### Image Manifest

The JSON object could go directly in the image manifest or list to not require any additional artifact query or queries. For runtime tools, this would mean (still) just one call to retrieve all needed metadata to do a check. There would be no additional calls to retrieve plugins. Any compatibility checking plugins (discussed later in this proposal) would already be vetted and installed alongside the runtime tools.
Collaborator

Do you consider container runtimes (high-level ones such as containerd, Docker, etc.) part of the runtime tools category?

Collaborator Author

I don't see why not.

Collaborator

@mfranczy mfranczy Jan 17, 2024

I recall that somebody in the meeting said that container runtime implementations may reject manifests that are too big. I am afraid that if we put compatibility requirements in the index or image manifest, it may be rejected. This can be discussed at the meeting.

@Toasterson

I think we are addressing two different topics with the different specs. One topic is what we want to read, and the other is how we read it.

@vsoch
Collaborator Author

vsoch commented Jan 16, 2024

> I think we have adress two different topics with the different specs. One topics is what we want to read and the other how we read it.

Yep, totally agree! In my design above, the "what we want to read" is an extractor plugin (reading information from the environment) and that is up to the jurisdiction of a plugin developer (which might actually be us for these core cases) and how we want to read it is the design of that extractor plugin (e.g., returning back to some checking plugin a standard format of metadata to parse).
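
A small Python sketch of that split, with an extractor that reads from the environment and a checker that only sees the standard dict the extractor returns. The function names and the metadata keys (`machine`, `system`) are hypothetical choices for illustration; `platform.machine()` and `platform.system()` are standard library calls.

```python
import platform


def extract_host_metadata(machine=None, system=None):
    """Extractor sketch: read facts from the environment into a flat dict.

    The optional arguments override the detected values, which keeps the
    example deterministic and testable.
    """
    return {
        "machine": machine or platform.machine(),
        "system": system or platform.system(),
    }


def check(required, extracted):
    """Checker sketch: True only if every required key/value pair matches."""
    return all(extracted.get(key) == value for key, value in required.items())


host = extract_host_metadata(machine="x86_64", system="Linux")
linux_ok = check({"system": "Linux"}, host)
arm_ok = check({"machine": "aarch64"}, host)
```

Because the checker depends only on the returned dict, the "what we read" part (the extractor) can change without touching the "how we compare" part.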

Collaborator

@sudo-bmitch sudo-bmitch left a comment

Non-blocking comment since this is just a proposal.

Nit: one sentence per line is useful for PRs (commenting on a line or viewing diffs).

It would be good to include the intended usage of this spec, for node provisioning, node selection, or image selection at runtime.

I think it would be good to leave this marked as a draft until we get the use cases documented, and then we can describe how this handles the various use cases in the same PR.

- Maintained by a compatibility interest group
- Method needs to return a boolean outcome "yes" or "no"
- Metadata needs to be namespaced to a prefix owned by a CIG.
- A **Namespace** is a named identifier that provides an organization under which users can specify compatibility objects.
Collaborator

Similar to my comments on the other proposal, "namespace" is a pretty overloaded term already in the container ecosystem. It would be good to find a different name.

Collaborator Author

We had prefix before, and I changed to namespace to mirror proposal B. I am definitely good with another term - I actually think "compatibility interest group identifier" is specific (and exactly what we are talking about) and would work here.

Collaborator

How about using "topic", "theme", or "context" instead of namespace?

- Does not require additional query
- Cannot get separated from the image manifest or list
- Compatibility interest groups can develop artifacts that point TO the image (there is no required approval / permission to develop some niche compatibility definition).
- **Cons**:
Collaborator

Another con: manifests have a recommended upper size limit of 4MB, so the spec content needs to be kept small to avoid exceeding that.
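
For a sense of how a tool could guard against this, here is a hedged Python sketch that checks whether a manifest would stay under the limit once compatibility metadata is embedded. It assumes the metadata is stored as a JSON string under a single annotation (the key `org.example.compatibility` is hypothetical), reads "4MB" as 4 MiB, and measures a compact JSON encoding; a real registry stores the exact bytes that were pushed, so the true size depends on how the manifest is serialized.

```python
import json

# Assumption: "4MB" is interpreted as 4 MiB for this sketch.
RECOMMENDED_MANIFEST_LIMIT = 4 * 1024 * 1024


def manifest_size(manifest):
    """Size in bytes of the manifest when encoded as compact UTF-8 JSON."""
    return len(json.dumps(manifest, separators=(",", ":")).encode("utf-8"))


def fits_in_manifest(manifest, compat, limit=RECOMMENDED_MANIFEST_LIMIT):
    """Return True if the manifest plus embedded compat metadata is <= limit."""
    candidate = dict(manifest)
    annotations = dict(candidate.get("annotations", {}))
    # Annotation values are strings, so the metadata is embedded as JSON text.
    annotations["org.example.compatibility"] = json.dumps(compat)  # hypothetical key
    candidate["annotations"] = annotations
    return manifest_size(candidate) <= limit


small_ok = fits_in_manifest({"schemaVersion": 2}, {"arch": "amd64"})
large_ok = fits_in_manifest({"schemaVersion": 2}, {"blob": "x" * (5 * 1024 * 1024)})
```

A tool could fall back to an external artifact whenever this check fails.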

Collaborator Author

I'm worried about this too. I am thinking I like the approach of separation of concern - "atomic" compatibility identifiers (a small set) might go in manifests, and everything else might be external / elsewhere.

Collaborator

My previous comment was to extend CPU features in platform.features; additional factors that impact image selection could become new fields in platform. Anything that has nothing to do with image selection is better described in an external artifact.


### Image Manifest _and_ Compatibility Artifact

We could get the best of both worlds by taking both approaches, and allowing the compatibility artifact to live in either place:
Collaborator

I tend to avoid this because it doubles the complexity needed in tooling, creates multiple code paths that need to be tested, and creates ambiguity if both options are used with conflicting data.

Collaborator Author

Yeah, it was just a suggestion because I think we are talking about a common JSON object, but where you put it depends on the importance of it (atomic vs. community maintained). Perhaps we can find a balance, because each has pros and cons.


### Simple Example

An example JSON object with key value pairs is provided below:
Collaborator

I worry about intermixing dots in the name to indicate a hierarchical structure with a hierarchical structure itself. JSON parsers I've used aren't going to automatically extrapolate that structure and instead put the dots in the key field. That means tooling would need to check for an "org", "org.supercontainers", and "org.supercontainers.hardware" key to see if a setting is defined. I'd lean towards either fully hierarchical or completely flat to avoid ambiguity.

Collaborator Author

+1 for more hierarchical. The completely flat approach means we need to do extra parsing to get out groups of things.
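
To illustrate the ambiguity discussed here, the Python sketch below shows that a standard JSON parser keeps a dotted key as one literal string, and gives one possible way to normalize flat dotted keys into a hierarchy. The `org.supercontainers.hardware` prefix comes from the comment above; the `gpu.vendor` leaf and the `nest` helper are made up for the example.

```python
import json

flat = json.loads('{"org.supercontainers.hardware.gpu.vendor": "nvidia"}')
# The parser does not infer any structure: the only key is the dotted string.
flat_keys = list(flat)


def nest(flat_mapping):
    """Expand dotted keys into nested dicts (one possible normalization).

    This sketch does not handle conflicts, e.g. a key that is used both as a
    leaf value and as a parent of other keys.
    """
    nested = {}
    for dotted, value in flat_mapping.items():
        node = nested
        *parents, leaf = dotted.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


nested = nest(flat)
vendor = nested["org"]["supercontainers"]["hardware"]["gpu"]["vendor"]
```

With a fully hierarchical document the lookup on the last line works directly, with no normalization step and no need to probe several prefix keys.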

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
@vsoch
Collaborator Author

vsoch commented Feb 13, 2024

Closing in favor of D, #9, which includes aspects of C here but is an improvement in most respects, I think.

@vsoch vsoch closed this Feb 13, 2024
@ChaoyiHuang ChaoyiHuang mentioned this pull request Mar 6, 2024
5 participants