Proposal D: Namespaced and Graph-based Compatibility Metadata Schema #9

vsoch · 2024-01-30T02:38:19Z

This proposal defines a simple, easy-to-read compatibility spec that describes an artifact's metadata attributes. It can be paired with a compatibility schema, maintained by a compatibility interest group, whose goal is to define the namespace of allowed metadata attributes and the relationships between them. The schema uses JSON Graph Format (JGF), so the working group does not need to propose a new structure. The two documents (the schema and the artifact) are complementary: together they allow validation and an understanding of the relationships between terms, without adding complexity to the compatibility artifact itself. The expected use cases are image selection and scheduling, both of which I am prototyping and actively running experiments for.
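As a rough illustration of how the two documents could fit together, the sketch below pairs a small artifact with a JGF-style schema graph and checks that every attribute the artifact uses is declared by the schema. The artifact field names (`compatibilities`, `name`, `attributes`), the node ids, and the `contains` relation are assumptions made for this example, not the layout defined in PROPOSAL_D.md.

```python
# Hypothetical schema in JSON Graph Format: nodes declare the allowed
# attributes of one namespace, edges record relationships between them.
schema = {
    "graph": {
        "id": "io.archspec",
        "type": "compspec",
        "label": "compatibilities",
        "nodes": {
            "cpu": {"label": "cpu"},
            "cpu.vendor": {"label": "cpu vendor"},
            "cpu.model": {"label": "cpu model"},
        },
        "edges": [
            {"source": "cpu", "target": "cpu.vendor", "relation": "contains"},
            {"source": "cpu", "target": "cpu.model", "relation": "contains"},
        ],
    }
}

# Hypothetical artifact: metadata values only, grouped by namespace.
artifact = {
    "compatibilities": [
        {"name": "io.archspec", "attributes": {"cpu.vendor": "GenuineIntel"}},
    ]
}


def undeclared_attributes(artifact, schemas):
    """Return (namespace, attribute) pairs that no known schema declares."""
    missing = []
    for group in artifact["compatibilities"]:
        known = schemas.get(group["name"])
        declared = set(known["graph"]["nodes"]) if known else set()
        for attribute in group["attributes"]:
            if attribute not in declared:
                missing.append((group["name"], attribute))
    return missing


# Index the available schemas by namespace id, then validate the artifact.
schemas = {schema["graph"]["id"]: schema}
print(undeclared_attributes(artifact, schemas))
```

The artifact stays small because it only carries values; everything about which attributes exist and how they relate lives in the graph.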

docs/proposals/PROPOSAL_D.md

mfranczy · 2024-01-30T15:35:28Z

I will wait to review further until you update the proposal as you mentioned in the Slack channel.

mfranczy · 2024-01-30T15:45:20Z

Although I have one more question... Where would the schema and plugins provided by an organisation live? Do you consider a central repo under OCI, or repos dedicated to specific organisations?

vsoch · 2024-01-30T16:14:15Z

> Although I have one more question... Where would the schema and plugins provided by an organisation live? Do you consider a central repo under OCI, or repos dedicated to specific organisations?

Either of these cases - right now I'm storing them at https://github.com/supercontainers/compspec and they are referenced in the generated artifacts shown here.

sudo-bmitch

I'm definitely concerned that this won't work for runtimes.

At a higher level, I worry that multiple conflicting graphs could be used to deploy workloads on nodes that are unexpected. If different tools parse different parts of the spec, ignoring the parts they don't understand, an attacker could leverage that to sneak a workload onto a cluster bypassing various scanners and checks. This exists with all the proposals, but increases in risk with complexity.

I'd also avoid including the schema in the generated json if it's not needed to parse the input. And if it is needed to parse it, then runtimes cannot work when airgapped, and images will break when a 3rd party service has an outage.

vsoch · 2024-01-30T21:44:10Z

> I'm definitely concerned that this won't work for runtimes.

I know this wasn't liked, but I do think we need two separate things here.

vsoch · 2024-02-02T03:14:28Z

Proposal is updated! This is a round 1 update because I have not yet considered the TODO in this issue, needing to represent relationships for preferences in the spec itself.

docs/REQUIREMENTS.md

sudo-bmitch · 2024-02-11T15:32:46Z

docs/proposals/PROPOSAL_D.md

+
+- [x] As a system runtime administrator, I want to check whether a container is compatible with the nodes I am going to run it on using the provided tool.
+- [x] As a system runtime administrator, I would like to fetch additional documentation for understanding specific settings in the compatibility spec.
+- [x] As a system runtime administrator, selecting which image to run should only require pulling the Index manifest, and parsing the descriptors listed.


I believe this one is mutually exclusive with line 407.

I just copy pasted the contents from the requirements file.

This was a reference to checking the item. If we say "I want to update compatibility independently without having to re-release and re-distribute my image" is provided by this implementation, then I don't think we can also say a runtime can select an image with only the Index manifest. Runtimes would need to pull the associated referrers to support that.

I checked it because you still technically could - it would work as it does now.

@sudo-bmitch I removed this box, because the implication is "I want to get compatibility information only using the index" and not "I want to still be able to select an image" (how I read it).

sudo-bmitch · 2024-02-11T15:40:09Z

docs/proposals/PROPOSAL_D.md

+
+### Security Administrator
+
+- [x] As a security administrator, I want predictable behavior from runtimes, which does not change based on unsigned content.


Does this require the compatibility artifact to also be signed?

I just copy pasted the contents from the requirements file.

What is the runtime behavior if the image is signed, but the compatibility artifact is not?

If the two are pushed from the same build CI I think this case would be unlikely. But if it happened, likely the runtime would not use it. Thankfully in HPC land we rarely do proper signing and checking of things, it's an ideal more than anything else.

For the items here that generated discussion, I think it's useful to capture the thought process around a check (or lack thereof) with a comment for those viewing the merged proposal later. In the other proposals, we've been placing those _(inside parentheses and with italics)_.

For this item, I'd add "runtimes should ignore unsigned or untrusted artifacts if signed images are required, even if the image itself is signed by a trusted authority".
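A minimal sketch of that rule, to make the intended behavior concrete. The `SignatureStatus` type and the `should_use_artifact` function are hypothetical names for this example; how a runtime actually establishes trust in a signature is out of scope here.

```python
from dataclasses import dataclass


@dataclass
class SignatureStatus:
    """Hypothetical result of signature checks done elsewhere by the runtime."""
    image_trusted: bool
    artifact_trusted: bool


def should_use_artifact(status, signed_images_required):
    """Decide whether a runtime should consult the compatibility artifact."""
    if not signed_images_required:
        # No signing policy in force: the artifact may be used as-is.
        return True
    # The image's signature does not vouch for the artifact, so
    # status.image_trusted is deliberately not consulted here.
    return status.artifact_trusted
```

With signing required, a trusted image paired with an unsigned artifact yields `False`, so selection falls back to whatever the runtime does without compatibility data.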

You got it!

mfranczy

The proposal is definitely interesting. I think we have to discuss image selection influenced by compatibility defined in the artifact. There are many concerns around that.

docs/proposals/PROPOSAL_D.md

vsoch · 2024-02-13T15:18:14Z

Proposal is updated to include plugin design (not required, but introspection for future work) and an explicit answer to the question about needing graphs.

ChaoyiHuang · 2024-02-20T07:54:39Z

docs/proposals/PROPOSAL_D.md

+        "cpu.vendor": "GenuineIntel"
+      }
+    },
+    {


If multiple compatibilities are listed here, must all of them be met, just one of them, or some of them?

That is up to the tool using the artifact. The metadata is provided with flexibility in mind.
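For example, a tool could expose the choice as a policy. This is only a sketch under assumptions: each compatibility group is reduced to a flat attribute map, the node is described by a flat dictionary, and the `policy` values `"all"` and `"any"` are invented here rather than taken from the proposal.

```python
def matches(node, attributes):
    """True when the node reports every attribute with the expected value."""
    return all(node.get(key) == value for key, value in attributes.items())


def compatible(node, groups, policy="all"):
    """Apply a tool-chosen policy across the artifact's compatibility groups."""
    results = [matches(node, group) for group in groups]
    if policy == "all":
        return all(results)
    if policy == "any":
        return any(results)
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical node description and two compatibility groups.
node = {"cpu.vendor": "GenuineIntel", "mpi.implementation": "openmpi"}
groups = [
    {"cpu.vendor": "GenuineIntel"},
    {"mpi.implementation": "mpich"},
]
print(compatible(node, groups, policy="all"))
print(compatible(node, groups, policy="any"))
```

The artifact is identical in both calls; only the tool's interpretation differs.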

docs/proposals/PROPOSAL_D.md

vsoch · 2024-02-20T08:34:57Z

Note for the working group that I started a (more properly plugin based) Python module tonight, and moved compspec-go there as well: https://github.com/compspec/. I'll eventually put more about the specification we decided upon under spec, and likely write some nice tools to make graphs and other visualizations (web and static) there.

It doesn't really belong under supercontainers because a compatibility specification can be used to describe other kinds of applications (binaries for the HPC use case). I'll be developing this library more this week, but for an example, let's say we have application metadata about I/O needs via IOR. My plan would be to allow installing the main library and the plugin:

pip install compspec
pip install compspec-ior

And then the extraction UI would be similar to go, something like:

compspec-py extract --name ior ...

And the main library would discover the modules by name, like how we did in snakemake. Any library could write a simple interface (that would be well defined) to work with the main library to plug in to (likely) still compspec-go that can be used in "all the go places that containers like to be."
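Name-based discovery of that kind might look roughly like the sketch below. It assumes a naming convention in which a distribution such as `compspec-ior` installs a top-level module called `compspec_ior`; that convention and the function names are guesses for illustration, not the actual compspec interface.

```python
import importlib
import pkgutil

PLUGIN_PREFIX = "compspec_"  # assumed naming convention, e.g. compspec_ior


def plugin_names(module_names, prefix=PLUGIN_PREFIX):
    """Map plugin short names (e.g. "ior") to importable module names."""
    return {
        name[len(prefix):]: name
        for name in module_names
        if name.startswith(prefix) and len(name) > len(prefix)
    }


def discover_plugins():
    """Find installed top-level modules that follow the naming convention."""
    installed = (module.name for module in pkgutil.iter_modules())
    return plugin_names(installed)


def load_plugin(name):
    """Import the module backing a discovered plugin (KeyError if absent)."""
    return importlib.import_module(discover_plugins()[name])
```

Under these assumptions, a command like `compspec-py extract --name ior` would resolve `ior` through `load_plugin("ior")`.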

For some background on that original compspec, before I started the converged computing work at the lab I was a bit bored and got hugely into answer set programming (ASP) and wrote this generic library (that used it) "to compare things." ASP (with clingo) is actually the base of the solver in spack. It's not the speediest thing (if I needed to write one I'd use Rust) but it's kind of fun as an exercise to write these little programs.

I'm off to bed - more on this in the coming weeks.

mfranczy · 2024-02-20T12:39:29Z

Is that only information about an experiment or should we treat that as something related to the proposal itself?

For now, I am assuming the latter.

> Any library could write a simple interface (that would be well defined) to work with the main library to plug in to (likely) still compspec-go

Do I understand correctly that you want to allow plugins to be developed in any language? If yes, then a few questions (some of them I also got in my proposal)

- Who would maintain the libraries?
- In the example you developed plugins with Python. What if I cannot install Python on the host because of a very strict environment and still want to use plugins developed by some org?
- How do you make sure that plugins are secure for consumers?
- In the example you used pip to distribute plugins. If I use a different language, would the main library have to find a plugin by an executable name that has to be added to the $PATH? Additionally, how do I verify the plugins?
- Do you also plan to use plugins (for instance extractors) to generate node labels that can later be matched by the scheduler?

vsoch · 2024-02-20T14:12:51Z

> Is that only information about an experiment or should we treat that as something related to the proposal itself?

It's a quick and easy example showing that I am empowered to build things with it!

> Do I understand correctly that you want to allow plugins to be developed in any language?

Given that the artifacts can be used for off table use cases, I don't see why not.

> Who would maintain the libraries?

Whoever has a vested interest in doing so: communities, companies, individuals. It doesn't matter.

> In the example you developed plugins with Python. What if I cannot install Python on the host because of a very strict environment and still want to use plugins developed by some org?

You could ask this about any language. Generally speaking if something isn't allowed there needs to be a creative way to still run it (e.g., somewhere else) or go to leadership and argue the case and inspire change. This was the container story in HPC - nobody let us run them on clusters at first.

> How do you make sure that plugins are secure for consumers?

How do you make sure any software is secure for consumers?

> In the example you used pip to distribute plugins. If I use a different language, would the main library have to find a plugin by an executable name that has to be added to the $PATH?

You use whatever package manager makes sense. If you need that level of checking you'd probably want to verify the sha. And for the second question, every language has a different strategy for plugins - Python's just happens to be more flexible. Nushell is a cool example that allows for any language.
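Verifying the sha of a downloaded plugin could be as small as the sketch below, which uses only the Python standard library. The function names are illustrative, and the expected digest would have to come from a source the consumer already trusts (for example, the plugin's release notes).

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_plugin(path, expected_sha256):
    """True when the file's digest equals the expected hex digest."""
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(sha256_of(path), expected_sha256.lower())
```

A digest check only shows that the file is the one the publisher released; it says nothing about whether the publisher's code is safe to run.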

> Additionally, how do I verify the plugins?

However you decide to.

I don't know what a lot of these questions have to do with the proposal, or any proposal here. We design the compatibility artifact, and people are empowered to build things with it. The things they build are not under the control / decision of us here, but have to grow organically to adopt good practices.

> Do you also plan to use plugins (for instance extractors) to generate node labels that can later be matched by the scheduler?

I already am.

mfranczy · 2024-02-20T14:37:07Z

> I don't know what a lot of these questions have to do with the proposal, or any proposal here. We design the compatibility artifact, and people are empowered to build things with it. The things they build are not under the control / decision of us here, but have to grow organically to adopt good practices.

This has a lot to do with the proposals presented here. We don't only design the compatibility artifact, but also how it can be used later if we decide to release an official OCI tool or libraries for that, especially if we intend to enable that for container runtimes for the image selection use case. That's why I asked those questions: to find out whether you have some nice ideas for that.

My questions were only meant to find out how we could build things around the tool, not to jeopardize the proposal. If you propose that users can do anything then I have no more questions. Let it be.

vsoch · 2024-02-20T14:44:30Z

> This has a lot to do with the proposals presented here. We don't only design the compatibility artifact, but also how it can be used later if we decide to release an official OCI tool or libraries for that, especially if we intend to enable that for container runtimes for the image selection use case. That's why I asked those questions: to find out whether you have some nice ideas for that.

I think we should scope plugin discussion to some phase II of our group work - arguably if we make plugins for any artifact (or non-artifact) spec we will have similar questions. The plugin design here was an attempt to think through some of my ideas and anticipate that, but doesn't need to be considered formally part of the proposal. My main reason was that I was going to start working on tools / plugins and wanted to write down the design. It might help to scope initial discussion to just the "json parts" and acknowledge the desire for plugins (and come back to it).

ChaoyiHuang · 2024-02-21T02:37:03Z

docs/proposals/PROPOSAL_D.md

+    "type": "compspec",
+    "label": "compatibilities",
+    "nodes": {
+      "mpi": {


Another question about the separation of the schema and the compatibility artifact. Because compatibilities rely on the schema, suppose many compatibility artifacts that refer to an already published schema version have been delivered to production environments. If a big issue is later found in the schema and fixed in a new version, the delivered artifacts still carry the problem (the relationships). To fix it, all compatibility artifacts have to be re-released using the new schema.

From this perspective, keeping a self-contained schema (customizable node relationships) together with the compatibility data would limit how far an issue propagates and reduce the cost of fixing it.

That's generally how it works with software too though. It's better to have versioning than not I think. Technically speaking, if you just ignore the schema (and version) you could have a "self-contained" schema. Having the entire schema within the artifact is not reasonable from a practicality standpoint. For an example, here is just the start of IOR, it doesn't even include output types yet. compspec/schemas@764520d

And that's just one namespace (I/O); there would be multiple defined for one file. It's huge redundancy for very little benefit IMHO. And if there were some issue with the schema, instead of updating one place you'd still need to update the many (thousands of?) artifacts instead. We likely just need some way to patch / give directive to those using the old schema.

@ChaoyiHuang you make a good point for why we don't want to embed logic for the actual choice in the artifact, because there could be some "big issue" - the implication being that it is with the logic of the selection. That is why I advocate for an approach where the compatibility specification is just that - an artifact with information, and the way that information is used to decide on image selection (the algorithm / logic) is not hard coded there. That's the main way you'd run into some issue like you are describing. If it's just adding / removing fields that is much less likely to warrant some crisis.

> if we want to fix the issue, all compatibility artifacts have to be re-released using new schema.

That's also the case with any of these proposals if there is a change to anything (field, logic, etc.) that warrants a change to the artifact or image manifest. I would argue my approach is more flexible to that because often you can change the schema and then have some way to say "support previous versions" and there is no need to touch the artifacts. The other proposals, if they require everything hard coded into the artifact, cannot support that. :)
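One way a "support previous versions" statement could work is for a schema release to list the earlier versions it can still interpret. The sketch below assumes a `supports` list on the schema release and a `schema_version` field on the artifact; both field names and the version strings are invented for this example.

```python
# Hypothetical schema release that declares which earlier versions
# it can still interpret, so published artifacts need no changes.
schema_release = {"version": "0.2.0", "supports": ["0.1.0", "0.1.1"]}


def accepts(schema_release, artifact):
    """True when the artifact targets this version or a supported earlier one."""
    wanted = artifact["schema_version"]
    return wanted == schema_release["version"] or wanted in schema_release["supports"]


old_artifact = {"schema_version": "0.1.0"}
print(accepts(schema_release, old_artifact))
```

The fix lands in one place (the schema release), and artifacts written against `0.1.0` keep validating untouched.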

The question is about building artifact-specific compatibility relationships into the schema:

If the first io.archspec entry is about the CPU, and the second one is about MPI and GPU, how do we express the compatibility requirements: 1) cpu GenuineIntel amd64 + mpi v1.1 + nvidia GPU + (nvidia infiniband or arista infiniband), 2) cpu AuthenticAMD amd64 + mpi v1.2 + AMD GPU + arista infiniband?

That is also up to the tool. The relationships between things are defined by the upper level schema, if that is desired, but it doesn't have to be used.
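To make "up to the tool" concrete, here is one way a tool could encode the two alternatives from the question as a nested rule and evaluate it against a node. The `all` / `any` rule format and every attribute name below are hypothetical; none of this is part of the artifact or the schema in the proposal.

```python
def satisfies(node, rule):
    """Evaluate a nested rule: {"all": [...]}, {"any": [...]}, or a leaf
    attribute map whose every key must equal the node's value."""
    if "all" in rule:
        return all(satisfies(node, part) for part in rule["all"])
    if "any" in rule:
        return any(satisfies(node, part) for part in rule["any"])
    return all(node.get(key) == value for key, value in rule.items())


# The two alternatives from the question, with invented attribute names.
rule = {
    "any": [
        {"all": [
            {"cpu.vendor": "GenuineIntel", "cpu.arch": "amd64",
             "mpi.version": "1.1", "gpu.vendor": "nvidia"},
            {"any": [{"infiniband.vendor": "nvidia"},
                     {"infiniband.vendor": "arista"}]},
        ]},
        {"all": [
            # "AuthenticAMD" is the CPUID vendor string reported by AMD CPUs.
            {"cpu.vendor": "AuthenticAMD", "cpu.arch": "amd64",
             "mpi.version": "1.2", "gpu.vendor": "amd",
             "infiniband.vendor": "arista"},
        ]},
    ]
}

intel_node = {"cpu.vendor": "GenuineIntel", "cpu.arch": "amd64",
              "mpi.version": "1.1", "gpu.vendor": "nvidia",
              "infiniband.vendor": "arista"}
print(satisfies(intel_node, rule))
```

Because the rule lives in the tool (or in an image-specific policy file), it can change without touching either the schema or the published artifact.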

These kinds of compatibilities, relationships, and combinations are image-specific and change more easily than a standard. If they were built into the schema, the problem is the one I described in this question.

We can disagree then. Thanks for your feedback.

I'll note that the metadata values themselves are still in the artifact. It's just the namespace, declarations, and relationships that are in the schema. There is exactly the same metadata in the artifact here as there is in, for example, proposal A, but it's extended to be much more useful in scenarios beyond "Match this one tag."

Your argument is also akin to saying we should put the schema for an sbom in every sbom because it might change. That doesn't make sense to me.

Proposal D is an extension to Proposal C. Proposal C defines an explicit example of a compatibility artifact, meaning what a single artifact would look like paired alongside an image in a registry (in some way) to describe its compatibility for image selection or similar. Proposal D defines a compatibility schema that is maintained by a compatibility interest group, for which the goal is to define the namespace of allowed metadata attributes and relationships between them. These two proposals are complementary and would work together to allow for validation and understanding of relationships between terms, but without adding complexity to the compatibility artifact (Proposal C) directly.

Signed-off-by: vsoch <vsoch@users.noreply.github.com>

vsoch requested review from ChaoyiHuang, cyphar, mfranczy, neersighted and sudo-bmitch as code owners January 30, 2024 02:38

vsoch force-pushed the proposal-d branch 3 times, most recently from bd7fe02 to 24fbc14 Compare January 30, 2024 02:57

mfranczy reviewed Jan 30, 2024

sudo-bmitch reviewed Jan 30, 2024

vsoch mentioned this pull request Feb 2, 2024

add graph prototype compspec/schemas#6

Merged

sudo-bmitch reviewed Feb 11, 2024

vsoch force-pushed the proposal-d branch from c9200a5 to b6408a6 Compare February 11, 2024 17:02

mfranczy reviewed Feb 13, 2024

vsoch mentioned this pull request Feb 13, 2024

Proposal C: Namespaced Compatibility Metadata Maintained by Compatibility Communities #8

Closed

vsoch force-pushed the proposal-d branch from ea00189 to 2d6af72 Compare February 13, 2024 15:28

ChaoyiHuang reviewed Feb 20, 2024

vsoch force-pushed the proposal-d branch from 2d6af72 to 9dc0def Compare February 20, 2024 08:29

vsoch mentioned this pull request Feb 20, 2024

Dinosaur TODO compspec/compspec#18

Closed

8 tasks

mfranczy previously approved these changes Feb 20, 2024

ChaoyiHuang reviewed Feb 21, 2024

vsoch dismissed mfranczy’s stale review via 0ffee93 February 21, 2024 02:45

vsoch force-pushed the proposal-d branch from 9dc0def to 0ffee93 Compare February 21, 2024 02:45

mfranczy previously approved these changes Feb 21, 2024

vsoch dismissed mfranczy’s stale review via f0e4d35 February 21, 2024 18:40

vsoch force-pushed the proposal-d branch from 0ffee93 to f0e4d35 Compare February 21, 2024 18:40

sudo-bmitch approved these changes Feb 21, 2024

mfranczy approved these changes Feb 21, 2024

mfranczy merged commit 1fddd9d into opencontainers:main Feb 21, 2024

