Proposal D: Namespaced and Graph-based Compatibility Metadata Schema #9
I will wait to review further until you update the proposal as you mentioned in the Slack channel.

Although I have one more question... where would the schema and plugins provided by organisations live? Do you envision a central repo under OCI, or dedicated repos for specific organisations?

Either of these cases - right now I'm storing them at https://github.com/supercontainers/compspec and they are referenced in the generated artifacts shown here.
I'm definitely concerned that this won't work for runtimes.
At a higher level, I worry that multiple conflicting graphs could be used to deploy workloads on nodes that are unexpected. If different tools parse different parts of the spec, ignoring the parts they don't understand, an attacker could leverage that to sneak a workload onto a cluster bypassing various scanners and checks. This exists with all the proposals, but increases in risk with complexity.
I'd also avoid including the schema in the generated json if it's not needed to parse the input. And if it is needed to parse it, then runtimes cannot work when airgapped, and images will break when a 3rd party service has an outage.
I know this wasn't liked, but I do think we need two separate things here.

Proposal is updated! This is a round 1 update because I have not yet considered the TODO in this issue, needing to represent relationships for preferences in the spec itself.
docs/proposals/PROPOSAL_D.md (outdated)

- [x] As a system runtime administrator, I want to check whether a container is compatible with the nodes I am going to run it on using the provided tool.
- [x] As a system runtime administrator, I would like to fetch additional documentation for understanding specific settings in the compatibility spec.
- [x] As a system runtime administrator, selecting which image to run should only require pulling the Index manifest, and parsing the descriptors listed.
I believe this one is mutually exclusive with line 407.
I just copy-pasted the contents from the requirements file.
This was a reference to checking the item. If we say "I want to update compatibility independently without having to re-release and re-distribute my image" is provided by this implementation, then I don't think we can also say a runtime can select an image with only the Index manifest. Runtimes would need to pull the associated referrers to support that.
I checked it because you still technically could - it would work as it does now.
@sudo-bmitch I removed this box, because the implication is "I want to get compatibility information only using the index" and not "I want to still be able to select an image" (how I read it).
docs/proposals/PROPOSAL_D.md (outdated)

### Security Administrator

- [x] As a security administrator, I want predictable behavior from runtimes, which does not change based on unsigned content.
Does this require the compatibility artifact to also be signed?
I just copy-pasted the contents from the requirements file.
What is the runtime behavior if the image is signed, but the compatibility artifact is not?
If the two are pushed from the same build CI I think this case would be unlikely. But if it happened, likely the runtime would not use it. Thankfully in HPC land we rarely do proper signing and checking of things, it's an ideal more than anything else.
For the items here that generated discussion, I think it's useful to capture the thought process around a check (or lack thereof) with a comment for those viewing the merged proposal later. In the other proposals, we've been placing those _(inside parentheses and in italics)_.

For this item, I'd add "runtimes should ignore unsigned or untrusted artifacts if signed images are required, even if the image itself is signed by a trusted authority".
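The suggested rule could be sketched as a small policy gate a runtime might apply before consulting a compatibility artifact. This is a hypothetical illustration of the stated policy, not an implementation from any proposal; `usable_artifact` and its parameters are made-up names:

```python
def usable_artifact(image_signed: bool, artifact_signed: bool, require_signatures: bool) -> bool:
    """Only consult a compatibility artifact when it meets the same
    signing bar as the image it describes."""
    if not require_signatures:
        return True
    # Even a trusted, signed image does not grant trust to an unsigned artifact
    return image_signed and artifact_signed

# A signed image with an unsigned artifact: the artifact is ignored
assert not usable_artifact(image_signed=True, artifact_signed=False, require_signatures=True)
assert usable_artifact(image_signed=True, artifact_signed=True, require_signatures=True)
```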
You got it!
The proposal is definitely interesting. I think we have to discuss image selection influenced by compatibility defined in the artifact; there are many concerns around that.

Proposal is updated to include plugin design (not required, but introspection for future work) and an explicit answer to the question about needing graphs.
"cpu.vendor": "GenuineIntel" | ||
} | ||
}, | ||
{ |
If there are multiple compatibilities listed here, must a node meet all of them, just one, or some subset?
That is up to the tool using the artifact. The metadata is provided with flexibility in mind.
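To make "up to the tool" concrete, a consumer of the artifact could expose the match policy as a parameter. This is a hypothetical sketch; `node_matches` and the policy names are illustrative, not part of the proposal:

```python
def node_matches(node: dict, groups: list[dict], policy: str = "any") -> bool:
    """Check a node's attributes against a list of compatibility groups.

    Each group is a dict of required attribute values; the tool decides
    whether a node must satisfy all groups or just one.
    """
    def group_ok(group: dict) -> bool:
        return all(node.get(key) == value for key, value in group.items())

    results = [group_ok(g) for g in groups]
    return all(results) if policy == "all" else any(results)

node = {"cpu.vendor": "GenuineIntel", "os.arch": "amd64"}
groups = [{"cpu.vendor": "GenuineIntel"}, {"cpu.vendor": "AuthenticAMD"}]

assert node_matches(node, groups, policy="any")      # satisfies the Intel group
assert not node_matches(node, groups, policy="all")  # cannot be both vendors
```

The same metadata supports either semantic; the artifact stays declarative and the policy lives in the tool.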
Note for the working group: I started a (more properly plugin-based) Python module tonight, and moved compspec-go there as well: https://github.com/compspec/. I'll eventually put more about the specification we decided upon under spec, and likely write some nice tools there to make graphs and other visualizations (web and static). It doesn't really belong under supercontainers because a compatibility specification can be used to describe other kinds of applications (binaries for the HPC use case). I'll be developing this library more this week, but for an example, let's say we have application metadata about I/O needs via IOR. My plan would be to allow installing the plugin and main library:

```console
pip install compspec
pip install compspec-ior
```

And then the extraction UI would be similar to Go, something like:

```console
compspec-py extract --name ior ...
```

And the main library would discover the modules by name, like how we did in snakemake. Any library could write a simple interface (that would be well defined) to work with the main library to plug in to (likely) still compspec-go, which can be used in "all the Go places that containers like to be."

For some background on that original compspec: before I started the converged computing work at the lab I was a bit bored, got hugely into answer set programming (ASP), and wrote this generic library (that used it) "to compare things." ASP (with clingo) is actually the base of the solver in spack. It's not the speediest thing (if I needed to write one I'd use Rust), but it's kind of fun as an exercise to write these little programs. I'm off to bed - more on this in the coming weeks.
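The name-based discovery described above (finding installed plugins by a naming convention, as in snakemake) could be sketched like this. The `discover_plugins` helper and the `compspec_` prefix are assumptions for illustration, not the actual compspec API:

```python
import importlib
import pkgutil

def discover_plugins(prefix: str = "compspec_") -> dict:
    """Find installed plugin modules by naming convention.

    For example, installing the package compspec-ior would provide a
    module named compspec_ior, registered here under the key "ior".
    """
    plugins = {}
    for _, name, _ in pkgutil.iter_modules():
        if name.startswith(prefix):
            # Strip the prefix so "compspec_ior" is keyed as "ior"
            plugins[name[len(prefix):]] = importlib.import_module(name)
    return plugins
```

The main library could then dispatch `extract --name ior` to the plugin registered under that key.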
Is that only information about the experiment, or should we treat it as something related to the proposal itself? For now, I am assuming the latter.

Do I understand correctly that you want to allow plugins to be developed in any language? If yes, then a few questions (some of which I also got for my proposal):
It's a quick and easy example showing that I am empowered to build things using it!
Given that the artifacts can be used for off table use cases, I don't see why not.
Whoever has a vested interest: communities, companies, individuals - it doesn't matter.
You could ask this about any language. Generally speaking if something isn't allowed there needs to be a creative way to still run it (e.g., somewhere else) or go to leadership and argue the case and inspire change. This was the container story in HPC - nobody let us run them on clusters at first.
How do you make sure any software is secure for consumers?
You use whatever package manager makes sense. If you need that level of checking, you'd probably want to verify the SHA. And for the second question, every language has a different strategy for plugins - Python's just happens to be more flexible. Nushell is a cool example that allows for any language.
However you decide to. I don't know what a lot of these questions have to do with the proposal, or any proposal here. We design the compatibility artifact, and people are empowered to build things with it. The things they build are not under our control or decision here, but have to grow organically to adopt good practices.
I already am.
This has a lot to do with the proposals presented here. We don't only design the compatibility artifact, but also how it can be used later if we decide to release an official OCI tool or libraries for that, especially if we want to enable it for container runtimes in the image selection use case. That's why I asked those questions: to find out if you have some good ideas for that. My questions were only about methods for how we could build tooling around the artifact, not to jeopardize the proposal. If you propose that users can do anything, then I have no more questions. Let it be.
I think we should scope plugin discussion to some phase II of our group work - arguably if we make plugins for any artifact (or non-artifact) spec we will have similar questions. The plugin design here was an attempt to think through some of my ideas and anticipate that, but doesn't need to be considered formally part of the proposal. My main reason was that I was going to start working on tools / plugins and wanted to write down the design. It might help to scope initial discussion to just the "json parts" and acknowledge the desire for plugins (and come back to it).
"type": "compspec", | ||
"label": "compatibilities", | ||
"nodes": { | ||
"mpi": { |
Another question about the separation of schema and compatibility artifact: compatibility artifacts rely on the schema. Suppose lots of compatibility artifacts have been delivered to production environments, referring to an already-published schema version. If someday a big issue is found in the schema and fixed in a new version, the issue (the relationships) persists in all delivered compatibility artifacts; to fix it, all compatibility artifacts would have to be re-released against the new schema.

From this perspective, a self-contained schema (customizable node relationships) inside the compatibility artifact would reduce issue propagation and reduce the fix cost.
That's generally how it works with software too, though. It's better to have versioning than not, I think. Technically speaking, if you just ignore the schema (and version), you could have a "self-contained" schema. Having the entire schema within the artifact is not reasonable from a practicality standpoint. As an example, here is just the start of IOR; it doesn't even include output types yet: compspec/schemas@764520d
And that's just one namespace (I/O); there would be multiple defined for one file. It's huge redundancy for very little benefit, IMHO. And if there were some issue with the schema, instead of updating one place you'd still need to update the many (thousands?) of artifacts instead. We likely just need some way to patch or give a directive to those using the old schema.
@ChaoyiHuang you make a good point for why we don't want to embed logic for the actual choice in the artifact, because there could be some "big issue" - the implication being that it lies with the logic of the selection. That is why I advocate for an approach where the compatibility specification is just that: an artifact with information, where the way that information is used to decide on image selection (the algorithm / logic) is not hard-coded there. That's the main way you'd run into an issue like you are describing. If it's just adding or removing fields, that is much less likely to warrant a crisis.
> if we want to fix the issue, all compatibility artifacts have to be re-released using new schema.
That's also the case with any of these proposals: if there is a change to anything (field, logic, etc.) that warrants a change to the artifact or image manifest. I would argue my approach is more flexible here, because often you can change the schema and then have some way to say "support previous versions," and there is no need to touch the artifacts. The other proposals, if they require everything hard-coded into the artifact, cannot support that. :)
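One hedged sketch of what "support previous versions" could mean in practice: a consumer declares an accepted schema version range per namespace, so older artifacts keep working without re-release. `SUPPORTED_SCHEMAS`, the version numbers, and the `io.archspec` range here are all hypothetical:

```python
# Hypothetical: a tool accepting artifacts built against older schema versions
SUPPORTED_SCHEMAS = {
    "io.archspec": {"min": "0.1.0", "max": "0.3.0"},
}

def parse(version: str) -> tuple:
    """Turn "0.2.1" into a comparable tuple (0, 2, 1)."""
    return tuple(int(part) for part in version.split("."))

def accepts(namespace: str, version: str) -> bool:
    """Accept any artifact whose declared schema version falls in range."""
    rng = SUPPORTED_SCHEMAS.get(namespace)
    if rng is None:
        return False
    return parse(rng["min"]) <= parse(version) <= parse(rng["max"])

assert accepts("io.archspec", "0.2.1")       # an older artifact still works
assert not accepts("io.archspec", "0.4.0")   # too new for this consumer
```

The artifacts are untouched; only the consumer's accepted range changes when the schema is patched.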
The question is about building artifact-specific compatibility relationships into the schema:

If the first io.archspec entry is about CPU, and the second one is about MPI and GPU, how do we express compatibility requirements like: 1) cpu GenuineIntel amd64 + mpi v1.1 + NVIDIA GPU + (NVIDIA InfiniBand or Arista InfiniBand), or 2) cpu GenuineAMD amd64 + mpi v1.2 + AMD GPU + Arista InfiniBand?
That is also up to the tool. The relationships between things are defined by the upper-level schema, if that is desired, but it doesn't have to be used.
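As one hedged sketch of how a tool (not the artifact or schema) might encode the AND/OR combinations from the question above: a tiny boolean expression tree evaluated against node attributes. The keys like `fabric.vendor` and the structure are made up for illustration:

```python
def evaluate(expr: dict, node: dict) -> bool:
    """Recursively evaluate a requirement expression against node attributes."""
    op = expr.get("op")
    if op == "and":
        return all(evaluate(term, node) for term in expr["terms"])
    if op == "or":
        return any(evaluate(term, node) for term in expr["terms"])
    # Leaf: a single attribute equality check
    return node.get(expr["key"]) == expr["value"]

# Requirement 1: Intel CPU + MPI v1.1 + (NVIDIA or Arista InfiniBand)
requirement = {"op": "and", "terms": [
    {"key": "cpu.vendor", "value": "GenuineIntel"},
    {"key": "mpi.version", "value": "v1.1"},
    {"op": "or", "terms": [
        {"key": "fabric.vendor", "value": "nvidia"},
        {"key": "fabric.vendor", "value": "arista"},
    ]},
]}

node = {"cpu.vendor": "GenuineIntel", "mpi.version": "v1.1", "fabric.vendor": "arista"}
assert evaluate(requirement, node)
```

Because the combinations are image-specific, they can live in the tool's (or the image author's) selection logic while the artifact stays a flat list of metadata.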
These kinds of compatibilities and/or relationships and combinations are image-specific, and easier to change than a standard. If they were built into the schema, the problem is what I mentioned in this question.
We can disagree then. Thanks for your feedback.
I'll note that the metadata values themselves are still in the artifact; it's just the namespace, declarations, and relationships that are in the schema. There is exactly the same metadata in the artifact here as there is in, for example, Proposal A, but it's extended to be much more useful in scenarios beyond "match this one tag."
Your argument is also akin to saying we should put the schema for an sbom in every sbom because it might change. That doesn't make sense to me.
Proposal D is an extension to Proposal C. Proposal C defines an explicit example of a compatibility artifact, meaning what a single artifact would look like paired alongside an image in a registry (in some way) to describe its compatibility for image selection or similar. Proposal D defines a compatibility schema that is maintained by a compatibility interest group, for which the goal is to define the namespace of allowed metadata attributes and relationships between them. These two proposals are complementary and would work together to allow for validation and understanding of relationships between terms, but without adding complexity to the compatibility artifact (Proposal C) directly.

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
This proposal defines a simple, easy-to-read-and-understand compatibility spec that describes metadata attributes for compatibility. It can be paired with a compatibility schema that is maintained by a compatibility interest group, for which the goal is to define the namespace of allowed metadata attributes and the relationships between them for the artifact. For the latter, the format is JGF ("JSON Graph Format"), so no new structure needs to be proposed by the working group. These two documents (the schema and artifact) are complementary and would work together to allow for validation and understanding of relationships between terms, but without adding complexity to the compatibility artifact directly. The expected use cases are image selection and scheduling, both of which I am prototyping and actively running experiments for.
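As a rough sketch of what a JGF-style schema fragment and a trivial traversal might look like (the node names, metadata, and "contains" relation are illustrative, not a finalized spec), here is the idea as a Python dict:

```python
# Hypothetical JGF-style schema fragment: a namespace node and an
# attribute node, connected by a "contains" relation.
schema = {
    "graph": {
        "label": "compatibilities",
        "nodes": {
            "mpi": {"metadata": {"type": "namespace"}},
            "mpi.version": {"metadata": {"type": "attribute"}},
        },
        "edges": [
            {"source": "mpi", "target": "mpi.version", "relation": "contains"},
        ],
    }
}

# A validating tool could walk the edges to learn which attributes
# belong to a namespace, without that structure living in the artifact.
children = [
    edge["target"]
    for edge in schema["graph"]["edges"]
    if edge["source"] == "mpi"
]
assert children == ["mpi.version"]
```

The artifact then only needs to carry flat attribute values like `mpi.version`; the graph relationships stay in the separately maintained schema.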