Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intra-blob JSON validation correctness and image-spec release policy #406

Open
wking opened this issue Oct 20, 2016 · 13 comments
Open

Intra-blob JSON validation correctness and image-spec release policy #406

wking opened this issue Oct 20, 2016 · 13 comments

Comments

@wking
Copy link
Contributor

wking commented Oct 20, 2016

We're moving towards keeping all intra-blob JSON validation in this repo since this discussion and @stevvooe has been redirecting intra-blob JSON validation PRs from image-tools to this repo. But working up #358 and #404 has left me concerned about how well our JSON Schema covers our spec. And JSON Schema is currently the only validation we do here, although see #403 about preparing to do more. With that intra-blob JSON validation happening in image-spec, we have to do at least one of:

a. Block image-spec releases until intra-blob JSON validation is robust enough for us to use it for certification.
b. Cut image-spec releases when the Markdown is ready, and then cut patch releases in a 1.0.x branch (or something) as the validation improves. And probably port those validation fixes between the 1.0.x branch (or whatever) and the master branch.
c. Cover any errors or deficiencies from the released image-spec intra-blob JSON validation with image-tool intra-blob JSON validation. For example, if we'd cut 1.0 without something like #404 landing, image-tools would have to land something like opencontainers/image-tools#45 to cover.

None of these sound particularly appealing to me, but as long as validation (even if it's just JSON Schema validation) lives in image-spec, my personal preference is (a). Do we want to form an official policy on how we'll handle this before we cut 1.0?

@vbatts
Copy link
Member

vbatts commented Mar 9, 2017

@wking I read this, and am not real clear on the point? I'm inclined to close this as I feel like I haven't heard how it's critical for v1 considering.

@wking
Copy link
Contributor Author

wking commented Mar 9, 2017 via email

@vbatts
Copy link
Member

vbatts commented Mar 9, 2017

again, you just restated the same kind of words with out clarifying the issue you're seeing. I'm not clear.

@wking
Copy link
Contributor Author

wking commented Mar 9, 2017 via email

@xiekeyang
Copy link
Contributor

@vbatts
We might need more validation on JSON objects, besides schema/*.json, such as how to handle dis-matched result, and should valid child fields media-type if exist. Current schema/validator.go take some validation work, but seems not enough. We have not a clear plan or idea for this.

@jonboulle
Copy link
Contributor

I think wking's position can be summarised as concern that we're not appropriately testing the json schema that we're shipping, and so it's quite possible (probable?) that whatever we call "1.0" actually has a divergence between the spec-as-markdown and spec-as-json-schema. Is that accurate?

@wking
Copy link
Contributor Author

wking commented Mar 10, 2017

Is that accurate?

Yes, and my concern is not just for the JSON Schema. Everything from this repo which is consumed by image-tools but not part of the Markdown spec is in the same boat.

For example, if the intra-blob validation doesn't grow a violated-SHOULD-warning API until 1.1, it will make it hard to warn/error on those when image-tools is validating a 1.0 image. You'd need one of the a/b/c approaches, or support for multiple versions in image-spec.

@xiekeyang
Copy link
Contributor

xiekeyang commented Mar 13, 2017

I remember we had discussed about the framework on validation, here I list them roughly, to help us for next improvement on image-spec and image-tools. Welcome forkers to add and correct.

  1. As previous suggestion, schema JSON had better to be dropped, instead to use golang directly. That mean schema.Validator should be reconstructed.
  2. Auto-detection of media type of JSON object should not be used any more. Consumer should point out the media they want to find out or validate.
  3. How to handle intra-blob validation, error or warn.

cc @stevvooe @wking @vbatts

@wking
Copy link
Contributor Author

wking commented Mar 13, 2017 via email

@stevvooe
Copy link
Contributor

This gets raised periodically, but I'm skeptical. JSON Schema is a
domain-specific language designed for exactly this purpose, and trying
to implement the same checks directly in Go seems like a lot more work
than it would be worth. And I don't think it impacts the release
process anyway, since the intra-blob validation API provided by
image-spec should be independent of the implementation (JSON Schema or
otherwise).

I think we have months of evidence proving the contrary. We have had JSON schema in place for some time and yet it has missed a number of edge cases. Here is an example where we have to implement secondary validation. I am not sure if we are using gojsonschema in a non-optimal manner or there are other problems but a solid piece of Go code could have shipped already.

It also looks like we either need to leverage `gojsonschema's error types or specify something on our own. The goal here would be to have the validator return all errors, then let the consumer decide which ones matter.

@xiekeyang
Copy link
Contributor

This is definitely going to impact the validation API. With you're
giving up on #452 and #403 being closed, I think we're just waiting on
the maintainers to show us what they want.

I close #452 partly because I know maintainers didn't achieve ideal agreement yet, partly because I feel it is not very graceful.

Actually we are mostly talking about validator especially on edge case. we almost agree it SHOULD be only for OCI image JSON objects, and MUST cover all of them, including each of intra items.

I had tried to use pure Go instead gojsonschema, but it is really so big effort. we have to implement secondary validation, maybe yes. But we should reform validator, if condition and function map looks a little ugly.

About validator API, we maybe need more input, such as reader, which kind of type (consumer should be able to input it directly), and mode for intra items(like customized or canonical).

It also looks like we either need to leverage `gojsonschema's error types or specify something on our own. The goal here would be to have the validator return all errors, then let the consumer decide which ones matter.

Agree.

@wking
Copy link
Contributor Author

wking commented Jul 26, 2017

On Thu, Mar 09, 2017 at 10:50:13AM -0800, W. Trevor King wrote:

Say we close this issue without establishing a policy. Then we cut 1.0. Then we find a bug in the JSON Schema, or the wrapping Go, or some other non-spec validation tooling. What happens next?

It's not documented in this repository (more on this in #713), but with 1.0.0, @vbatts excluded schema from semantic versioning:

While the schema and validation (./schema/) may continue to iterate and improve, this release specifically covers specification documents (pdf, html, or ./*.md) and reference golang source (./specs-go/).

That leaves (b) and (c) as the only tenable approaches. Whether (b) remains tenable depends on how robustly image-spec decides to support intra-blob validation in patch releases.

@sudo-bmitch
Copy link
Contributor

I'm looking through old issues to see what needs cleaning. Is this still an issue for anyone in the thread? If so, it would help to have a definition of what "inter-blob validation" means in this issue (perhaps I missed it, or it was in one of the links).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants