Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

validate media type of manifest descendants #341

Merged
merged 1 commit into from
Nov 1, 2016
Merged

validate media type of manifest descendants #341

merged 1 commit into from
Nov 1, 2016

Conversation

xiekeyang
Copy link
Contributor

This PR is alternate of opencontainers/image-tools#10.
As we have some suggestion that descendants media type should be checked by JSON schema, I put up this PR.
JSON schema constrains config and layer fields type in manifest file. If this is acceptable, we should then update the vendor to this in #image-tools.

Signed-off-by: xiekeyang xiekeyang@huawei.com

@wking
Copy link
Contributor

wking commented Sep 23, 2016

On Fri, Sep 23, 2016 at 01:09:39AM -0700, xiekeyang wrote:

A schema/config-descriptor.json (24)

Descriptors can stand on their own (e.g. as ref payloads), -so-this-
-should-be-descriptor-schema.json.- [edit: your current name is ok].

"properties": {
"mediaType": {
"type": "string",
"pattern": "^application/vnd[.]oci[.]image[.]config[.]v1[+]json$"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, descriptors can point at anything in general, but this is a descriptor that points at a config file. That makes sense to me.

@wking
Copy link
Contributor

wking commented Sep 23, 2016

On Fri, Sep 23, 2016 at 01:09:39AM -0700, xiekeyang wrote:

JSON schema constrains config and layer fields type in manifest
file. If this is acceptable, we should then update the vendor to
this in #image-tools.

I don't think we can handle this in JSON Schema. With 7cb3a30 this
PR successfully REQUIRES the media types recommended in #304 (still in
flight) for manifest descriptors. But #304 isn't suggesting that
requirement for manifest instances (which is what JSON Schema tests),
it's MUSTing implementations to support those types, and SHOULDing
manifests to use them. I think we want warnings, not errors, when a
config breaks that SHOULD [1], and I don't know of a way to
distinguish between warnings and errors in JSON Schema.

[1]; opencontainers/image-tools#10 (comment)

@xiekeyang
Copy link
Contributor Author

PTAL

@wking
Copy link
Contributor

wking commented Sep 29, 2016

On Fri, Sep 23, 2016 at 08:16:00AM -0700, W. Trevor King wrote:

I don't think we can handle this in JSON Schema. With 7cb3a30
this PR successfully REQUIRES the media types recommended in #304
(still in flight) for manifest descriptors. But #304 isn't
suggesting that requirement for manifest instances (which is what
JSON Schema tests), it's MUSTing implementations to support those
types, and SHOULDing manifests to use them. I think we want
warnings, not errors, when a config breaks that SHOULD 1, and I
don't know of a way to distinguish between warnings and errors in
JSON Schema.

#304 landed a few days ago 1, so the approach you're taking here
(strictly restricting images to those types) is officially too strong.
I think we need to go back to Go and use an option (--strict?
--portable?) to decide whether images that use features beyond the
minimum unpacker requirements are warnings or errors. I've mocked up
something along those lines (in the context of tar features) in [2](linked from #342).

@xiekeyang
Copy link
Contributor Author

@wking

I think we need to go back to Go and use an option (--strict?
--portable?) t

I suppose that you are meaning to modify schema.MediaTypeManifest.Validate(), internally, as @stevvooe suggestion opencontainers/image-tools#10 (comment). However, it seem to be a little inelegant to add a code section in it for manifest descendants case branch, like:

func (v Validator) Validate(src io.Reader) error { 

  ...
  if v==MediaTypeManifest{  //or switch-case here
    err:=validateDescendants(src)
    if err!=nil{
       return err
    }
  }
  ...

}

If I am right?

cc @stevvooe

@wking
Copy link
Contributor

wking commented Sep 30, 2016

On Thu, Sep 29, 2016 at 11:25:37PM -0700, xiekeyang wrote:

@wking

I think we need to go back to Go and use an option (--strict?
--portable?) t

I suppose that you are meaning to modify
schema.MediaTypeManifest.Validate(), internally, as @stevvooe
suggestion
opencontainers/image-tools#10 (comment).

I'm not convinced that the maintainer consensus is to keep more
validation than JSON Schema in this repo (some discussion in #337).
But yeah, the Go I'm suggesting should live wherever the maintainers
want the beyond-the-JSON-Schema Go validation code to live.

However, it seem to be a little inelegant to add a code section in
it for manifest descendants case branch…

I've mocked up a callback-based validation approach in that commit.
See 1, where I setup a map of validators for each supported media
type (very stubby at the moment, because my commit is currently based
on opencontainers/image-tools#5 and I don't want to keep rebasing this
callback validation if the ref/CAS interfaces float for another 3
months). Each media-type validator will check its blob (possibly
using JSON Schema as part of that check) and then recurse into another
Validate() call for any children. Of course, this is a problem for
validating spec examples (where there is no CAS engine from which to
retrieve the children), so I think the spec should stick to
JSON-Schema-only validation and not worry about any cases that doesn't
cover.

@xiekeyang
Copy link
Contributor Author

@wking
I fix this PR. As previous suggestion, I add OCI and Compatibility Matrix media types to media type constraints of descendants. And these constraints are implemented by Go.
And fix unit test according to latest media type spec.
PTAL.
cc @opencontainers/image-spec-maintainers

manifest: `{
"schemaVersion": 2,
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"config": {
"mediaType": "application/octet-stream",
"mediaType": "application/vnd.docker.container.image.v1+json",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this? The octet-stream type is from #145, where @philips made it sound like this manifest had been caught in the wild. If that's the case, we shouldn't be messing with the source.

@@ -137,12 +137,12 @@ func TestBackwardsCompatibilityManifest(t *testing.T) {
// "Accept: application/vnd.docker.distribution.manifest.v2+json" \
// https://registry-1.docker.io/v2/library/docker/manifests/sha256:888206c77cd2811ec47e752ba291e5b7734e3ef137dfd222daadaca39a9f17bc
{
digest: "sha256:888206c77cd2811ec47e752ba291e5b7734e3ef137dfd222daadaca39a9f17bc",
digest: "sha256:d901d5244c94db8151415bf3a1d3a0b10e7468e4997786b01c287cbcbaeca446",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to bump this, but if we do we should probably bump the comment above or explain why the hashes are now different.

MediaTypeDockerImageConfig = "application/vnd.docker.container.image.v1+json"

// MediaTypeNonDistributable specifies the non-distributable layer's mediaType.
MediaTypeNonDistributable = "application/vnd.oci.image.layer.nondistributable.v1.tar+gzip"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't think we want to export these as part of our public API (discussion in #320).

@@ -44,6 +50,13 @@ func (v Validator) Validate(src io.Reader) error {
return errors.Wrap(err, "unable to read the document file")
}

if f := mapValidateDescendants[v]; f != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idiomatic way to handle this is with:

f, ok := mapValidateDescendants[v]
if ok {
  …validate descendant…
}

Copy link
Contributor Author

@xiekeyang xiekeyang Oct 6, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wking ,
In the case of setting nil to some key, the checking by ok will have risk.
For that f is nil when ok is true, the next calling f will lead to panic.
So I think it is safe to check f is not nil.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine panicking if somebody inserts a nil value in mapValidateDescendants. I would be happier if Go's typing was strict enough to error with cannot convert nil to type validateDescendantsFunc when they attempted the insertion, but I don't see a way to do that if the zero value for function types is nil. To be extra safe and clear, I'd recommend handling the three cases explicitly:

f, ok := mapValidateDescendents[v]
if ok {
    if f == nil {
        …internal error: shame on you for putting a nil value in mapValidateDescendents…
    }
    …validate descendants…
}


func init() {
mapValidateDescendants[MediaTypeManifest] = validateManifestDescendants
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for init(), you can declare a package-level constant.

if layer.MediaType != string(v1.MediaTypeImageLayer) &&
layer.MediaType != MediaTypeDockerImageLayer &&
layer.MediaType != MediaTypeNonDistributable {
return fmt.Errorf("illegal layer mediaType: %s", layer.MediaType)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still too strong. You want to warn if the type is not an OCI type, but it's not an error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wking, here I've 2 questions:

  1. Do you think we should emit warn for all string not equal to OCI type, even for buzz words? It seems too loose.
  2. Current logger have no warn level. Do you agree that we import logrus here, to print different levels log?
    Thanks so much.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Wed, Oct 05, 2016 at 07:10:01PM -0700, xiekeyang wrote:

  1. Do you think we should emit warn for all string not equal to OCI type, even for buzz words? It seems too loose.

We're warning them, and they would be wise to take the warning seriously. I don't think we can do anything stronger by default; how do we know that the strange type they're using isn't handled by their tooling? But I think image-tools should have a --strict flag (for “error on any features not required in compliant unpackers”) and/or an --unpackable flag (for “error on any features I don't recognize”).

  1. Current logger have no warn level. Do you agree that we import logrus here, to print different levels log?

I think logging is orthogonal. These tests are the primary output of validation, so I think a validator should be printing the validation results (including warnings like this) to stdout (or some sort of writer). In my callback-based mock-up, I'm using tap.go, and in that case warnings would look like:

ok 7 # skip sha256-a… layers[2] has an unknown media type: image/png

@xiekeyang
Copy link
Contributor Author

@wking
Fixed it according to your suggestion. PTAL.

@@ -44,6 +52,13 @@ func (v Validator) Validate(src io.Reader) error {
return errors.Wrap(err, "unable to read the document file")
}

if f, ok := mapValidateDescendants[v]; ok && f != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no reason to check both ok and nil if you're silently ignoring the nil case. I think you should either skip the nil check and pacnic if someone stuck a nil value in mapValidateDescendants or explicitly check for and error out if someone stuck a nil value in mapValidateDescendants

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix it.

}

if header.Config.MediaType != string(v1.MediaTypeImageConfig) {
fmt.Printf("warning: config %s has a non OCI media type: %s\n", header.Config.Digest, header.Config.MediaType)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem isn't that it has a non-OCI type (application/vnd.oci.image.manifest.list.v1+json is an OCI type, and it deserves this warning). My suggested wording is:

{blobDigest} {property} has an unknown media type: {type}

That's a bit complicated with your APIs, because you take a reader and not a descriptor or digest. In my mock-up, the validator takes a digest and CAS engine, so it's easy to make messages like that. But it makes sense to want a way to validate a reader without having to inject it into a CAS engine, and for that initial reader-based validation I think the message should just be:

{blobDigest} {property} has an unknown media type: {type}

For this line here, that's:

fmt.Println("warning: config has an unknown media type: %s", header.Config.MediaType)

}

for _, layer := range header.Layers {
if layer.MediaType != string(v1.MediaTypeImageLayer) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix it.

type validateDescendantsFunc func(r io.Reader) error

var mapValidateDescendants = map[Validator]validateDescendantsFunc{
MediaTypeManifest: validateManifestDescendants,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently just looking at the types of the descriptors. It doesn't follow those references to actually validate the descendants (e.g. “the manifest claims sha256:a… is a application/vnd.oci.image.config.v1+json. Does it validate as such?). I think we eventually want this callback-based validation to handle that, but I don't see an easy way to do it without a CAS-engine API (once you have that it is pretty easy), so I'm fine punting on that for this PR and only doing these local checks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wking
At beginning I work this in image-tool project, but seems improper.
In schema package, validateManifestDescendants acting as the validator.Validate child is what I think the nice way.
Callback handler must impact on up-layer calling in image-tool project, to which I worry maintainers not fully agree.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that you wouldn't want all of recursive validation, all recursive validation code in image-spec, and CAS-engine code in image-tools. My personal preference there is to do non-recursive JSON Schema validation here and to handle recursive validation and the CAS-engine in image-tools. @stevvooe seems to prefer keeping more than just JSON Schema validation in image-spec, but @philips has moved a fair bit of stuff to image-tools, so I'm not sure there's a clear maintainer position on this yet. Once there is a clear position, it should be fairly straightforward to put this logic in the approved location.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is simply validating mediatypes. It doesn't actually validate the descendent objects. This style of validation clearly belongs within the image-spec repo.

The maintainer position is clear and we have discussed on the call multiple times, without dissent.

Please stop fomenting disagreement where there is none.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please stop fomenting disagreement where there is none.

The maintainer position has been clear on this since the 2016-10-12 meeting. I don't think it was particular clear when I made these comments on 2016-10-05.

@@ -44,6 +52,16 @@ func (v Validator) Validate(src io.Reader) error {
return errors.Wrap(err, "unable to read the document file")
}

if f, ok := mapValidateDescendants[v]; ok {
if f == nil {
return fmt.Errorf("internal error: shame on you for putting a nil value in mapValidateDescendents")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is taking me too literally ;). How about:

return fmt.Errorf("internal error: mapValidateDescendents[%q] is nil", v)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, the message was rough. Fix it.

@wking
Copy link
Contributor

wking commented Oct 6, 2016

I have a few minor quibbles with the error messages, but I think
3df641c is close enough to be worth landing (assuming that maintainers
want more than JSON Schema in image-spec 1). We can talk about
polishing the error messages 2, TAP 3, recursive validation 4,
etc. in follow-up PRs.

Signed-off-by: xiekeyang <xiekeyang@huawei.com>
@xiekeyang
Copy link
Contributor Author

@opencontainers/image-spec-maintainers
Hopefully you can put up some proposal to move this PR forward.

@vbatts
Copy link
Member

vbatts commented Oct 6, 2016

So, the objective here is to descend into a manifest to validate the other objects, right?
What is expected? When a known media-type is specified and an object of the matching checksum is found, that the object is of the expected media-type. Right?
What is not expected, is if there is an object of unknown media-type found, that the test OUGHT to not fail, because it just does not know how to validate unknown objects.

Are we in agreement up to this point?

@wking
Copy link
Contributor

wking commented Oct 6, 2016

On Thu, Oct 06, 2016 at 07:09:05AM -0700, Vincent Batts wrote:

So, the objective here is to descend into a manifest to validate the
other objects, right?

This PR doesn't descend into children, because you can't do that
without a CAS engine to fetch them by digest 1. We'll want that
eventually (but probably not in this repo 2). As it stands with
dd4cbf1, this PR just checks #304 on the blob itself (checking the
media types the blob's descriptors claim for the children, but not
actually looking at the child blobs).

What is expected? When a known media-type is specified and an
object of the matching checksum is found, that the object is of the
expected media-type. Right?

That would be nice, but currently this PR just checks to see if the
referencing descriptor has one of the media types that #304 requires
unpackers to support.

What is not expected, is if there is an object of unknown
media-type found, that the test OUGHT to not fail, because it just
does not know how to validate unknown objects.

We're not validating the children yet, known media type or not. But
dd4cbf1 (appropriately, I think) warns on media types that unpackers
are not required to support. And a future PR could add an
--unpackable or --strict flag to turn those warnings into errors.

@xiekeyang
Copy link
Contributor Author

@vbatts
As @wking explained, this PR is to fix the insufficient point that children in manifest was not validated.
It has been token so many challenge as return error or just print error, place it in image tool or image spec, take into Compatibility Matrix or not ...

I've tried to understand maintainers proposal, and improve this PR. Now hopefully it can be accepted.

@xiekeyang
Copy link
Contributor Author

@stevvooe If this is the implementation way what you want?

@stevvooe
Copy link
Contributor

stevvooe commented Oct 19, 2016

@xiekeyang This seems sufficient. It is just making sure the media types are valid. It is too bad that json schema can't do this.

LGTM

Approved with PullApprove

wking added a commit to wking/image-spec that referenced this pull request Oct 20, 2016
With image-tools split off into its own repository, the plan seems to
be to keep all intra-blob JSON validation in this repository and to
move all other validation (e.g. for layers or for walking Merkle
trees) in image-tools [1].  All the non-validation logic currently in
image/ is moving into image-tools as well [2].

Some requirements (e.g. multi-parameter checks like allowed OS/arch
pairs [3]) are difficult to handle in JSON Schema but easy to handle
in Go.  And callers won't care if we're using JSON Schema or not; they
just want to know if their blob is valid.

This commit restructures intra-blob validation to ease the path going
forward (although it doesn't actually change the current validation
significantly).  The old method:

  func (v Validator) Validate(src io.Reader) error

is now a new Validator type:

  type Validator(blob io.Reader, descriptor *v1.Descriptor, strict bool) (err error)

and instead of instantiating an old Validator instance:

  schema.MediaTypeImageConfig.Validate(reader)

there's a Validators registry mapping from the media type strings to
the appropriate Validator instance (which may or may not use JSON
Schema under the hood).  And there's a Validate function (with the
same Validator interface) that looks up the appropriate entry in
Validators for you so you have:

  schema.Validate(reader, descriptor, true)

By using a Validators map, we make it easy for library consumers to
register (or override) intra-blob validators for a particular type.
Locations that call Validate(...) will automatically pick up the new
validators without needing local changes.

All of the old validation was based on JSON Schema, so currently all
Validators values are ValidateJSONSchema.  As the schema package grows
non-JSON-Schema validation, entries will start to look like:

  var Validators = map[string]Validator{
    v1.MediaTypeImageConfig: ValidateConfig,
    ...
  }

although ValidateConfig will probably use ValidateJSONSchema
internally.

By passing through a descriptor, we get a chance to validate the
digest and size (which we were not doing before).  Digest and size
validation for a byte array are also exposed directly (as
ValidateByteDigest and ValidateByteSize) for use in validators that
are not based on ValidateJSONSchema.  Access to the digest also gives
us a way to print specific error messages on failures.  In situations
where you don't know the blob digest, the new DigestByte will help you
calculate it (for a byte array).

There is also a new 'strict' parameter to distinguish between
compliant images (which should only pass when strict is false) and
images that only use features which the spec requires implementations
to support (which should pass regardless of strict).  The current JSON
Schemas are not strict, and I expect we'll soon gain Go code to handle
the distinction (e.g. [4]).  So the presence of 'strict' in the
Validator type is future-proofing our API and not exposing a
currently-implemented feature.

I've made the minimal sane changes to cmd/ and image/, because we're
dropping them from this repository [2] (and continuing them in
runtime-tools).

[1]: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/2016/opencontainers.2016-10-12-21.01.log.html#l-71
[2]: opencontainers#337
[3]: https://tools.ietf.org/html/draft-fge-json-schema-validation-00#section-5.5.5
[4]: opencontainers#341

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-spec that referenced this pull request Oct 20, 2016
With image-tools split off into its own repository, the plan seems to
be to keep all intra-blob JSON validation in this repository and to
move all other validation (e.g. for layers or for walking Merkle
trees) in image-tools [1].  All the non-validation logic currently in
image/ is moving into image-tools as well [2].

Some requirements (e.g. multi-parameter checks like allowed OS/arch
pairs [3]) are difficult to handle in JSON Schema but easy to handle
in Go.  And callers won't care if we're using JSON Schema or not; they
just want to know if their blob is valid.

This commit restructures intra-blob validation to ease the path going
forward (although it doesn't actually change the current validation
significantly).  The old method:

  func (v Validator) Validate(src io.Reader) error

is now a new Validator type:

  type Validator(blob io.Reader, descriptor *v1.Descriptor, strict bool) (err error)

and instead of instantiating an old Validator instance:

  schema.MediaTypeImageConfig.Validate(reader)

there's a Validators registry mapping from the media type strings to
the appropriate Validator instance (which may or may not use JSON
Schema under the hood).  And there's a Validate function (with the
same Validator interface) that looks up the appropriate entry in
Validators for you so you have:

  schema.Validate(reader, descriptor, true)

By using a Validators map, we make it easy for library consumers to
register (or override) intra-blob validators for a particular type.
Locations that call Validate(...) will automatically pick up the new
validators without needing local changes.

All of the old validation was based on JSON Schema, so currently all
Validators values are ValidateJSONSchema.  As the schema package grows
non-JSON-Schema validation, entries will start to look like:

  var Validators = map[string]Validator{
    v1.MediaTypeImageConfig: ValidateConfig,
    ...
  }

although ValidateConfig will probably use ValidateJSONSchema
internally.

By passing through a descriptor, we get a chance to validate the
digest and size (which we were not doing before).  Digest and size
validation for a byte array are also exposed directly (as
ValidateByteDigest and ValidateByteSize) for use in validators that
are not based on ValidateJSONSchema.  Access to the digest also gives
us a way to print specific error messages on failures.  In situations
where you don't know the blob digest, the new DigestByte will help you
calculate it (for a byte array).

There is also a new 'strict' parameter to distinguish between
compliant images (which should only pass when strict is false) and
images that only use features which the spec requires implementations
to support (which should pass regardless of strict).  The current JSON
Schemas are not strict, and I expect we'll soon gain Go code to handle
the distinction (e.g. [4]).  So the presence of 'strict' in the
Validator type is future-proofing our API and not exposing a
currently-implemented feature.

I've made the minimal sane changes to cmd/ and image/, because we're
dropping them from this repository [2] (and continuing them in
runtime-tools).

[1]: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/2016/opencontainers.2016-10-12-21.01.log.html#l-71
[2]: opencontainers#337
[3]: https://tools.ietf.org/html/draft-fge-json-schema-validation-00#section-5.5.5
[4]: opencontainers#341

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-spec that referenced this pull request Oct 20, 2016
With image-tools split off into its own repository, the plan seems to
be to keep all intra-blob JSON validation in this repository and to
move all other validation (e.g. for layers or for walking Merkle
trees) in image-tools [1].  All the non-validation logic currently in
image/ is moving into image-tools as well [2].

Some requirements (e.g. multi-parameter checks like allowed OS/arch
pairs [3]) are difficult to handle in JSON Schema but easy to handle
in Go.  And callers won't care if we're using JSON Schema or not; they
just want to know if their blob is valid.

This commit restructures intra-blob validation to ease the path going
forward (although it doesn't actually change the current validation
significantly).  The old method:

  func (v Validator) Validate(src io.Reader) error

is now a new Validator type:

  type Validator(blob io.Reader, descriptor *v1.Descriptor, strict bool) (err error)

and instead of instantiating an old Validator instance:

  schema.MediaTypeImageConfig.Validate(reader)

there's a Validators registry mapping from the media type strings to
the appropriate Validator instance (which may or may not use JSON
Schema under the hood).  And there's a Validate function (with the
same Validator interface) that looks up the appropriate entry in
Validators for you so you have:

  schema.Validate(reader, descriptor, true)

By using a Validators map, we make it easy for library consumers to
register (or override) intra-blob validators for a particular type.
Locations that call Validate(...) will automatically pick up the new
validators without needing local changes.

All of the old validation was based on JSON Schema, so currently all
Validators values are ValidateJSONSchema.  As the schema package grows
non-JSON-Schema validation, entries will start to look like:

  var Validators = map[string]Validator{
    v1.MediaTypeImageConfig: ValidateConfig,
    ...
  }

although ValidateConfig will probably use ValidateJSONSchema
internally.

By passing through a descriptor, we get a chance to validate the
digest and size (which we were not doing before).  Digest and size
validation for a byte array are also exposed directly (as
ValidateByteDigest and ValidateByteSize) for use in validators that
are not based on ValidateJSONSchema.  Access to the digest also gives
us a way to print specific error messages on failures.  In situations
where you don't know the blob digest, the new DigestByte will help you
calculate it (for a byte array).

There is also a new 'strict' parameter to distinguish between
compliant images (which should only pass when strict is false) and
images that only use features which the spec requires implementations
to support (which should pass regardless of strict).  The current JSON
Schemas are not strict, and I expect we'll soon gain Go code to handle
the distinction (e.g. [4]).  So the presence of 'strict' in the
Validator type is future-proofing our API and not exposing a
currently-implemented feature.

I've made the minimal sane changes to cmd/ and image/, because we're
dropping them from this repository [2] (and continuing them in
runtime-tools).

[1]: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/2016/opencontainers.2016-10-12-21.01.log.html#l-71
[2]: opencontainers#337
[3]: https://tools.ietf.org/html/draft-fge-json-schema-validation-00#section-5.5.5
[4]: opencontainers#341

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-spec that referenced this pull request Oct 21, 2016
With image-tools split off into its own repository, the plan seems to
be to keep all intra-blob JSON validation in this repository and to
move all other validation (e.g. for layers or for walking Merkle
trees) in image-tools [1].  All the non-validation logic currently in
image/ is moving into image-tools as well [2].

Some requirements (e.g. multi-parameter checks like allowed OS/arch
pairs [3]) are difficult to handle in JSON Schema but easy to handle
in Go.  And callers won't care if we're using JSON Schema or not; they
just want to know if their blob is valid.

This commit restructures intra-blob validation to ease the path going
forward (although it doesn't actually change the current validation
significantly).  The old method:

  func (v Validator) Validate(src io.Reader) error

is now a new Validator type:

  type Validator(blob io.Reader, descriptor *v1.Descriptor, strict bool) (err error)

and instead of instantiating an old Validator instance:

  schema.MediaTypeImageConfig.Validate(reader)

there's a Validators registry mapping from the media type strings to
the appropriate Validator instance (which may or may not use JSON
Schema under the hood).  And there's a Validate function (with the
same Validator interface) that looks up the appropriate entry in
Validators for you so you have:

  schema.Validate(reader, descriptor, true)

By using a Validators map, we make it easy for library consumers to
register (or override) intra-blob validators for a particular type.
Locations that call Validate(...) will automatically pick up the new
validators without needing local changes.

All of the old validation was based on JSON Schema, so currently all
Validators values are ValidateJSONSchema.  As the schema package grows
non-JSON-Schema validation, entries will start to look like:

  var Validators = map[string]Validator{
    v1.MediaTypeImageConfig: ValidateConfig,
    ...
  }

although ValidateConfig will probably use ValidateJSONSchema
internally.

By passing through a descriptor, we get a chance to validate the
digest and size (which we were not doing before).  Digest and size
validation for a byte array are also exposed directly (as
ValidateByteDigest and ValidateByteSize) for use in validators that
are not based on ValidateJSONSchema.  Access to the digest also gives
us a way to print specific error messages on failures.  In situations
where you don't know the blob digest, the new DigestByte will help you
calculate it (for a byte array).

There is also a new 'strict' parameter to distinguish between
compliant images (which should only pass when strict is false) and
images that only use features which the spec requires implementations
to support (which should pass regardless of strict).  The current JSON
Schemas are not strict, and I expect we'll soon gain Go code to handle
the distinction (e.g. [4]).  So the presence of 'strict' in the
Validator type is future-proofing our API and not exposing a
currently-implemented feature.

[1]: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/2016/opencontainers.2016-10-12-21.01.log.html#l-71
[2]: opencontainers#337
[3]: https://tools.ietf.org/html/draft-fge-json-schema-validation-00#section-5.5.5
[4]: opencontainers#341

Signed-off-by: W. Trevor King <wking@tremily.us>
@vbatts
Copy link
Member

vbatts commented Nov 1, 2016

LGTM

Approved with PullApprove

@vbatts vbatts merged commit 719b903 into opencontainers:master Nov 1, 2016
wking added a commit to wking/image-spec that referenced this pull request Nov 1, 2016
With image-tools split off into its own repository, the plan seems to
be to keep all intra-blob JSON validation in this repository and to
move all other validation (e.g. for layers or for walking Merkle
trees) in image-tools [1].  All the non-validation logic currently in
image/ is moving into image-tools as well [2].

Some requirements (e.g. multi-parameter checks like allowed OS/arch
pairs [3]) are difficult to handle in JSON Schema but easy to handle
in Go.  And callers won't care if we're using JSON Schema or not; they
just want to know if their blob is valid.

This commit restructures intra-blob validation to ease the path going
forward (although it doesn't actually change the current validation
significantly).  The old method:

  func (v Validator) Validate(src io.Reader) error

is now a new Validator type:

  type Validator(blob io.Reader, descriptor *v1.Descriptor, strict bool) (err error)

and instead of instantiating an old Validator instance:

  schema.MediaTypeImageConfig.Validate(reader)

there's a Validators registry mapping from the media type strings to
the appropriate Validator instance (which may or may not use JSON
Schema under the hood).  And there's a Validate function (with the
same Validator interface) that looks up the appropriate entry in
Validators for you so you have:

  schema.Validate(reader, descriptor, true)

By using a Validators map, we make it easy for library consumers to
register (or override) intra-blob validators for a particular type.
Locations that call Validate(...) will automatically pick up the new
validators without needing local changes.

All of the old validation was based on JSON Schema, so currently all
Validators values are ValidateJSONSchema.  As the schema package grows
non-JSON-Schema validation, entries will start to look like:

  var Validators = map[string]Validator{
    v1.MediaTypeImageConfig: ValidateConfig,
    ...
  }

although ValidateConfig will probably use ValidateJSONSchema
internally.

By passing through a descriptor, we get a chance to validate the
digest and size (which we were not doing before).  Digest and size
validation for a byte array are also exposed directly (as
ValidateByteDigest and ValidateByteSize) for use in validators that
are not based on ValidateJSONSchema.  Access to the digest also gives
us a way to print specific error messages on failures.  In situations
where you don't know the blob digest, the new DigestByte will help you
calculate it (for a byte array).

There is also a new 'strict' parameter to distinguish between
compliant images (which should always pass when strict is false) and
images that only use features which the spec requires implementations
to support (which should only pass if strict is true).  The current
JSON Schemas are not strict, but the config/layer media type checks in
ValidateManifest exercise this distinction.

[1]: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/2016/opencontainers.2016-10-12-21.01.log.html#l-71
[2]: opencontainers#337
[3]: https://tools.ietf.org/html/draft-fge-json-schema-validation-00#section-5.5.5
[4]: opencontainers#341

Signed-off-by: W. Trevor King <wking@tremily.us>
@xiekeyang xiekeyang deleted the med-val branch November 4, 2016 06:44
wking added a commit to wking/image-spec that referenced this pull request Jan 19, 2017
With image-tools split off into its own repository, the plan seems to
be to keep all intra-blob JSON validation in this repository and to
move all other validation (e.g. for layers or for walking Merkle
trees) in image-tools [1].  All the non-validation logic currently in
image/ is moving into image-tools as well [2].

Some requirements (e.g. multi-parameter checks like allowed OS/arch
pairs [3]) are difficult to handle in JSON Schema but easy to handle
in Go.  And callers won't care if we're using JSON Schema or not; they
just want to know if their blob is valid.

This commit restructures intra-blob validation to ease the path going
forward (although it doesn't actually change the current validation
significantly).  The old method:

  func (v Validator) Validate(src io.Reader) error

is now a new Validator type:

  type Validator(blob io.Reader, descriptor *v1.Descriptor, strict bool) (err error)

and instead of instantiating an old Validator instance:

  schema.MediaTypeImageConfig.Validate(reader)

there's a Validators registry mapping from the media type strings to
the appropriate Validator instance (which may or may not use JSON
Schema under the hood).  And there's a Validate function (with the
same Validator interface) that looks up the appropriate entry in
Validators for you so you have:

  schema.Validate(reader, descriptor, true)

By using a Validators map, we make it easy for library consumers to
register (or override) intra-blob validators for a particular type.
Locations that call Validate(...) will automatically pick up the new
validators without needing local changes.

All of the old validation was based on JSON Schema, so currently all
Validators values are ValidateJSONSchema.  As the schema package grows
non-JSON-Schema validation, entries will start to look like:

  var Validators = map[string]Validator{
    v1.MediaTypeImageConfig: ValidateConfig,
    ...
  }

although ValidateConfig will probably use ValidateJSONSchema
internally.

By passing through a descriptor, we get a chance to validate the
digest and size (which we were not doing before).  Digest and size
validation for a byte array are also exposed directly (as
ValidateByteDigest and ValidateByteSize) for use in validators that
are not based on ValidateJSONSchema.  Access to the digest also gives
us a way to print specific error messages on failures.

There is also a new 'strict' parameter to distinguish between
compliant images (which should always pass when strict is false) and
images that only use features which the spec requires implementations
to support (which should only pass if strict is true).  The current
JSON Schemas are not strict, but the config/layer media type checks in
ValidateManifest exercise this distinction.

Also use go-digest for local hashing now that we're vendoring it.

[1]: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/2016/opencontainers.2016-10-12-21.01.log.html#l-71
[2]: opencontainers#337
[3]: https://tools.ietf.org/html/draft-fge-json-schema-validation-00#section-5.5.5
[4]: opencontainers#341

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-spec that referenced this pull request Jan 22, 2017
With image-tools split off into its own repository, the plan seems to
be to keep all intra-blob JSON validation in this repository and to
move all other validation (e.g. for layers or for walking Merkle
trees) in image-tools [1].  All the non-validation logic currently in
image/ is moving into image-tools as well [2].

Some requirements (e.g. multi-parameter checks like allowed OS/arch
pairs [3]) are difficult to handle in JSON Schema but easy to handle
in Go.  And callers won't care if we're using JSON Schema or not; they
just want to know if their blob is valid.

This commit restructures intra-blob validation to ease the path going
forward (although it doesn't actually change the current validation
significantly).  The old method:

  func (v Validator) Validate(src io.Reader) error

is now a new Validator type:

  type Validator(blob io.Reader, descriptor *v1.Descriptor, strict bool) (err error)

and instead of instantiating an old Validator instance:

  schema.MediaTypeImageConfig.Validate(reader)

there's a Validators registry mapping from the media type strings to
the appropriate Validator instance (which may or may not use JSON
Schema under the hood).  And there's a Validate function (with the
same Validator interface) that looks up the appropriate entry in
Validators for you so you have:

  schema.Validate(reader, descriptor, true)

By using a Validators map, we make it easy for library consumers to
register (or override) intra-blob validators for a particular type.
Locations that call Validate(...) will automatically pick up the new
validators without needing local changes.

All of the old validation was based on JSON Schema, so currently all
Validators values are ValidateJSONSchema.  As the schema package grows
non-JSON-Schema validation, entries will start to look like:

  var Validators = map[string]Validator{
    v1.MediaTypeImageConfig: ValidateConfig,
    ...
  }

although ValidateConfig will probably use ValidateJSONSchema
internally.

By passing through a descriptor, we get a chance to validate the
digest and size (which we were not doing before).  Digest and size
validation for a byte array are also exposed directly (as
ValidateByteDigest and ValidateByteSize) for use in validators that
are not based on ValidateJSONSchema.  Access to the digest also gives
us a way to print specific error messages on failures.

There is also a new 'strict' parameter to distinguish between
compliant images (which should always pass when strict is false) and
images that only use features which the spec requires implementations
to support (which should only pass if strict is true).  The current
JSON Schemas are not strict, but the config/layer media type checks in
ValidateManifest exercise this distinction.

Also use go-digest for local hashing now that we're vendoring it.

[1]: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/2016/opencontainers.2016-10-12-21.01.log.html#l-71
[2]: opencontainers#337
[3]: https://tools.ietf.org/html/draft-fge-json-schema-validation-00#section-5.5.5
[4]: opencontainers#341

Signed-off-by: W. Trevor King <wking@tremily.us>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants