Extract vocabularies from the specs #1510
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,7 +6,7 @@ The current approach to extending JSON Schema by providing custom keywords is | |
very implementation-specific and therefore not interoperable. | ||
|
||
To address this deficiency, this document proposes vocabularies as a concept | ||
and a new Core keyword, `$vocabulary` to support it. | ||
and a new Core keyword, `$vocabulary`, to support it. | ||
|
||
While the Core specification will define and describe vocabularies in general, | ||
the Validation specification will also need to change to incorporate some of | ||
|
@@ -16,9 +16,9 @@ in both documents. | |
## Current Status | ||
|
||
This proposal was originally integrated into both specifications, starting with | ||
the 2019-09 release, and has been extracted as the feature is incomplete. The | ||
feature, at best effort, was extracted in such a way as to retain the | ||
functionality present in the 2020-12 release. | ||
the 2019-09 release. For the upcoming stable release, the feature has been | ||
extracted as it is incomplete. The feature, at best effort, was extracted in | ||
such a way as to retain the functionality present in the 2020-12 release. | ||
|
||
Trying to fit the 2020-12 version into the current specification, however, | ||
raises some problems, and further discussion around the design of | ||
|
@@ -45,28 +45,191 @@ also apply to this document. | |
|
||
### Problem Statement | ||
|
||
The specification allows implementations to support user-defined keywords. | ||
However, this vague and open allowance has drawbacks. | ||
To support extensibility, the specification allows implementations to support | ||
keywords that are not defined in the specifications themselves. However, this | ||
vague and open allowance has drawbacks. | ||
|
||
1. This isn't a requirement, it is a permission. An implementation could just as | ||
easily (_more_ easily) choose _not_ to support user-defined keywords. | ||
1. Such support is not a requirement; it is a permission. An implementation | ||
could just as easily (_more_ easily) choose _not_ to support extension | ||
keywords. | ||
2. There is no prescribed mechanism by which an implementation should provide | ||
this support. As a result, each implementation that _does_ have the feature | ||
supports it in different ways. | ||
3. Support for any given user-defined keyword will be limited to that | ||
implementation. Unless the user explicitly configures another | ||
implementation, their keywords likely will not be supported. | ||
3. Support for any given user-defined keyword will be limited to the | ||
implementations which are explicitly configured for that keyword. For a user | ||
defining their own keyword, this becomes difficult and/or impossible | ||
depending on the varying support for extension keywords offered by the | ||
implementations the user is using. | ||
|
||
This exposes a need for the specification(s) to define a way for implementations | ||
to share knowledge of a keyword or group of keywords. | ||
This exposes a need for an implementation-agnostic approach to | ||
externally-defined keywords as well as a way for implementations to declare | ||
support for them. | ||
|
||
### Solution | ||
|
||
<!-- What is the solution? Include examples of use. --> | ||
Two new concepts, vocabularies and dialects, will be introduced into the Core | ||
specification. | ||
|
||
A vocabulary is identified by an absolute URI and is used to define a set of | ||
keywords. A vocabulary is generally defined in a human-readable _vocabulary | ||
description document_. (The URI for the vocabulary may be the same as the URL of | ||
where this vocabulary description document can be found, but no recommendation | ||
is made either for or against this practice.) | ||
|
||
A new keyword, `$vocabulary`, will be introduced into the Core specification as | ||
well. This keyword's value is an object with vocabulary URIs as keys and | ||
booleans as values. This keyword only has meaning within a meta-schema. A | ||
meta-schema which includes a vocabulary's URI in its `$vocabulary` keyword is | ||
said to "include" that vocabulary. | ||
|
||
```jsonc | ||
{ | ||
"$schema": "https://example.org/draft/next/schema", | ||
"$id": "https://example.org/schema", | ||
"$vocabulary": { | ||
"https://example.org/vocab/vocab1": true, | ||
"https://example.org/vocab/vocab2": true, | ||
"https://example.org/vocab/vocab3": false | ||
}, | ||
// ... | ||
} | ||
``` | ||
|
||
A dialect is the set of vocabularies listed by a meta-schema. It is ephemeral | ||
|
||
and carries no identifier. | ||
|
||
|
||
_**NOTE** It is possible for two meta-schemas, which would have different `$id` | ||
values, to share a common dialect if they both declare the same set of | ||
vocabularies._ | ||
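A minimal sketch of the note above, assuming that a dialect is compared purely by the set of vocabulary URIs and that the boolean flags are ignored (the text does not say either way). The function names are hypothetical.

```python
from __future__ import annotations


def dialect_of(meta_schema: dict) -> frozenset[str]:
    """Return the set of vocabulary URIs a meta-schema lists in `$vocabulary`."""
    # Assumption: only the URIs matter; the true/false flags are not part of the dialect.
    return frozenset(meta_schema.get("$vocabulary", {}))


def share_dialect(meta_a: dict, meta_b: dict) -> bool:
    """Two meta-schemas share a dialect when they list the same vocabularies."""
    return dialect_of(meta_a) == dialect_of(meta_b)


meta_a = {
    "$id": "https://example.org/schema-a",
    "$vocabulary": {"https://example.org/vocab/vocab1": True},
}
meta_b = {
    "$id": "https://example.org/schema-b",
    "$vocabulary": {"https://example.org/vocab/vocab1": True},
}
same = share_dialect(meta_a, meta_b)  # different `$id` values, same vocabularies
```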
|
||
|
||
A schema that declares a meta-schema (via `$schema`) which contains | ||
`$vocabulary` is declaring that only those keywords defined by the included | ||
vocabularies are to be processed when evaluating the schema. All other keywords | ||
are to be considered "unknown" and handled accordingly. | ||
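As a rough illustration of the paragraph above, the sketch below splits the keywords of one schema object into those that may be processed and those that are "unknown". The `VOCAB_KEYWORDS` registry, its URIs, and the keyword assignments are hypothetical; the proposal does not define such a registry.

```python
from __future__ import annotations

# Hypothetical registry: vocabulary URI -> keywords that vocabulary defines.
VOCAB_KEYWORDS = {
    "https://example.org/vocab/vocab1": {"$id", "$ref"},
    "https://example.org/vocab/vocab2": {"type", "minimum"},
}


def split_keywords(schema: dict, meta_schema: dict) -> tuple[set[str], set[str]]:
    """Return (processable, unknown) keyword names for one schema object.

    Assumes the meta-schema contains `$vocabulary`; the text above only
    describes that case.
    """
    allowed: set[str] = set()
    for uri in meta_schema.get("$vocabulary", {}):
        # A vocabulary the implementation does not know contributes no keywords.
        allowed |= VOCAB_KEYWORDS.get(uri, set())
    present = set(schema)
    return present & allowed, present - allowed


meta = {"$vocabulary": {"https://example.org/vocab/vocab2": True}}
processable, unknown = split_keywords({"type": "integer", "minDate": "2024-01-01"}, meta)
```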
|
||
The boolean values in `$vocabulary` signify implementation requirements for each | ||
vocabulary. | ||
|
||
- A `true` value indicates that the implementation must recognize the vocabulary | ||
and be able to process each of the keywords defined in it. If an implementation | ||
|
||
does not recognize the vocabulary or cannot process all of its defined | ||
keywords, the implementation must refuse to process the schema. These | ||
vocabularies are also known as "required" vocabularies. | ||
- A `false` value indicates that the implementation is not required to recognize | ||
the vocabulary or its keywords and may continue processing the schema anyway. | ||
However, keywords that are not recognized or supported must be considered | ||
"unknown" and handled accordingly. These vocabularies are also known as | ||
"optional" vocabularies. | ||
|
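The two rules above could be sketched as follows. This is only an illustration under stated assumptions: `supported` stands for whatever set of vocabulary URIs an implementation recognizes, and the function and exception names are hypothetical.

```python
from __future__ import annotations


class UnsupportedVocabularyError(Exception):
    """Raised when a required vocabulary is not recognized."""


def usable_vocabularies(vocabulary: dict[str, bool], supported: set[str]) -> set[str]:
    """Return the vocabularies whose keywords may be processed.

    Raises UnsupportedVocabularyError for an unrecognized required vocabulary.
    """
    usable: set[str] = set()
    for uri, required in vocabulary.items():
        if uri in supported:
            usable.add(uri)
        elif required:
            # `true`: the implementation must refuse to process the schema.
            raise UnsupportedVocabularyError(uri)
        # `false` and unrecognized: continue; its keywords are treated as unknown.
    return usable


vocabulary = {
    "https://example.org/vocab/vocab1": True,
    "https://example.org/vocab/vocab3": False,
}
usable = usable_vocabularies(vocabulary, {"https://example.org/vocab/vocab1"})
```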
||
Typically, though it is not required, a schema will accompany the vocabulary | ||
description document. This _vocabulary schema_ should carry an `$id` value which | ||
is distinct from the vocabulary URI. The purpose of the vocabulary schema is to | ||
provide syntactic validation for the values of the vocabulary's keywords when the | ||
schema is being validated by a meta-schema that includes the vocabulary. (A | ||
vocabulary schema is not itself a meta-schema since it does not validate entire | ||
schemas.) To facilitate this extra validation, when a vocabulary schema is | ||
Comment on lines +129 to +131

We've always defined a meta-schema to be a schema that describes a schema. These vocabulary schemas do fit that definition. I understand what you're trying to say here, but I don't think saying it's not a meta-schema is the right approach. Didn't Henry update the spec at some point with a way to describe this behavior without having to say it's ignored or it's not a meta-schema?

He didn't update the spec or anything. This actually came out of reviewing @jviotti's book and trying to rework my vocab schemas. I wrote about it here, and we pulled out They're not meta-schemas because they don't themselves describe full schemas; they are used by meta-schemas as components.

I don't think we're quite talking about the same thing. I entirely agree that vocabulary meta-schemas shouldn't include the
This distinction is what doesn't sit right with me. The way I see it, a component of a meta-schema is still a meta-schema. Section 8.1.2.2 is what I thought you were referring to here. Is that correct? It does seem like the wording there is not considering schemas referenced by a meta-schema a meta-schema, but that's never how I've understood the word or how we've defined the word. I've always used the terms dialect meta-schema and vocabulary meta-schema. The

No, I wasn't referring to that section. I'm looking at Core 4.3.4 which defines "meta-schema":
A vocabulary schema doesn't describe a schema, therefore it's not a meta-schema. You wouldn't use the Meta-data vocab schema as a meta-schema; you reference it from a meta-schema.

Interesting. We're using the same definition, but interpreting it differently. The way I see it, a vocabulary schema is validating a schema. It validates the syntax of keywords in a schema. I don't think the definition implies that a schema is only a meta-schema if it describes the entire dialect used by the schema.

I'm not sure what you're leading toward. Are you questioning the current practice of defining a vocab schema? Or are you saying that we should have a vocab schema that can function as a meta-schema?

I 100% agree that we need to explain this better and using more descriptive terms is a big part of doing that well. But, the way you're currently expressing this is confusing to me because this doesn't mesh with my (and I'm sure others) understanding of the term "meta-schema". That's why I'd prefer to address this by introducing new terms that are more specific. Earlier I mention that I use the terms "dialect meta-schema" and "vocabulary meta-schema", but since the term "meta-schema" appears to be inconsistently understood, perhaps "dialect schema" and "vocabulary schema" is better.

I'm not questioning the current practice. But I would not have made the mistake if we'd had a "vocab metaschema" schema that clearly defined what was allowed in a vocab metaschema. It would have been apparent that we were talking about a very similar, but different entity. And I am tending towards the idea of using the terms

What would be the difference between these two meta-schemas? Everything that's allowed in a vocab schema is allowed in a dialect schema. Everything that's allowed in a dialect schema is allowed in a vocab schema. The only difference I can see would be their identifier.

I agree with @jdesrosiers on this: syntactically they're just schemas. The only difference is intent. A meta-schema is intended to describe/validate a schema. It can use vocab schemas (by reference) to accomplish that task. A vocab schema on its own does not describe a schema; it's merely a building block in a meta-schema. This is the distinction I'm making. (One could create a single-vocab meta-schema.) I don't think having a separate meta-schema for vocab schemas is needed since anything you do in a general schema should technically be allowed in a vocab schema. I'm not sure how useful all of those features would be, but I don't see a reason to forbid any features. |
||
provided, any meta-schema which includes the vocabulary should also contain a | ||
reference (via `$ref`) to the vocabulary schema's `$id` value. | ||
|
||
```jsonc | ||
{ | ||
"$schema": "https://example.org/draft/next/schema", | ||
"$id": "https://example.org/schema", | ||
"$vocabulary": { | ||
"https://example.org/vocab/vocab1": true, | ||
"https://example.org/vocab/vocab2": true, | ||
"https://example.org/vocab/vocab3": false | ||
}, | ||
"allOf": [ | ||
{"$ref": "meta/vocab1"}, // https://example.org/meta/vocab1 | ||
{"$ref": "meta/vocab2"}, // https://example.org/meta/vocab2 | ||
{"$ref": "meta/vocab3"} // https://example.org/meta/vocab3 | ||
], | ||
// ... | ||
} | ||
``` | ||
|
||
Finally, the keywords in both the Core and Validation specifications will be | ||
divided into multiple vocabularies. The keyword definitions will be removed from | ||
the meta-schema and added to vocabulary schemas to which the meta-schema will | ||
contain references. In this way, the meta-schema's functionality remains the same. | ||
|
||
```json | ||
{ | ||
"$schema": "https://json-schema.org/draft/next/schema", | ||
"$id": "https://json-schema.org/draft/next/schema", | ||
"$vocabulary": { | ||
"https://json-schema.org/draft/next/vocab/core": true, | ||
"https://json-schema.org/draft/next/vocab/applicator": true, | ||
"https://json-schema.org/draft/next/vocab/unevaluated": true, | ||
"https://json-schema.org/draft/next/vocab/validation": true, | ||
"https://json-schema.org/draft/next/vocab/meta-data": true, | ||
"https://json-schema.org/draft/next/vocab/format-annotation": true, | ||
"https://json-schema.org/draft/next/vocab/content": true | ||
}, | ||
"$dynamicAnchor": "meta", | ||
|
||
"title": "Core and Validation specifications meta-schema", | ||
"allOf": [ | ||
{"$ref": "meta/core"}, | ||
{"$ref": "meta/applicator"}, | ||
{"$ref": "meta/unevaluated"}, | ||
{"$ref": "meta/validation"}, | ||
{"$ref": "meta/meta-data"}, | ||
{"$ref": "meta/format-annotation"}, | ||
{"$ref": "meta/content"} | ||
] | ||
} | ||
``` | ||
|
||
The division of keywords among the vocabularies will be in accordance with the | ||
2020-12 specification (for now). | ||
|
||
### Limitations | ||
|
||
<!-- Are there any limitations inherent to the proposal? --> | ||
#### Unknown Keywords and Unsupported Vocabularies | ||
|
||
This proposal, in its current state, seeks to mimic the behavior defined in the | ||
2020-12 specification. However, the current specification's disallowance of | ||
unknown keywords presents a problem for schemas that use keywords from optional | ||
vocabularies. (This is the topic of the discussion at | ||
https://github.com/orgs/json-schema-org/discussions/342.) | ||
|
||
In short, if a schema uses a keyword from an unknown _optional_ vocabulary, the | ||
implementation cannot proceed because unknown keywords are explicitly | ||
disallowed. However, not being able to proceed with evaluation is the behavior | ||
prescribed for _required_ vocabularies. Thus, if the behaviors for required and | ||
optional vocabularies are the same, then the boolean value is moot, which | ||
highlights that the structure of `$vocabulary` needs to be reconsidered. | ||
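The collision described above can be made concrete with a small decision function. This is a sketch of the reasoning only, assuming unknown keywords always halt processing as the current specification prescribes; the function and parameter names are hypothetical.

```python
def can_evaluate(vocab_recognized: bool, required: bool, schema_uses_its_keywords: bool) -> bool:
    """Decide whether evaluation may proceed with respect to a single vocabulary."""
    if vocab_recognized:
        return True
    if required:
        # Unrecognized required vocabulary: refuse to process the schema.
        return False
    # Unrecognized optional vocabulary: its keywords become unknown keywords,
    # and unknown keywords are disallowed, so any use of them also halts.
    return not schema_uses_its_keywords


# For an unrecognized vocabulary whose keywords are used, the flag makes no difference.
required_outcome = can_evaluate(False, True, True)
optional_outcome = can_evaluate(False, False, True)
```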
|
||
|
||
#### Machine Readability | ||
|
||
The vocabulary URI is an opaque value. There is no data that an implementation | ||
can reference to identify the keywords defined by the vocabulary. The vocabulary | ||
schema _implies_ this, but scanning a `properties` keyword isn't very reliable. | ||
Moreover, such a system cannot provide metadata about the keywords. As such, the | ||
user must explicitly ensure that the implementation recognizes and supports the | ||
vocabulary, which isn't much of an improvement over the current state. | ||
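To show why scanning `properties` is only a heuristic, here is a minimal sketch. The vocabulary schema and its keyword names are made up for illustration; nothing here is prescribed by the proposal.

```python
from __future__ import annotations


def guess_keywords(vocab_schema: dict) -> set[str]:
    """Guess a vocabulary's keywords from the top-level `properties` of its schema."""
    return set(vocab_schema.get("properties", {}))


# Hypothetical vocabulary schema: one keyword is described via `properties`,
# while a family of keywords is described only via `patternProperties`.
vocab_schema = {
    "$id": "https://example.org/meta/vocab1",
    "properties": {"minDate": {"type": "string"}},
    "patternProperties": {"^x-": {}},
}
guessed = guess_keywords(vocab_schema)
```

The guess finds `minDate` but cannot see the `x-` family, and it says nothing about what either keyword does.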
|
||
|
||
Having some sort of "vocabulary definition" file could alleviate this. | ||
|
||
One reason for _not_ having such a file is that, at least for functional | ||
keywords, the user generally needs to provide custom code to the implementation | ||
to process the keywords, thus performing that same explicit configuration | ||
anyway. (Such information cannot be gleaned from a vocabulary specification. For | ||
example, an implementation can't know what to do with a hypothetical `minDate` | ||
keyword.) | ||
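The explicit configuration mentioned above might look like the sketch below, where the user registers a handler for the hypothetical `minDate` keyword. The handler registry and the chosen semantics (an ISO 8601 date string on or after the keyword's value) are assumptions for illustration, not part of any specification or library.

```python
from __future__ import annotations

from datetime import date
from typing import Any, Callable

# Hypothetical registry: keyword name -> function(keyword_value, instance) -> bool.
KeywordHandler = Callable[[Any, Any], bool]
handlers: dict[str, KeywordHandler] = {}


def min_date(keyword_value: str, instance: Any) -> bool:
    """Assumed semantics: a string instance must be a date on or after `keyword_value`."""
    if not isinstance(instance, str):
        return True  # assumption: the keyword only constrains strings
    # date.fromisoformat raises ValueError for text that is not a valid date.
    return date.fromisoformat(instance) >= date.fromisoformat(keyword_value)


# This registration is the explicit, per-implementation step the text refers to.
handlers["minDate"] = min_date
```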
|
||
#### Implicit Inclusion of Core Vocabulary | ||
|
||
Because the Core keywords (the ones that start with `$`) instruct an | ||
implementation on how a schema should be processed, inclusion of the Core | ||
Vocabulary is mandatory and assumed. As such, omitting the Core Vocabulary from | ||
the `$vocabulary` keyword has no effect; nevertheless, including it explicitly | ||
is generally advised as common practice. | ||
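One way an implementation could model this implicit inclusion is sketched below. The Core Vocabulary URI is taken from the meta-schema example earlier in this document; the function name is hypothetical.

```python
from __future__ import annotations

CORE_VOCAB = "https://json-schema.org/draft/next/vocab/core"


def effective_vocabularies(vocabulary: dict[str, bool]) -> dict[str, bool]:
    """Return the declared vocabularies with the Core Vocabulary always required."""
    # Placing CORE_VOCAB last means it wins even if the meta-schema omitted it
    # or listed it as `false`.
    return {**vocabulary, CORE_VOCAB: True}


declared = {"https://json-schema.org/draft/next/vocab/validation": True}
effective = effective_vocabularies(declared)
```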
|
||
This can be confusing and difficult to use/implement, and we probably need | ||
something better here. | ||
|
||
## Change Details | ||
|
||
|
@@ -91,12 +254,14 @@ For example | |
``` | ||
--> | ||
|
||
_**NOTE** Since the design of vocabularies will be changing anyway, it's not worth the time and effort to fill in this section just yet. As such, please read the above sections for loose requirements. For tighter requirements, please assume conformance with the 2020-12 Core and Validation specifications._ | ||
|
||
## [Appendix] Change Log | ||
|
||
* [MMMM YYYY] Created | ||
* 2024-06-10 - Created | ||
|
||
## [Appendix] Champions | ||
|
||
| Champion | Company | Email | URI | | ||
|----------------------------|---------|-------------------------|----------------------------------| | ||
| Your Name | | | < GitHub profile page > | | ||
| Greg Dennis | | gregsdennis@yahoo.com | https://github.com/gregsdennis | |