Description
What happened?
While debugging json-schema-to-gbnf grammars, I noticed a few bugs / quirks and wanted to write them down somewhere.
additionalProperties seems to default to false (not matching spec).
By default, additional properties should be permitted. However, take a schema like:
{
"type": "object",
"properties": {
"number": { "type": "number" },
"street_name": { "type": "string" },
"street_type": { "enum": ["Street", "Avenue", "Boulevard"] }
}
}
The generated grammar correctly accepts these strings:
{ "number": 1600, "street_name": "Pennsylvania", "street_type":"Avenue"}
{ "street_name": "Pennsylvania" }
{ "number": 1600, "street_name": "Pennsylvania" }
{}
But then it improperly fails on the string:
{ "number": 1600, "street_name": "Pennsylvania", "street_type":"Avenue", "direction":"NW"}
This is clearly given in the json-schema docs as an example of a string that should match this schema, so we're doing something wrong.
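To make the expected behavior concrete, here is a minimal sketch of a validator for just the schema subset used above (object / number / string / enum). It is hand-rolled for illustration and is not the converter or any real JSON Schema library; the names `validate`, `SCHEMA`, and `SAMPLES` are made up for this example. The only point is that an absent additionalProperties is treated the same as true.

```python
import json

SCHEMA = {
    "type": "object",
    "properties": {
        "number": {"type": "number"},
        "street_name": {"type": "string"},
        "street_type": {"enum": ["Street", "Avenue", "Boulevard"]},
    },
}

def validate(instance, schema):
    """Return True if instance satisfies the tiny schema subset handled here."""
    if "enum" in schema:
        return instance in schema["enum"]
    kind = schema.get("type")
    if kind == "number":
        # bool is a subclass of int in Python but is not a JSON number.
        return isinstance(instance, (int, float)) and not isinstance(instance, bool)
    if kind == "string":
        return isinstance(instance, str)
    if kind == "object":
        if not isinstance(instance, dict):
            return False
        props = schema.get("properties", {})
        # Spec default: a missing additionalProperties behaves like true.
        additional = schema.get("additionalProperties", True)
        for key, value in instance.items():
            if key in props:
                if not validate(value, props[key]):
                    return False
            elif additional is False:
                return False
            elif isinstance(additional, dict) and not validate(value, additional):
                return False
        return True
    return True  # keywords this sketch does not understand are ignored

SAMPLES = [
    '{ "number": 1600, "street_name": "Pennsylvania", "street_type":"Avenue"}',
    '{ "street_name": "Pennsylvania" }',
    '{ "number": 1600, "street_name": "Pennsylvania" }',
    '{}',
    '{ "number": 1600, "street_name": "Pennsylvania", "street_type":"Avenue", "direction":"NW"}',
]

# Every sample, including the one with the extra "direction" key, should validate.
for text in SAMPLES:
    print(validate(json.loads(text), SCHEMA), text)
```

Under this reading, the last sample is only rejected if the schema explicitly sets additionalProperties to false, which is the behavior the generated grammar currently exhibits by default.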
Explicit "additionalProperties": true behavior is even worse.
If we change the above schema to:
{
"type": "object",
"properties": {
"number": { "type": "number" },
"street_name": { "type": "string" },
"street_type": { "enum": ["Street", "Avenue", "Boulevard"] }
},
"additionalProperties": true
}
Then things really start to go awry. These strings should all pass (indeed, they passed before, when we didn't explicitly set anything for additionalProperties), but instead they now fail:
{"number":1600,"street_name":"Pennsylvania","street_type":"Avenue"}
{ "street_name": "Pennsylvania" }
{ "number": 1600, "street_name": "Pennsylvania" }
And our sample with an additional property still doesn't match:
{ "number": 1600, "street_name": "Pennsylvania", "street_type":"Avenue", "direction":"NW"}
The only string that matches out of the original is the empty object ({}).
Looking at the generated GBNF, there is some weird stuff going on. Here is the GBNF with additionalProperties left unset (the implicit default):
char ::= [^"\\] | "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
decimal-part ::= [0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9])?)?)?)?)?)?)?)?)?)?)?)?)?)?)?
integral-part ::= [0-9] | [1-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9])?)?)?)?)?)?)?)?)?)?)?)?)?)?)?
number ::= ("-"? integral-part) ("." decimal-part)? ([eE] [-+]? integral-part)? space
number-kv ::= "\"number\"" space ":" space number
number-rest ::= ( "," space street-name-kv )? street-name-rest
root ::= "{" space (number-kv number-rest | street-name-kv street-name-rest | street-type-kv )? "}" space
space ::= " "?
street-name-kv ::= "\"street_name\"" space ":" space string
street-name-rest ::= ( "," space street-type-kv )?
street-type ::= "\"Street\"" | "\"Avenue\"" | "\"Boulevard\""
street-type-kv ::= "\"street_type\"" space ":" space street-type
string ::= "\"" char* "\"" space
And here is the GBNF with additionalProperties set explicitly to true:
additional-kv ::= string ":" space additional-value
additional-kvs ::= additional-kv ( "," space additional-kv )*
additional-value ::= object
array ::= "[" space ( value ("," space value)* )? "]" space
boolean ::= ("true" | "false") space
char ::= [^"\\] | "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
decimal-part ::= [0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9])?)?)?)?)?)?)?)?)?)?)?)?)?)?)?
integral-part ::= [0-9] | [1-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9] ([0-9])?)?)?)?)?)?)?)?)?)?)?)?)?)?)?
null ::= "null" space
number ::= ("-"? integral-part) ("." decimal-part)? ([eE] [-+]? integral-part)? space
number-kv ::= "\"number\"" space ":" space number
number-rest ::= ( "," space street-name-kv )? street-name-rest
object ::= "{" space ( string ":" space value ("," space string ":" space value)* )? "}" space
root ::= "{" space (number-kv number-rest | street-name-kv street-name-rest | street-type-kv street-type-rest | additional-kvs )? "}" space
space ::= " "?
street-name-kv ::= "\"street_name\"" space ":" space string
street-name-rest ::= ( "," space street-type-kv )? street-type-rest
street-type ::= "\"Street\"" | "\"Avenue\"" | "\"Boulevard\""
street-type-kv ::= "\"street_type\"" space ":" space street-type
street-type-rest ::= additional-kvs
string ::= "\"" char* "\"" space
value ::= object | array | string | number | boolean | null
The key differences to note here are that street-type-rest is now defined (even though it was never defined in the original), and that additional-kvs seems to be getting appended after the known properties without a comma in between (and without being marked optional).
I haven't yet wrapped my brain around what all is going on with that, but I wanted to lay out how far I'd gotten on my own.
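As a rough way to poke at this, the explicit-true grammar can be transcribed rule by rule into Python regexes. This is only a sketch under simplifying assumptions: number is reduced to plain integers, string to strings without escapes, and object (used by additional-value) to the empty object, so it accepts a subset of what the real GBNF accepts. The constant names are made up for this example.

```python
import re

# Simplified stand-ins for the GBNF terminals (assumptions, see above).
STRING = r'"[^"\\]*" ?'      # string ::= "\"" char* "\"" space   (no escapes)
NUMBER = r'-?[0-9]+ ?'       # number ::= ... space               (integers only)
OBJECT = r'\{ ?\} ?'         # object ::= "{" space (...)? "}" space (empty only)
STREET_TYPE = r'(?:"Street"|"Avenue"|"Boulevard")'

number_kv = r'"number" ?: ?' + NUMBER
street_name_kv = r'"street_name" ?: ?' + STRING
street_type_kv = r'"street_type" ?: ?' + STREET_TYPE

# additional-kv ::= string ":" space additional-value ; additional-value ::= object
additional_kv = STRING + r': ?' + OBJECT
additional_kvs = additional_kv + r'(?:, ?' + additional_kv + r')*'

# The *-rest rules, copied as generated: additional-kvs is mandatory and
# follows the last known property with no "," in front of it.
street_type_rest = additional_kvs
street_name_rest = r'(?:, ?' + street_type_kv + r')?' + street_type_rest
number_rest = r'(?:, ?' + street_name_kv + r')?' + street_name_rest

EXPLICIT_ROOT = re.compile(
    r'\{ ?(?:'
    + number_kv + number_rest + '|'
    + street_name_kv + street_name_rest + '|'
    + street_type_kv + street_type_rest + '|'
    + additional_kvs
    + r')?\} ?'
)

def accepts(text):
    """True if the transcribed grammar matches the whole string."""
    return EXPLICIT_ROOT.fullmatch(text) is not None

for sample in [
    '{}',
    '{ "street_name": "Pennsylvania" }',
    '{"number":1600,"street_name":"Pennsylvania","street_type":"Avenue"}',
    '{"street_name":"Pennsylvania""direction":{}}',  # not valid JSON
]:
    print(accepts(sample), sample)
```

In this transcription the only non-empty inputs that get through are ones where a known property is immediately followed by an additional key with no comma, and where the additional value is an object, which would be consistent with both the mandatory additional-kvs and the additional-value ::= object rule in the output above.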
Unlike strings, enums don't support spaces between properties and values.
This is definitely in the "quirk" more than "bug" category, but when using a schema like:
{
"type": "object",
"properties": {
"number": { "type": "number" },
"street_name": { "type": "string" },
"street_type": { "enum": ["Street", "Avenue", "Boulevard"] }
}
}
Then validating against it means that:
{ "number": 1600, "street_type":"Avenue"}
is a valid string, but adding spaces around the enum value causes either of the following to fail:
{ "number": 1600, "street_type": "Avenue"}
{ "number": 1600, "street_type":"Avenue" }
Interestingly, adding spaces around a string value works fine, and this matches the generated grammar:
{ "number": 1600, "street_name": "Pennsylvania" }
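One visible difference in the generated GBNF is that string ends with a trailing space while the street-type enum rule does not. The sketch below transcribes the implicit-default grammar into Python regexes to illustrate the trailing-space case only. It makes simplifying assumptions (integers for number, no escapes in string), and the names are made up for this example.

```python
import re

STRING = r'"[^"\\]*" ?'   # string ::= "\"" char* "\"" space  (trailing space allowed)
NUMBER = r'-?[0-9]+ ?'    # number ::= ... space              (integers only here)
# street-type ::= "\"Street\"" | "\"Avenue\"" | "\"Boulevard\""   (no trailing space)
STREET_TYPE = r'(?:"Street"|"Avenue"|"Boulevard")'

number_kv = r'"number" ?: ?' + NUMBER
street_name_kv = r'"street_name" ?: ?' + STRING
street_type_kv = r'"street_type" ?: ?' + STREET_TYPE

street_name_rest = r'(?:, ?' + street_type_kv + r')?'
number_rest = r'(?:, ?' + street_name_kv + r')?' + street_name_rest

IMPLICIT_ROOT = re.compile(
    r'\{ ?(?:'
    + number_kv + number_rest + '|'
    + street_name_kv + street_name_rest + '|'
    + street_type_kv
    + r')?\} ?'
)

def accepts(text):
    """True if the transcribed grammar matches the whole string."""
    return IMPLICIT_ROOT.fullmatch(text) is not None

for sample in [
    '{ "number": 1600, "street_type":"Avenue"}',
    '{ "number": 1600, "street_type":"Avenue" }',         # space after the enum value
    '{ "number": 1600, "street_name": "Pennsylvania" }',  # space after the string value
]:
    print(accepts(sample), sample)
```

If this transcription is faithful, giving enum alternatives the same trailing space that string gets would be one possible way to make the two behave alike.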
Unsupported Attributes
We should probably build a list of unsupported attributes and note them in the documentation -- some that I've noticed thus far:
exclusiveMinimum (probably can't be handled for anything except the special case of 0, requiring either the presence of a "-" or not)
uniqueItems -- not sure how we could support this without a regex engine that supports capture groups and lookbehinds and whatnot.
Name and Version
version: 3093 (7672ade)
built with Apple clang version 15.0.0 (clang-1500.3.9.4) for arm64-apple-darwin23.4.0
What operating system are you seeing the problem on?
Mac
Relevant log output
No response