Description
Here are a few issues with parsing the current spec that have been highlighted by @abhillman's work.
These issues need to be addressed for the updated machine-readable spec to be fully useful.
Current output
These are the generated JSON files in which the current parser sometimes produces incorrect output.
Issues
Outdated workaround
Remove the workaround in parse.py line 91
Update global attributes
The list of global attributes in parse.py line 34 needs updating.
This can be done manually for now, but it would be nice to parse it automatically in future.
Handling "the empty string" as an attribute keyword
When parsing attributes, in keyword lists such as "true"; "false"; the empty string, the text "the empty string" prevents the list of keywords from matching the regular expression. Instead, it should be recognised, and the empty string should be emitted as a value_keywords entry of "".
This leads to suboptimal output, for example in attributes.json line 614:
"hidden":
{
"desc": "Whether the element is relevant",
"elements": ["HTML"],
"value_keywords": [],
"value_type": "\"until-found\"; \"hidden\"; the empty string"
},
It should instead read:
"hidden":
{
"desc": "Whether the element is relevant",
"elements": ["HTML"],
"value_keywords": ["", "until-found", "hidden"],
"value_type": "Keywords"
},
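One way to recognise the phrase is sketched below. This is only an illustration: parse_value_keywords and both constants are hypothetical names, and the real logic in parse.py may be structured differently. Placing "" first in the result simply mirrors the expected output above and is an assumption.

```python
import re

# Hypothetical helper; not taken from parse.py.
QUOTED_KEYWORD = re.compile(r'"([^"]*)"')
EMPTY_STRING_TEXT = "the empty string"


def parse_value_keywords(value_type):
    """Return the keywords in a list such as '"a"; "b"; the empty string'.

    Returns None when the text is not a pure keyword list.
    """
    keywords = []
    for item in value_type.split(";"):
        item = item.strip()
        if item == EMPTY_STRING_TEXT:
            # Assumption: the empty string is listed first, as in the
            # expected output shown in this issue.
            keywords.insert(0, "")
            continue
        match = QUOTED_KEYWORD.fullmatch(item)
        if match is None:
            # Not a keyword list, e.g. "Valid non-negative integer".
            return None
        keywords.append(match.group(1))
    return keywords
```

With this sketch, the value text for hidden would yield the three keywords shown above, and the caller could then set value_type to "Keywords" whenever the result is not None.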
Parentheses in attribute elements parsed incorrectly
In attributes.json, the attribute element list for height requires parsing the HTML text canvas; embed; iframe; img; input; object; source (in picture); video.
Currently it is parsed like so:
"height":
{
"desc": "Vertical dimension",
"elements":
[
"(in",
"canvas",
"embed",
"iframe",
"img",
"input",
"object",
"video"
],
"value_keywords": [],
"value_type": "Valid non-negative integer"
},
The elements array should instead read, with "(in" removed and "source" added:
"elements":
[
"canvas",
"embed",
"iframe",
"img",
"input",
"object",
"source",
"video"
],
value_type should probably also have ".The actual rules are more complicated than indicated" appended.
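A minimal sketch of how the element list could be split while discarding parenthetical qualifiers such as "(in picture)". The function name parse_elements and the regular expression are hypothetical and not taken from parse.py; the sketch assumes the list is always separated by semicolons.

```python
import re

# Hypothetical pattern: a parenthetical qualifier such as "(in picture)",
# together with any whitespace before it.
PARENTHETICAL = re.compile(r"\s*\([^)]*\)")


def parse_elements(text):
    """Split a semicolon-separated element list, dropping parentheticals."""
    elements = []
    for item in text.split(";"):
        name = PARENTHETICAL.sub("", item).strip()
        if name:  # skip empty items
            elements.append(name)
    return elements
```

Applied to the height text quoted above, this keeps "source" and never emits "(in". The qualifier itself is thrown away here; a fuller fix might record it somewhere, which is presumably what the suggested value_type note is for.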
Attribute keyword list chokes on trailing semicolon
attributes.json line 1108 fails to record the correct keywords for the popover attribute due to a trailing semicolon, which should be ignored.
It currently reads:
"popover":
{
"desc": "Makes the element a popover element",
"elements": ["HTML"],
"value_keywords": [],
"value_type": "\"auto\"; \"manual\";"
},
but should read:
"popover":
{
"desc": "Makes the element a popover element",
"elements": ["HTML"],
"value_keywords": ["auto", "manual"],
"value_type": "Keywords"
},
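Ignoring the trailing semicolon could be as simple as discarding empty items after splitting. The sketch below is an assumption about one possible approach, not the existing parse.py code; split_keyword_items and parse_keywords are hypothetical names.

```python
def split_keyword_items(value_type):
    """Split on ';' and drop empty items, such as the one a trailing ';' leaves."""
    return [item.strip() for item in value_type.split(";") if item.strip()]


def parse_keywords(value_type):
    """Return the unquoted keywords, or [] if any item is not a quoted string."""
    items = split_keyword_items(value_type)
    is_keyword_list = bool(items) and all(
        len(item) >= 2 and item.startswith('"') and item.endswith('"')
        for item in items
    )
    if not is_keyword_list:
        return []
    # Strip the surrounding double quotes from each keyword.
    return [item[1:-1] for item in items]
```

For the popover text this returns the two keywords expected above, while plain descriptions such as "Valid non-negative integer" still produce an empty list.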
Intellectual property notice updates
COPYING.txt (which is copied into the JSON) should be updated. In particular, there is a new version of the W3C document license that needs linking to. The same update should be made in COPYING.md.