Description
This question is separate from but affects #9.
Currently the parsing description for `u-` properties is as follows:
- if a.u-x[href] or area.u-x[href], then get the href attribute
- else if img.u-x[src] or audio.u-x[src] or video.u-x[src] or source.u-x[src], then get the src attribute
- else if video.u-x[poster], then get the poster attribute
- else if object.u-x[data], then get the data attribute
- if there is a gotten value, return the normalized absolute URL of it, following the containing document's language's rules for resolving relative URLs (e.g. in HTML, use the current URL context as determined by the page, and first <base> element, if any).
- else parse the element for the value-class-pattern. If a value is found, return it.
- else if abbr.u-x[title], then return the title attribute
- else if data.u-x[value] or input.u-x[value], then return the value attribute
- else return the textContent of the element after removing all leading/trailing whitespace and nested <script> & <style> elements.
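The order above can be sketched roughly in Python. This is only an illustration, not any parser's actual code: the `Element` class and `parse_u_current` function are hypothetical names, value-class-pattern handling is left out, `text` stands in for the already-cleaned textContent, and `urllib.parse.urljoin` stands in for the document's URL resolution rules.

```python
from dataclasses import dataclass, field
from urllib.parse import urljoin


@dataclass
class Element:
    """Hypothetical stand-in for a parsed HTML element."""
    tag: str
    attrs: dict = field(default_factory=dict)
    text: str = ""  # assumed to be textContent with <script>/<style> removed


# (tags, attribute) pairs whose values ARE normalised, in list order.
URL_ATTRS = [
    ({"a", "area"}, "href"),
    ({"img", "audio", "video", "source"}, "src"),
    ({"video"}, "poster"),
    ({"object"}, "data"),
]


def parse_u_current(el: Element, base_url: str) -> str:
    # Steps 1-5: URL-bearing attributes are resolved against the base URL.
    for tags, attr in URL_ATTRS:
        if el.tag in tags and attr in el.attrs:
            return urljoin(base_url, el.attrs[attr])
    # Step 6 (value-class-pattern) is omitted in this sketch.
    # Steps 7-8: returned as-is, without normalisation.
    if el.tag == "abbr" and "title" in el.attrs:
        return el.attrs["title"]
    if el.tag in ("data", "input") and "value" in el.attrs:
        return el.attrs["value"]
    # Step 9: fall back to the trimmed text content.
    return el.text.strip()
```

With a base URL of `https://example.com/`, an `a` element with `href="#partial-feed"` comes back as an absolute URL, while a `data` element with `value="#partial-feed"` comes back unchanged.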
Note that URL normalisation is applied at the fifth point. Values obtained from VCP, `abbr`, `data`, or `input` are never normalised. Is this really correct?
I ran into an issue here when implementing a partial feed. In this case I did not want the feed title to link to itself, as that made no sense in relation to the surrounding HTML. Thus I opted for `data` instead of `a`:
<div class="h-feed" id="partial-feed">
<h2 class="p-name"><data class="u-url" value="#partial-feed">Partial Feed</data></h2>
…
</div>
However, because `data[value]` is never normalised, I am forced to write an absolute URL in there. That will hurt portability of the code.
I also think it is bad for `input`-based values. My reasoning here is that a microformats editor should be able to use the same parsing algorithm on the editing form and on the saved output. But if someone writes `#fragment` in an `input` element text field, the algorithm will output `#fragment`, and if this is converted to an `a` element on save, the same algorithm will output `https://example.com/#fragment`.
I propose moving the 5th point (“if there is a gotten value, return the normalized absolute URL […]”) as far down the list as possible. Is there any reason why this should not be done for specific elements? I am not sure about `abbr`, but I can’t come up with any `abbr.u-x` use cases either.
If people can come up with good reasons why outputs for `u-` properties should not always be normalised for VCP and `abbr`, I still propose moving the `data`/`input` case above the normalisation step.
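As a rough illustration of the proposed order, here is a minimal Python sketch. The function name `parse_u_proposed` and its signature are hypothetical, value-class-pattern handling is omitted, and `urllib.parse.urljoin` stands in for the document's URL resolution rules. It assumes the broader variant where every attribute-derived value (including `abbr[title]`) is normalised and the textContent fallback is left as-is; that choice is my assumption, not something settled above.

```python
from urllib.parse import urljoin

# (tags, attribute) pairs checked in order. In this sketch the abbr and
# data/input cases sit above the single normalisation step.
SOURCES = [
    (("a", "area"), "href"),
    (("img", "audio", "video", "source"), "src"),
    (("video",), "poster"),
    (("object",), "data"),
    (("abbr",), "title"),
    (("data", "input"), "value"),
]


def parse_u_proposed(tag: str, attrs: dict, text: str, base_url: str) -> str:
    for tags, attr in SOURCES:
        if tag in tags and attr in attrs:
            # One normalisation step for every attribute-derived value.
            return urljoin(base_url, attrs[attr])
    # Fallback: trimmed text content, not normalised in this sketch.
    return text.strip()
```

Under these assumptions, `<data class="u-url" value="#partial-feed">` and `<a class="u-url" href="#partial-feed">` both resolve to the same absolute URL, so an editor using `input` fields and the saved `a` output would agree.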