u- parsing should always do relative URL resolution

This question is separate from #9 but affects it.

Currently [the parsing description for `u-` properties](http://microformats.org/wiki/microformats2-parsing#parsing_a_u-_property) is as follows:

> * if a.u-x[href] or area.u-x[href], then get the href attribute
> * else if img.u-x[src] or audio.u-x[src] or video.u-x[src] or source.u-x[src], then get the src attribute
> * else if video.u-x[poster], then get the poster attribute
> * else if object.u-x[data], then get the data attribute
> * if there is a gotten value, return the normalized absolute URL of it, following the containing document's language's rules for resolving relative URLs (e.g. in HTML, use the current URL context as determined by the page, and first <base> element, if any).
> * else parse the element for the value-class-pattern. If a value is found, return it.
> * else if abbr.u-x[title], then return the title attribute
> * else if data.u-x[value] or input.u-x[value], then return the value attribute
> * else return the textContent of the element after removing all leading/trailing whitespace and nested <script> & <style> elements.

Note that URL normalisation happens only at the fifth point, so values obtained from the value-class-pattern (VCP), `abbr`, `data`, or `input` are never normalised. Is this really intended?
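To make the current behaviour concrete, here is a minimal Python sketch of the quoted steps. It is not taken from any real parser: the function name `parse_u_current`, the `(tag, attrs, text)` representation of an element, and the base URL are all assumptions for illustration, and the value-class-pattern step is left out.

```python
from urllib.parse import urljoin

# Attributes the quoted text treats as URL sources, keyed by tag name.
URL_ATTRS = {
    "a": ["href"], "area": ["href"],
    "img": ["src"], "audio": ["src"], "source": ["src"],
    "video": ["src", "poster"],
    "object": ["data"],
}

def parse_u_current(tag, attrs, text, base_url):
    """Hypothetical model of the quoted steps (value-class-pattern omitted)."""
    for name in URL_ATTRS.get(tag, []):
        if name in attrs:
            # Only values found here are resolved against the base URL.
            return urljoin(base_url, attrs[name])
    if tag == "abbr" and "title" in attrs:
        return attrs["title"]      # returned as written
    if tag in ("data", "input") and "value" in attrs:
        return attrs["value"]      # returned as written
    return text.strip()

base = "https://example.com/feed"  # assumed document URL
print(parse_u_current("a", {"href": "#partial-feed"}, "Partial Feed", base))
print(parse_u_current("data", {"value": "#partial-feed"}, "Partial Feed", base))
```

Under these assumptions the `a` element yields an absolute URL, while the `data` element yields the bare fragment.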

I ran into an issue here when implementing [a partial feed](https://indieweb.org/h-feed#partial_feeds). In this case I did not want the feed title to link to itself as that made no sense in relation to the surrounding HTML. Thus I opted for `data` instead of `a`:

```html
<div class="h-feed" id="partial-feed">
  <h2 class="p-name"><data class="u-url" value="#partial-feed">Partial Feed</data></h2>
  …
</div>
```

However, because `data[value]` is never normalised, I am forced to write an absolute URL there, which hurts the portability of the markup.

I also think it is bad for `input`-based values. My reasoning is that a microformats editor should be able to use the same parsing algorithm on both the editing view and the saved output. But if someone writes `#fragment` in an `input`-element text field, the algorithm will output `#fragment`, and if this is converted to an `a`-element on save, the same algorithm will output `https://example.com/#fragment`.
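The mismatch can be reproduced with the standard library alone. This is only an illustration: the base URL `https://example.com/` is assumed to be the page's URL, and the two variables stand in for what the parser would return for each element.

```python
from urllib.parse import urljoin

base_url = "https://example.com/"  # assumed page URL
typed = "#fragment"                # what the user entered in the editor

# input.u-x[value]: the value is returned as written under the current rules.
from_input = typed
# a.u-x[href]: the value is resolved against the base URL.
from_anchor = urljoin(base_url, typed)

print(from_input)
print(from_anchor)
```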

I propose moving the 5th point (“if there is a gotten value, return the normalized absolute URL <i>[…]</i>”) as far down the list as possible. Is there any reason why this should not be done for specific elements? I am not sure about `abbr`, but I can’t come up with any `abbr.u-x` use-cases either.

Even if people can come up with good reasons why `u-` values obtained from VCP and `abbr` should not always be normalised, I still propose moving the `data`/`input` case above the normalisation step.
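One possible reading of the proposal, sketched in Python. Everything here is hypothetical: the name `parse_u_proposed`, the `(tag, attrs, text)` element representation, and the choice to leave the `textContent` fallback unresolved are assumptions, and the value-class-pattern step is omitted.

```python
from urllib.parse import urljoin

# Same attribute sources as the current text, plus abbr/data/input,
# so that all of them are collected before the normalisation step.
URL_ATTRS = {
    "a": ["href"], "area": ["href"],
    "img": ["src"], "audio": ["src"], "source": ["src"],
    "video": ["src", "poster"],
    "object": ["data"],
    "abbr": ["title"],
    "data": ["value"], "input": ["value"],
}

def parse_u_proposed(tag, attrs, text, base_url):
    """Hypothetical reordering: every attribute-derived value is resolved."""
    for name in URL_ATTRS.get(tag, []):
        if name in attrs:
            return urljoin(base_url, attrs[name])
    # Assumption: the textContent fallback is left as written in this sketch.
    return text.strip()

base = "https://example.com/feed"  # assumed document URL
print(parse_u_proposed("data", {"value": "#partial-feed"}, "Partial Feed", base))
```

For the fallback variant, where only `data`/`input` move above the normalisation step, the `abbr` entry would be dropped from the table and handled separately without resolution.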

u- parsing should always do relative URL resolution #10
