Add tests, algorithm, grammar, and description for @vocab: @base. #603

Merged
merged 8 commits into from
Mar 13, 2018

Conversation

gkellogg
Member

@gkellogg gkellogg commented Mar 2, 2018

Fixes #488.

cc/ @pchampin @kidehen

@gkellogg gkellogg added this to the JSON-LD 1.1 milestone Mar 2, 2018
@davidlehn
Member

Oops, merged other branch that also had compact 0090 test.

Contributor

@iherman iherman left a comment

I was looking into the JSON-LD 1.1 syntax document, and I did not find any statement that makes it absolutely clear that the default value of @base is the base URI. The text only says:

In section 1.7:

@base: Used to set the base IRI against which to resolve those relative IRIs interpreted relative to the document. This keyword is described in section 4.2 Base IRI.

In section 4.2:

JSON-LD allows IRIs to be specified in a relative form which is resolved against the document base according section 5.1 Establishing a Base URI of [RFC3986]. The base IRI may be explicitly set with a context using the @base keyword.

I also looked at the API document, and I did not find any reference to some sort of a default @base value.

This means that the

"@vocab" : "@base"

is not clearly defined in the absence of an explicit @base setting. And, actually, it is exactly that situation that is needed (if my understanding is correct) to cover Ted's use case.

Contributor

@azaroth42 azaroth42 left a comment

👍

@azaroth42
Contributor

I disagree with Ivan's review. 4.2 says:

This document uses an empty @id, which resolves to the document base.

So while it's non-normative, much of the specification is non-normative and would require a massive rewrite to change.

@iherman
Contributor

iherman commented Mar 2, 2018

@azaroth42, you are right that that section is informative. However, can someone show me how I can deduce, normatively, that if I have

"@vocab" : "@base"

and @base is not defined in the document, then the value of @vocab should be the value of the baseURI? I looked at the normative section 4.5 of the syntax document as well as the normative section 4.1.2 of the API document, and I did not find this.

Actually... I believe that what has to be done is to expand on 3.6.3 of the algorithm in section 4.1.2 of the API document. That sub-sub-section should say that if @base is not defined elsewhere, the value of @vocab must be the baseURI.

As it stands, unless I am missing a statement elsewhere, I maintain that the spec is underspecified...

(As an aside, the reference to @base in 4.5.11. of the syntax document where @vocab is detailed, should also be marked as a "changed" class)


(That being said: the algorithm is already very complicated. I am not 100% sure any more that we have to go down that route... but that is another discussion.)

@azaroth42
Contributor

Granted, though I'm pretty sure we would run into many similar situations throughout the specification.

This could be raised as a separate issue?

@iherman
Contributor

iherman commented Mar 2, 2018 via email

@gkellogg
Member Author

gkellogg commented Mar 2, 2018

Most of the Syntax document is non-normative, by design; the normative parts are in the appendix describing the grammar. The informative text on the document base is in section 4.2 Base IRI. The grammar makes it clear that @base is an appropriate value for @vocab.

Behavior is described in the API document where base IRI is defined. Note that, if we simply referenced it from the syntax document, it would appear there as well. We could potentially improve this definition as follows:

The base IRI is an absolute IRI established in the context using a @base definition, or is based on the JSON-LD document location. The base IRI is used to turn relative IRIs into absolute IRIs.

Normatively, the WebIDL interface describes how the initial value of the base IRI is set. In the Context Processing Algorithm, the use of @base to update the base IRI is discussed.
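To make the proposed definition concrete, here is a minimal Python sketch of how a base IRI could be established. It is only an illustration under stated assumptions, not the spec's algorithm: `effective_base_iri` is a hypothetical helper, the URLs are made up, and a relative `@base` is assumed to resolve against the document location per RFC 3986.

```python
from typing import Optional
from urllib.parse import urljoin


def effective_base_iri(document_url: str, context_base: Optional[str] = None) -> str:
    """Return the base IRI used to resolve relative IRIs.

    Assumption: without an explicit @base, the document location is the base IRI;
    an explicit @base (possibly relative) is resolved against the document location.
    """
    if context_base is None:
        return document_url
    return urljoin(document_url, context_base)


# Hypothetical document location, for illustration only.
doc_url = "http://example.org/data/doc.jsonld"

no_base = effective_base_iri(doc_url)
explicit_base = effective_base_iri(doc_url, "http://example.com/base/")

# A relative IRI is then turned into an absolute IRI against the base IRI.
resolved = urljoin(no_base, "item/1")
print(no_base, explicit_base, resolved)
```

The point of the sketch is that "base IRI" always has a value once a document location is known, whether or not the context contains `@base`.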

@iherman
Contributor

iherman commented Mar 2, 2018

@gkellogg,

Most of the Syntax document is non-normative, by design; the normative parts are in the appendix describing the grammar. The informative text on the document base is in section 4.2 Base IRI. The grammar makes it clear that @base is an appropriate value for @vocab.

I am repeating myself: that section does not say anything about what happens if @base is not set.

(The grammar is now in section 6, not in the Appendix...) I did not question whether @base is an appropriate value for @vocab. The issue is elsewhere.

Behavior is described in the API document where base IRI is defined. Note that, if we simply referenced it from the syntax document, it would appear there as well. We could potentially improve this definition as follows:

Indeed, it says in the IDL:

base
The base IRI to use when expanding or compacting the document. If set, this overrides the input document's IRI

But this has several problems:

  • If this is where we start, then the formal specification should say something like: if @vocab is set to @base, then the value is set to the base IRI or the document's IRI.
  • However, the previous step would not work, because we would lose the ability to use undefined terms in JSON-LD that would then be ignored. That is an important feature.
  • Besides... the WebIDL section is not normative! I.e., whatever we say here does not really count in the formal sense:-(

So, I am sorry but I still maintain that the spec is underspecified for the case @base is not used.

I do not want to make life difficult. We can decide not to go into all these details for the CG draft, but I believe that this must be taken care of in the final spec. Is there already a test case for this corner case?


B.t.w., whatever is done, I think that the corner case of @pchampin and @kidehen should be documented explicitly. One of the very important aspects of JSON-LD is that one can use terms in the file that, when undefined in @context, are simply ignored by JSON-LD processors. I have seen examples that used this feature. Adding "@vocab" : "@base" will change this behaviour fundamentally because, suddenly, all terms will be interpreted. I actually wonder whether it is a good idea; it may be very confusing for the user...
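The behaviour change can be illustrated with a small Python sketch. It is a deliberate simplification, not a JSON-LD processor: `expand_keys` is a hypothetical helper, the term map and IRIs are invented, and vocabulary expansion is assumed to be plain string concatenation.

```python
from typing import Any, Dict, Optional


def expand_keys(node: Dict[str, Any],
                terms: Dict[str, str],
                vocab: Optional[str] = None) -> Dict[str, Any]:
    """Expand the keys of a node object (simplified).

    Keys with a term definition are mapped to their IRI. Other keys are
    dropped unless a vocabulary mapping exists, in which case they are
    expanded by concatenating the vocabulary mapping and the key.
    """
    expanded: Dict[str, Any] = {}
    for key, value in node.items():
        if key.startswith("@"):
            expanded[key] = value          # keywords pass through unchanged
        elif key in terms:
            expanded[terms[key]] = value   # explicitly defined term
        elif vocab is not None:
            expanded[vocab + key] = value  # vocabulary-relative expansion
        # otherwise: undefined term, silently ignored
    return expanded


node = {"name": "Brew Eats", "internalNote": "not meant as data"}
terms = {"name": "http://schema.org/name"}

without_vocab = expand_keys(node, terms)
with_vocab = expand_keys(node, terms, vocab="http://example.org/doc.jsonld#")
print(without_vocab)
print(with_vocab)
```

Without a vocabulary mapping, `internalNote` disappears from the result; with one, it becomes a property in the output.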

@gkellogg
Member Author

gkellogg commented Mar 2, 2018

@iherman not trying to be difficult, but I don't quite appreciate the problem. Section 4.3.1 says:

Since json-ld-1.1, the vocabulary mapping in the active context can be set to @base, which causes terms which are expanded relative to the vocabulary, such as the keys of node objects, to use the base IRI to create absolute IRIs.

We could possibly expand on this with:

, instead of a separate absolute IRI as described in the previous section.

In the 4.2 Base IRI, the document base is defined:

JSON-LD allows IRIs to be specified in a relative form which is resolved against the document base according section 5.1 Establishing a Base URI of [RFC3986]. The base IRI may be explicitly set with a context using the @base keyword.

So, when we say @vocab: @base, we define this to mean using base IRI, and we define that base IRI comes from either the document base or @base.

It would be incorrect to say that @base has a default, as the context will only have @base if it's explicitly defined; although implementations may choose to default this from the document base, it's not described as such in the spec.

Indeed, it says in the IDL:

base
The base IRI to use when expanding or compacting the document. If set, this overrides the input document's IRI

Earlier in the WebIDL section it says the following:

If the documentLoader option is specified, it is used to dereference remote documents and contexts. The documentUrl in the returned RemoteDocument is used as base IRI and the contextUrl is used instead of looking at the HTTP Link Header directly. For the sake of simplicity, none of the algorithms in this document mention this directly.

The documents try to consistently use base IRI, rather than @base. It's just that @base is used in slightly different ways: as a key in the context, it is used to change the value of base IRI. As a value of @vocab it is used to say to use base IRI to evaluate vocabulary-relative IRIs. The meanings are close enough, that I didn't think it worth using a completely separate term.

... I think that the corner case of @pchampin and @kidehen should be documented explicitly

The example in 4.3.1 is pretty close to @kidehen's use case, but I'm happy to elaborate more if it is not specific enough, which is why I asked for specific feedback on the PR from them.

@iherman
Contributor

iherman commented Mar 3, 2018

Since json-ld-1.1, the vocabulary mapping in the active context can be set to @base, which causes terms which are expanded relative to the vocabulary, such as the keys of node objects, to use the base IRI to create absolute IRIs.

We could possibly expand on this with:

, instead of a separate absolute IRI as described in the previous section.

I am not sure it is necessary to add this, but it does no harm.

In the 4.2 Base IRI, the document base is defined:

JSON-LD allows IRIs to be specified in a relative form which is resolved against the document base according section 5.1 Establishing a Base URI of [RFC3986]. The base IRI may be explicitly set with a context using the @base keyword.

So, when we say @vocab: @base, we define this to mean using base IRI, and we define that base IRI comes from either the document base or @base.

To be very precise, this statement does not follow from this quote. I guess what counts is the Terminology section, which says:

base IRI
The base IRI is an absolute IRI established in the context, or is based on the JSON-LD document location. The base IRI is used to turn relative IRIs into absolute IRIs.

It would be incorrect to say that @base has a default, as the context will only have @base if it's explicitly defined; although implementations may choose to default this from the document base, it's not described as such in the spec.

O.k., I am fine with this argument.

Indeed, it says in the IDL:

base
The base IRI to use when expanding or compacting the document. If set, this overrides the input document's IRI

Earlier in the WebIDL section it says the following:

If the documentLoader option is specified, it is used to dereference remote documents and contexts. The documentUrl in the returned RemoteDocument is used as base IRI and the contextUrl is used instead of looking at the HTTP Link Header directly. For the sake of simplicity, none of the algorithms in this document mention this directly.

The documents try to consistently use base IRI, rather than @base. It's just that @base is used in slightly different ways: as a key in the context, it is used to change the value of base IRI. As a value of @vocab it is used to say to use base IRI to evaluate vocabulary-relative IRIs. The meanings are close enough, that I didn't think it worth using a completely separate term.

O.k. I indeed missed the section on the document loader, which sets a default value to base URI. So the full line of argument is indeed correct, using the extra reference to the terminology. However, there is a major issue with this, nevertheless: except for the Terminology section none of the sections you refer to are normative! From a specification point of view this means that this line of argument is not really decisive:-(

I do understand the intention of separating the user facing part of the document from the normative spec. Knowing the unreadability of many other specs these days, I am all in favor, in fact. But, nevertheless, we should be able to trace back the answer to the original question "what happens if there is no @base in my document?" using exclusively the normative sections, probably in section 4.1 of the API document.

Having looked at that part, but not really familiar with all the details, I believe that point 3.6.3. of section 4.1.3 has to be expanded. Actually, it would make it more readable by separating that into a separate 3.6.4, something like:

  • If the value is @base the value of baseIRI must be used

although the normative sections do not say that that baseIRI is set, by default, to the document URI (the reference you gave is not normative). That should be clear in this section (or, alternatively, the WebIDL sections should be normative, which may actually be the best way forward)

(I will write a separate comment on the narrative aspect, just to separate the comments...)

@iherman
Contributor

iherman commented Mar 3, 2018

@gkellogg, it took us several comments to track down the right answer to the question "what happens if there is no @base in my document?"; I hope you agree that it should be made easier...

I tried to see what could be added to section 4.3.1., namely a new example should be added after example 22. Something like:

{
  "@context": {
    "@vocab": "@base"
  },
  "@id": "http://example.org/places#BrewEats",
  "@type": "#Restaurant",
  "#name": "Brew Eats"
  ...
}

resulting in

[{
  "@id": "http://example.org/places#BrewEats",
  "@type": ["http://jsonld-example/ex.json#Restaurant"],
  "http://jsonld-example/ex.json#name": [{"@value": "Brew Eats"}]
}]

where http://jsonld-example/ex.json is the URI of the JSON-LD document. (I believe this is the real use case of @kidehen.)
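The expansion above can be reproduced with a short Python sketch. It is not a JSON-LD processor: `expand_with_vocab_base` is a hypothetical helper that assumes a single node with plain string values, and it assumes that `"@vocab": "@base"` falls back to the document URL when the context has no `@base`.

```python
from typing import Any, Dict, List


def expand_with_vocab_base(doc: Dict[str, Any], document_url: str) -> List[Dict[str, Any]]:
    """Expand a single flat node whose context sets @vocab to "@base"."""
    context = doc.get("@context", {})
    vocab = context.get("@vocab")
    if vocab == "@base":
        # Assumption: use the explicit @base if present, else the document URL.
        vocab = context.get("@base", document_url)
    if vocab is None:
        raise ValueError("this sketch only handles documents with @vocab set")

    node: Dict[str, Any] = {}
    for key, value in doc.items():
        if key == "@context":
            continue
        if key == "@id":
            node["@id"] = value                       # already absolute here
        elif key == "@type":
            node["@type"] = [vocab + value]           # vocabulary-relative
        else:
            node[vocab + key] = [{"@value": value}]   # vocabulary-relative key
    return [node]


doc = {
    "@context": {"@vocab": "@base"},
    "@id": "http://example.org/places#BrewEats",
    "@type": "#Restaurant",
    "#name": "Brew Eats",
}
expanded = expand_with_vocab_base(doc, "http://jsonld-example/ex.json")
print(expanded)
```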

@iherman
Contributor

iherman commented Mar 12, 2018

Looking at the issue again in general: we should be very careful about introducing new features based on use cases only. JSON-LD is already complex, and the feedback I hear from time to time is that it is already too complex.

My understanding of the use case of @kidehen and @pchampin is that they need a way to set @vocab to the document’s URL. I have not heard any statement whereby they need to set it to @base. Would it be enough to introduce a feature that does "just" the setting of the @vocab to the document's URL and stop there? The pull request shows that the combination with @base is more complex than necessary...

@niklasl
Member

niklasl commented Mar 12, 2018

As I mentioned in #488 (comment):

Mightn't this problem be resolved by simply allowing @vocab to be resolved against @base (instead of requiring it to be a full IRI)? From the examples given, I find that satisfactory, simpler, and most importantly we won't have to tamper with the value space of keys in surprising ways...

That is: using "@vocab": "" could be allowed, and the "" (and more usable values such as "#" or "./") would be resolved against the base IRI (i.e. the document or its explicit @base).
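As a rough illustration of this alternative, the following Python sketch resolves a possibly relative `@vocab` value against the base IRI using standard RFC 3986 reference resolution. `resolve_vocab` is a hypothetical helper and the URLs are invented; this is only a sketch of the idea, not what any processor implements.

```python
from urllib.parse import urljoin


def resolve_vocab(vocab: str, base_iri: str) -> str:
    """Resolve an @vocab value against the base IRI (sketch of the proposal)."""
    return urljoin(base_iri, vocab)


# Hypothetical base IRI: the document location, or an explicit @base.
base_iri = "http://example.org/places/doc.jsonld"

empty_vocab = resolve_vocab("", base_iri)                      # the base IRI itself
dir_vocab = resolve_vocab("./", base_iri)                      # the containing "directory"
absolute_vocab = resolve_vocab("http://schema.org/", base_iri)  # absolute IRIs are unaffected
print(empty_vocab, dir_vocab, absolute_vocab)
```

Existing contexts that use an absolute IRI for `@vocab` would keep their meaning under this reading, since resolving an absolute IRI against a base returns it unchanged.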

@gkellogg
Member Author

@niklasl wrote:

That is: using "@vocab": "" could be allowed, and the "" (and more usable values such as "#" or "./") would be resolved against the base IRI (i.e. the document or its explicit @base).

That actually might be a superior implementation that would relieve the cognitive overhead of using @base.

@iherman
Contributor

iherman commented Mar 12, 2018 via email

…t of using the value of `base IRI` at the time of the definition in the context for `@vocab` and reverts to using normal string concatenation semantics.
# Conflicts:
#	test-suite/tests/compact-manifest.jsonld
@gkellogg
Member Author

@iherman I believe the text now satisfies your requirements:

  • We now use the empty string ("") instead of @base.
  • There is a specific processing step in the context processing algorithm to handle this case.
  • The WebIDL section is now normative.

@iherman
Contributor

iherman commented Mar 13, 2018

@gkellogg

  • In section 4.3.1 the two examples should carry the same title
  • In section 4.3.1 it would be worth having a separate example without using the @base at all. The value of @vocab would be set then to the URL of the document itself...
  • In section 6.11, the definition of @vocab in the 5th paragraph: my understanding in the latest round was that the value of @vocab cannot be @base. Did I get it wrong or is the presence of @base there only a leftover?
  • The WebIDL interface in the API: Looking at https://rawgit.com/json-ld/json-ld.org/vocab-base/spec/latest/json-ld-api/index.html#the-application-programming-interface the WebIDL is still not normative. Do I miss something?

@gkellogg
Member Author

In section 4.3.1 the two examples should carry the same title

Part of the automated sanity testing is that no two examples have the same title, but the title of the second should be changed to 'Using "" as the vocabulary mapping (expanded)'

In section 4.3.1 it would be worth having a separate example without using the @base at all. The value of @vocab would be set then to the URL of the document itself...

Kingsley had a question about this; rather than bloat with further examples, perhaps I'll just change this one to eliminate @base and describe the document location in prose.

In section 6.11, the definition of @vocab in the 5th paragraph: my understanding in the latest round was that the value of @vocab cannot be @base. Did I get it wrong or is the presence of @base there only a leftover?

Your understanding is correct, and this is an oversight.

The WebIDL interface in the API: Looking at https://rawgit.com/json-ld/json-ld.org/vocab-base/spec/latest/json-ld-api/index.html#the-application-programming-interface the WebIDL is still not normative. Do I miss something?

No, I missed an extra class="informative" in subsections.

@iherman
Contributor

iherman commented Mar 13, 2018

@gkellogg with the changes in the last comment, I am fine merging!

@TallTed
Contributor

TallTed commented Mar 13, 2018

Reiterating comment from @kidehen on issue #488, as we read it, this PR adequately addresses our concerns.

@gkellogg gkellogg merged commit 6f8feb0 into master Mar 13, 2018
@gkellogg gkellogg deleted the vocab-base branch March 13, 2018 19:22
kazarena added a commit to piprate/json-gold that referenced this pull request Dec 26, 2018
6 participants