Use of vocabulary-scoped identifiers in @values #77

Closed
Peeja opened this issue Sep 28, 2021 · 9 comments

@Peeja

Peeja commented Sep 28, 2021

This looks like a bug to me, but I also could easily be misunderstanding the intended semantics. I'm not sure how to characterize it, so by all means, please retitle this issue if you can put it more usefully. 😃


Given the following data:

[
  {
    "@id": "http://www.example.com/foo",
    "aProperty": "existing-value"
  },
  {
    "@id": "http://www.example.com/bar",
    "aProperty": "another-existing-value"
  }
]

I expected the query:

{
  "@construct": {
    "@id": "?id",
    "?property": "?value"
  },
  "@where": {
    "@graph": {
      "@id": "?id",
      "?property": "?value"
    },
    "@values": [
      {
        "?id": "http://www.example.com/foo"
      }
    ]
  }
}

to return:

[
  {
    "@id": "http://www.example.com/foo",
    "aProperty": "existing-value"
  }
]

but instead it returns []. (playground)

A similar @select query also yields []. (playground)

For what it's worth, an empty @values successfully returns all of the data. (playground)


The use case here is that I'm trying to read all of the existing triples for properties that I'm about to write, so that I can @delete the old values when I @insert the new values. I should be able to do the same thing with several separate reads and combine the results, which seems odd, though it may be what I do for now.

@gsvarovsky
Member

The problem is that a plain string like "http://www.example.com/foo" is interpreted to be a literal, which will never match an @id. Instead, you need to use an explicit Reference, like this:

...
    "@values": [{
      "?id": { "@id": "http://www.example.com/foo" }
    }]
...

There are only a few contexts in which a plain string will be interpreted as an IRI:

  • properties
  • @type
  • @id itself

In some cases you can also use a @context to make something an IRI when it's within a Subject. But that never applies to @values, because the variables it binds can appear anywhere in the rest of the @where clause.
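
One way to avoid passing a bare string by accident is to wrap IRIs before they reach @values. The following is a minimal TypeScript sketch; the `Reference` type and the `asReference` and `idValues` helpers are hypothetical names introduced here for illustration and are not part of the m-ld API.

```typescript
// Hypothetical shape of an explicit reference, as used in the fix above.
type Reference = { "@id": string };

// Wrap an IRI so that it is matched as a subject identity, not as a literal.
function asReference(iri: string): Reference {
  return { "@id": iri };
}

// Build the @values array binding ?id to each of the given IRIs.
function idValues(iris: string[]): Array<{ "?id": Reference }> {
  return iris.map((iri) => ({ "?id": asReference(iri) }));
}

// The @where clause from the original query, with the corrected @values.
const where = {
  "@graph": { "@id": "?id", "?property": "?value" },
  "@values": idValues(["http://www.example.com/foo"]),
};
```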

Let me know if this makes sense and fixes the issue for you. In the meantime let's leave the ticket open while we decide the best way to emphasise this in the documentation.

@Peeja
Author

Peeja commented Sep 28, 2021

Ah, of course. That works great.

Okay, round 2: I'll need to also match the property name with @values, and those are also IRIs. But where a string aSubject given as an @id is coerced to http://the.domain.example/aSubject, and therefore can be seen and used in relative form as aSubject, a string value aProperty given as a property (that is, as an object key) is coerced to http://the.domain.example/#aProperty, which means its relative form is #aProperty. That makes it hard to work with, because I can't just put { "@id": theKeyFromTheObject } in the @values: it doesn't have the #. But I also can't say { "@id": `#${theKeyFromTheObject}` }, because that would break if theKeyFromTheObject were already an explicit, absolute IRI.
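
The difference between the two positions can be sketched as follows. This is a simplified illustration, not m-ld's or JSON-LD's actual resolution algorithm: it assumes the base and vocabulary IRIs quoted above, treats any string starting with a scheme as already absolute, and resolves relative names by plain concatenation. All function names are hypothetical.

```typescript
// Assumed defaults, taken from the expansions described above.
const BASE = "http://the.domain.example/";
const VOCAB = "http://the.domain.example/#";

// Rough test for an absolute IRI: a scheme followed by a colon.
function isAbsolute(term: string): boolean {
  return /^[A-Za-z][A-Za-z0-9+.-]*:/.test(term);
}

// Document-scoped position (e.g. the value of @id): relative to the base.
function expandDocumentScoped(term: string): string {
  return isAbsolute(term) ? term : BASE + term;
}

// Vocabulary-scoped position (e.g. a property key): relative to the vocabulary.
function expandVocabScoped(term: string): string {
  return isAbsolute(term) ? term : VOCAB + term;
}

// The workaround of prefixing "#" only lines up for relative keys.
function hashPrefixed(key: string): string {
  return expandDocumentScoped(`#${key}`);
}
```

Here `hashPrefixed("aProperty")` agrees with `expandVocabScoped("aProperty")`, but for a key that is already absolute, such as `http://other.example/p`, the prefixed form resolves to a different IRI, which is the breakage described above.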

Is there a facility for normalizing an object by applying the context that m-ld will be using? That would probably be easier to work with.

@gsvarovsky
Member

gsvarovsky commented Sep 28, 2021

Ah, that's awkward. Leakage of the # into the API is definitely a Bad Thing, especially for users without prior RDF experience. I would even prefer that it's not necessary to know the difference between document- and vocabulary-scoped identifier positions. Ideas:

  1. Facility to normalise (expand) a Subject. This would be a double-spend, as m-ld is already denormalising (compacting) it, so:
  2. Facility to disable context-based compaction for read results. This would help in your scenario but would hurt anyone who was using theKeyFromTheObject to do anything in the application, like adjusting an in-memory representation; especially if they didn't generally care about IRIs.
  3. A syntax for specifying vocabulary expansion in @values and other references, for example using { "@vocab": theKeyFromTheObject }. This would be kinda neat. It violates my second API principle but it's relatively easy to explain: "do this for properties and types". It does leave some other awkward edge cases though, like matching a variable in both a document and a vocabulary position. Edit: Wait, no, that would be fine, I think

Hmm

@Peeja
Author

Peeja commented Sep 29, 2021

Ah, okay. I think what I was missing was the different meanings of @base and @vocab in JSON-LD.

What I'm working on here is a function upsert() which takes Subject[] and returns a StateProc which performs an upsert for properties which should be single-valued, deleting any existing values when the new ones are inserted. To do that, I have to take property names from the property position and use them in the object position in @values. Now that I get how to do that, I see that I can actually get it done in a single write:

await state.write({
  "@insert": subjects,
  "@delete": { "@id": "?id", "?property": "?value" },
  "@where": {
    "@graph": {
      "@id": "?id",
      "?property": "?value",
    },
    "@values": [
      {},
      ...subjects.flatMap((subject) =>
        Object.keys(subject)
          .filter((key) => key !== "@id")
          .map((key) => ({
            "?id": { "@id": subject["@id"] },
            "?property": { "@id": `#${key}` },
          })),
      ),
    ],
  },
});

But I still need to translate from @vocab-relative to @base-relative (which I've temporarily done here by assuming the default context). Ideally, I'd be able to use the same logic m-ld will be using to translate that name. I think ultimately that means (essentially) expanding the subjects, no? I'm not sure how idea 2 would solve that, and I don't quite follow idea 3.

Edit: Actually, it looks like the single write isn't working as well as I thought it would, so I'm back to a read and then a write. But in any case, the relevant point stands.

@gsvarovsky
Member

Option 3 replaces

"?property": { "@id": `#${key}` },

with

"?property": { "@vocab": key },

The @vocab key replaces the @id key of a Reference but tells the processor to use vocabulary resolution for it.
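
Applied to the upsert query earlier in this thread, the bindings could be built like this. This is only a sketch: the type names and the `propertyBindings` helper are hypothetical, and it relies on the @vocab Reference form proposed in this thread.

```typescript
type IdReference = { "@id": string };
type VocabReference = { "@vocab": string };
type SubjectLike = { "@id": string; [key: string]: unknown };
type Binding = { "?id": IdReference; "?property": VocabReference };

// One binding per (subject, property) pair; the @id key itself is skipped.
function propertyBindings(subjects: SubjectLike[]): Binding[] {
  const bindings: Binding[] = [];
  for (const subject of subjects) {
    for (const key of Object.keys(subject)) {
      if (key === "@id") continue;
      bindings.push({
        "?id": { "@id": subject["@id"] },
        // Vocabulary resolution, so no "#" prefix is needed on the key.
        "?property": { "@vocab": key },
      });
    }
  }
  return bindings;
}
```

Because the key is passed through unchanged, the same code should handle both relative property names and keys that are already absolute IRIs.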

This actually offers a fix for the equivalent problem in JSON-LD.

@gsvarovsky
Member

> single write isn't working as well as I thought it would

Somewhere along the way this query has lost its @union, which means the @graph results (everything in the domain!) are being joined with the @values, which will drop the empty binding {}.

For casual observers, see #76 (comment)

gsvarovsky transferred this issue from m-ld/m-ld-js Sep 29, 2021
gsvarovsky changed the title from "Odd behavior from @values" to "Use of vocabulary-scoped identifiers in @values" Sep 29, 2021
gsvarovsky added a commit to m-ld/m-ld-js that referenced this issue Sep 29, 2021
@gsvarovsky
Member

gsvarovsky commented Sep 29, 2021

Suggested option #77 (comment) is now available on the edge:

@Peeja
Author

Peeja commented Sep 29, 2021

> "?property": { "@vocab": key },

Ah! I'm following now. Yep, I like that too!

> Somewhere along the way this query has lost its @union, which means the @graph results (everything in the domain!) are being joined with the @values, which will drop the empty binding {}.

Yes, sorry, I should have been clearer there: I'm specifically trying to avoid this @union, because it grows with the size of the input—then sparqlalgebrajs turns it into a linked list, and then someone (not sure where it is) traverses that recursively, which blows the stack for large inputs (such as my initial data import in my first write()). I switched to a read() followed by a write() to avoid that, and then thought I could still avoid it with the write() above, but I was looking at the wrong tests when I thought it was working.
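
To illustrate the failure mode described here, the sketch below builds a union as a chain of nested binary nodes and counts its leaves two ways. The `Pattern`, `Union` and `Algebra` types are hypothetical stand-ins, not the actual sparqlalgebrajs structures, and where the real recursive traversal happens is not known from this thread.

```typescript
type Pattern = { kind: "pattern"; id: number };
type Union = { kind: "union"; left: Algebra; right: Pattern };
type Algebra = Pattern | Union;

// Fold n patterns into a left-deep chain: depth grows linearly with n.
function chainUnion(n: number): Algebra {
  let node: Algebra = { kind: "pattern", id: 0 };
  for (let i = 1; i < n; i++) {
    node = { kind: "union", left: node, right: { kind: "pattern", id: i } };
  }
  return node;
}

// Recursive traversal: one stack frame per level, so deep chains can overflow.
function countRecursive(node: Algebra): number {
  return node.kind === "pattern" ? 1 : countRecursive(node.left) + 1;
}

// Iterative traversal: constant stack depth regardless of chain length.
function countIterative(node: Algebra): number {
  let count = 1;
  let current: Algebra = node;
  while (current.kind === "union") {
    count++;
    current = current.left;
  }
  return count;
}
```

Both functions return the same count for small inputs; only the iterative one is safe once the chain is as long as a large initial import.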

@gsvarovsky
Member

> ... which blows the stack for large inputs

I created another issue for this. In the meantime, hopefully the need for the @union or additional reads is reduced by the change of behaviour for @delete-with-variables plus @insert-without-variables, #76 (comment)
