Refactor TripleStoreService to be compatible with NextGraph #1263

Open
srosset81 opened this issue May 29, 2024 · 15 comments

Comments

@srosset81
Contributor

srosset81 commented May 29, 2024

The current TripleStoreService interacts directly with the Fuseki API. We will keep this service, but we will create another one that interacts with the NextGraph API and takes its particularities into account (such as session handling and URI transformation).

https://git.nextgraph.org/NextGraph/nextgraph-rs/src/branch/master/ng-sdk-js/app-node/index.js

  • Rename TripleStoreService to FusekiTripleStoreService
    • Remove references to "secure dataset"
    • Remove 'X-SemappsUser' from Fuseki calls
  • Create a NgTripleStoreService (also named "triplestore", and with the same actions)
  • Sessions handling
    • It's better if sessions are not opened/closed on every request, as some information is kept in memory. But it's also better if sessions are not kept open forever, so that this memory can be freed.
    • Every time we access a dataset, get the session ID associated with this dataset (NextGraph will open the session if it is not already open)
    • Keep track in memory of the latestAccessTime of each session
    • Every 24h (for example), run a cron job that looks for sessions which have not been used in the past 24h, and closes them.
  • TripleStoreAdapter
    • The adapter should still work with NextGraph. The dataset will need to be the UUID provided by NextGraph so, for the AuthAccountService, we will need to store this UUID somewhere.
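
As a rough illustration of the session-handling bullets above, here is a minimal sketch of an in-memory registry that tracks latestAccessTime and closes idle sessions. Everything in it is an assumption: `SessionRegistry`, `openSession` and `closeSession` are hypothetical names, not part of the NextGraph SDK, and the real calls would most likely be asynchronous.

```javascript
// Hypothetical sketch: none of these names come from the NextGraph SDK.
class SessionRegistry {
  /**
   * @param {(dataset: string) => string} openSession   opens a session, returns its ID
   * @param {(sessionId: string) => void} closeSession  closes a session
   * @param {() => number} now  clock in milliseconds (injectable for tests)
   */
  constructor(openSession, closeSession, now = Date.now) {
    this.openSession = openSession;
    this.closeSession = closeSession;
    this.now = now;
    this.sessions = new Map(); // dataset -> { sessionId, latestAccessTime }
  }

  // Return the session ID for a dataset, opening a session only if none is tracked.
  getSessionId(dataset) {
    let entry = this.sessions.get(dataset);
    if (!entry) {
      entry = { sessionId: this.openSession(dataset), latestAccessTime: 0 };
      this.sessions.set(dataset, entry);
    }
    entry.latestAccessTime = this.now();
    return entry.sessionId;
  }

  // Close every session that has been idle for longer than maxIdleMs.
  closeIdleSessions(maxIdleMs) {
    const closed = [];
    for (const [dataset, entry] of this.sessions) {
      if (this.now() - entry.latestAccessTime > maxIdleMs) {
        this.closeSession(entry.sessionId);
        this.sessions.delete(dataset);
        closed.push(dataset);
      }
    }
    return closed;
  }
}
```

The periodic cleanup could then be something like `setInterval(() => registry.closeIdleSessions(24 * 60 * 60 * 1000), 24 * 60 * 60 * 1000)`, or a cron job calling the same method.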
@srosset81
Contributor Author

srosset81 commented May 29, 2024

I removed this part:

  • Transformation of did: and https:// URIs
    • In the NextGraph triple stores, all resources will have a did: URI, with their UUID. It will be up to the NgTripleStoreService to transform them to https:// URIs.
    • WebIDs will be the only resources that have a different mapping between HTTPS and DID. We will use the settings dataset to safely find the resource UUID associated with the WebID (we could also do a SPARQL query, but then anyone could create a document that looks like a WebID, and it would be a security breach).
    • The transformation could also be done in the LdpService, or by some NextGraph-specific middleware. This way we would transform the resource rather than the query.
    • A simple way could be to reframe the resource with a JSON-LD context with a @base directive. Then change the @base directive to the new URI we want.

@nikoPLP is going to handle this transformation between DID and HTTPS URIs in the NextGraph API, this way the SPARQL query will always return the HTTPS URIs, and not the DID URIs. This will probably be an option to the SPARQL query API. When we store data, it will detect DIDs in the URIs, and store only the DID part.

For this to work, it will be important to provide the related HTTPS URI when we create a new document. It will be saved in the "metadata" of the NextGraph document, using the ng:e ("e" for external) predicate. This is how NextGraph API will be able to find the HTTPS URI from the DID URI. Niko will show us how to do that with his API.

This ng:e will also be a simple way for SemApps to find the document linked with a LDP resource or container. It will also be possible to use it to find the WebID resource UUID from the WebID URI. But after speaking with Niko today, it turns out there will be a problem with remote WebIDs, because the NextGraph API will not be able to fetch the WebID, and will thus not be able to know the related DID. So remote WebIDs will be stored with an HTTPS URI. We haven't found a solution for WebIDs yet. This is also why we will keep on using UUIDs for containers and collections, even though we would have preferred keeping this as an option.

@Laurin-W
Contributor

I'm not sure if I fully understand yet.

So every URI that looks something like https://mypod.store/laurin/data/did:uid-1 will be mapped to did:uid-1 internally? And did:uid-1 is the named graph / resource which will contain a triple <did:uid-1> <ng:e> <https://mypod.store/laurin/data/did:uid-1> . ?
And when I want to make a reference from named graph / resource https://mypod.store/laurin/data/did:uid-1 to https://mypod.store/laurin/data/did:uid-2, the triple <https://mypod.store/laurin/data/did:uid-1> <ex:reference> <https://mypod.store/laurin/data/did:uid-2> . will be mapped to the triple <did:uid-1> <ex:reference> <did:uid-2> . internally?

This ng:e will also be a simple way for SemApps to find the document linked with a LDP resource or container

So this is in order to restore the prefix https://mypod.store/laurin/data/ when taking the example from above? Or is this supposed to enable support for arbitrary mapping of LDP resources to internal did graph names? (Why) do you want the WebID to consist of human-readable parts (no did) only?

NextGraph API will not be able to fetch the WebID

What does NextGraph need to fetch a remote webId for?

@nikoPLP
Contributor

nikoPLP commented May 30, 2024

So every URI that looks something like https://mypod.store/laurin/data/did:uid-1 will be mapped to did:uid-1 internally? And did:uid-1 is the named graph / resource which will contain a triple <did:uid-1> <ng:e> <https://mypod.store/laurin/data/did:uid-1> . ? And when I want to make a reference from named graph / resource https://mypod.store/laurin/data/did:uid-1 to https://mypod.store/laurin/data/did:uid-2, the triple <https://mypod.store/laurin/data/did:uid-1> <ex:reference> <https://mypod.store/laurin/data/did:uid-2> . will be mapped to the triple <did:uid-1> <ex:reference> <did:uid-2> . internally?

Exactly! And this will be transparent to ActivityPods. With the APIs I give you, you can use the https://... version everywhere, and you will also get everything back with https:// (in SELECT/CONSTRUCT). The mapping and the removing/adding of the prefix will be done in NextGraph.
The purpose of all that mapping is to enable complete interoperability with NextGraph and also to give portability. If the user moves their data from one Pod to another, the URIs will not change internally. Only the ng:e will have to be updated (I will provide an API for that too if needed). Eventually, if the user wants to use the NextGraph apps to access and modify their data, they will be able to do so. And if they want to use only NextGraph and stop using the ActivityPods server (or use the ActivityPods server for only one part of their data, which will be possible in the future), then it will all work smoothly.
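
To make the in/out mapping above concrete, here is a small sketch of what stripping and re-adding the prefix could look like. It is not NextGraph code: `toInternal`, `toExternal`, `mapTriple` and the `externalByDid` map (standing in for the stored ng:e values) are hypothetical names, and the assumption is simply that the DID is the last path segment of the HTTPS URI.

```javascript
// Hypothetical sketch of the mapping described above; not NextGraph code.
// Matches a "did:" identifier at the very end of an HTTPS URI.
const DID_SUFFIX = /^(https:\/\/.+\/)(did:[^\/\s]+)$/;

// External -> internal: drop the HTTPS prefix, keep only the DID.
function toInternal(uri) {
  const match = DID_SUFFIX.exec(uri);
  return match ? match[2] : uri; // URIs without a DID suffix are left untouched
}

// Internal -> external: return the HTTPS URI recorded for this DID (the ng:e value).
function toExternal(uri, externalByDid) {
  return externalByDid.get(uri) ?? uri;
}

// Apply a mapping function to the subject and object of a triple.
function mapTriple(triple, fn) {
  return { s: fn(triple.s), p: triple.p, o: fn(triple.o) };
}

const stored = mapTriple(
  {
    s: 'https://mypod.store/laurin/data/did:uid-1',
    p: 'ex:reference',
    o: 'https://mypod.store/laurin/data/did:uid-2'
  },
  toInternal
);
console.log(stored);
```

Moving a Pod would then only mean changing the values in `externalByDid`; the stored triples themselves stay the same.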

This ng:e will also be a simple way for SemApps to find the document linked with a LDP resource or container

Here Seb meant "to find the named graph", as documents and named graphs are the same concept in NextGraph. In any case, the named graph will always be exactly the same URI as the subject of all the triples of the RDF resource, except if you add some "foreign" triples to the document that have some other URI as their subject, for example if you want to establish facts about another resource within your document (something that SemApps cannot do for now but that Solid allows). In that case, if you want to know the named graph of such triples, you will have to look at the graph part of the quad, which is always returned to you and which will be the document URI.

So this is in order to restore the prefix https://mypod.store/laurin/data/ when taking the example from above? Or is this supposed to enable support for arbitrary mapping of LDP resources to internal did graph names? (Why) do you want the WebID to consist of human-readable parts (no did) only?

The <ng:e> predicate will be available for read access in the sparql_query API, but not for update. It will be used internally by NextGraph as explained above. It is not supposed to enable arbitrary mapping to some custom URL (made of paths with slugs, for example). We talked about that with Seb over the last few days, and it is better if we confine the use of ng:e to the prefix mechanism. In any case, I believe it is also wiser that all URIs have a UUID (a did: part) and do not try to use paths and slugs like in a URL, because those URLs are not durable. As soon as you change the name of a container or slug, the whole URL is invalidated and you need to set up a 301 redirect for it. Whereas if all your resources and containers just have a did: identifier, you can move them around and the URI doesn't change.

And then, we understood that if containers and resources are all with did:, the only URLs that remain in ActivityPods are the WebIds. And we asked ourselves if this was good or bad. As you say, we could probably also use the did: identifier for the WebId, and then, everything is harmonized.

But this brings some other problems, and we should probably open a special issue just for that, as the solution is not trivial.

NextGraph API will not be able to fetch the WebID
What does NextGraph need to fetch a remote webId for?

The webId is a bit special because it should be easy to read by the end-user, as far as I understood from Seb.
If we keep the webId as it is for now ( https://mypod.store/sro ) and an incoming activity arrives in a server with nextgraph backend, then nextgraph will not be able to understand, just by looking at the URL, what is the associated did: document, and will not be able to do the mechanism we just described above. It breaks the interoperability, because webIds are documents too, that can be edited by the end-user, and in this scenario, they won't be able to do so.
You can always store arbitrary https:// URLs in nextgraph, but you need to know in which document to put them. in the present case, we won't be able to know in which document. (document is the namedGraph, and it must be a did: that was previously created with a special API).

So... in order to solve this, Seb said, maybe NextGraph can fetch the WebID from a remote server with an HTTPS GET, and there will be a triple there that says what the did: of the document is, and then NextGraph will be happy. But in fact, NextGraph won't be happy with this solution, because we have to respect the separation of concerns. NextGraph is just the quad store. It is aware that ActivityPods is a bit special and has a special API (called headless), and it helps a bit with ng:e to map to HTTPS URLs. But it cannot start interacting with plain old HTTPS servers. This is the responsibility of ActivityPods, which does all the bridging with the HTTPS world.

In any case, it is not very efficient to have to fetch a remote resource every time you want to process and understand a URL that arrived and that you need to store. What should be done instead is that the URL itself contains all the information needed in order to process it.

This is why the idea of using the format <https://.../did:...> for WebIDs too seems more interesting to me.

So we have to address the question of what exactly is a WebID.
I understand that the main tenet of ActivityPods is "one WebID is one AP actor".
So we not only have to make the Solid WebID match the did:, but also make the AP actor match the did:.
We are in a triangle here.

To add a bit more complexity (why not!), I just realized this morning that for user identification in NextGraph, and I believe with DIDs in general, we have 2 identifiers for the user. One is the DID that uniquely identifies the user; the other is the DID that represents the document where we can store some information about that user. And I would argue, looking at your architecture, that there is a third DID, which is another document that holds some private information about the user, while the public document has some "system" information like the inbox etc.
As explained here by Seb:

The WebID/ActivityPub actor is viewable by everyone, because that’s needed for ActivityPub to work correctly. However the contacts app creates the user’s profile on another ressource (with the class as:Profile and vcard:Individual)

So... we would have 3 DIDs.

  • The DID of the user is not a document! It cannot have triples, and its format is did:ng:i:xxx, as opposed to the document DIDs, which have the format did:ng:o:xxx (i or o). It is the same ID that is used for the name of the dataset in the headless APIs I give you. It identifies the Pod, but cannot have triples as things stand. But I can maybe change that, and create a special document for this DID that will allow putting some "system" triples in it. Ideally this document would not be editable by the end-user. By the way, this special DID is the one that can be used for authentication in the classical DID mechanism, as opposed to document DIDs, which cannot do authentication.
  • If we don't make the modification that would allow putting triples in the did:ng:i:xxx, then we need a special document created specifically for the public/system triples of the WebID.
  • And then there is the private document holding other triples about the user’s profile, as the contacts app does.

We can probably reduce this to 2 only, if I implement the option to put triples in the did:ng:i:xxx.
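
For illustration only, the two DID formats mentioned above (did:ng:i:xxx for the user, did:ng:o:xxx for documents) could be told apart with a helper like the one below. `classifyNgDid` and its return shape are hypothetical, and the sketch assumes nothing about the identifier beyond the `i`/`o` marker described in this comment.

```javascript
// Hypothetical helper; only the "i" (user) / "o" (document) distinction comes from the discussion.
function classifyNgDid(did) {
  const match = /^did:ng:([io]):(.+)$/.exec(did);
  if (!match) return null; // not a NextGraph DID of either kind
  return {
    kind: match[1] === 'i' ? 'user' : 'document',
    id: match[2]
  };
}

console.log(classifyNgDid('did:ng:i:xxx'));
console.log(classifyNgDid('did:ng:o:xxx'));
```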

But that doesn't solve the main problem.

The problem is only for webId coming from other servers, because otherwise, on the local server, we know very well the mapping between DID and webId.

I proposed that a WebID URI could be of the form https://mypod.store/sro/did:ng:i:xxx. This way we have both the slug of the username in it and the information about the document in the suffix. NextGraph can save an ng:e for this WebID in order to store it without the prefix, and add the prefix back for all outputs. If ActivityPods also needs to keep all the system/public triples of that remote WebID locally in the triple store in order to operate well, then I will give you a special API to do so (to write system triples to the did:ng:i document, as I proposed above). This would be like a local cache of the WebID public information, fetched from the remote HTTPS server (by ActivityPods, not by ngd) and saved in the triple store.

Those triples should also contain a link to the private user profile URI, because it is not the same one as the WebID. I don't know if you already do that with the separation of public and private profile.

What remains is the question of the "presentability" of such a WebID with a long unique identifier in it.
I wonder when exactly in your apps and workflows do you present the webId in some GUIs?
In any case, you already have to extract the username slug from https://mypod.store/sro and obtain only "sro".
The same could be done with the WebID format I propose: it is easy to get only the "sro" part of it.
For Solid, it could also be easy to "help" applications find this username slug by using the foaf:name predicate in the WebID resource, as the Solid specs say:

An app SHOULD look in the user's WebID Profile for the foaf:name predicate, and use that as the name, if it's available.
If an app does not find a name in the user profile, it MAY fall back to using the WebID URL, or a part of it, as the username.

For ActivityPub agents, I don't know exactly how they use the WebID in order to display a nice username slug. Nor do I know how Solid-OIDC converts from the URL to a username. But I guess that what is done for now (take the first part of the URL path as the username) can easily be extended into: take the first part of the path and ignore the did part.
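
A minimal sketch of that rule ("take the first part of the path and ignore the did part"), assuming the proposed WebID shape https://mypod.store/sro/did:ng:i:xxx. `parseWebId` is a hypothetical helper, not an existing ActivityPods or Solid function.

```javascript
// Hypothetical helper for the proposed WebID format <https://host/username/did:...>.
function parseWebId(webId) {
  const segments = new URL(webId).pathname
    .split('/')
    .filter(Boolean)
    .map(segment => decodeURIComponent(segment));

  // The DID, if present, is expected to be the last path segment.
  const last = segments[segments.length - 1];
  const did = last && last.startsWith('did:') ? last : null;

  // The username slug is the first path segment that is not a DID.
  const username = segments.find(segment => !segment.startsWith('did:')) ?? null;

  return { username, did };
}

console.log(parseWebId('https://mypod.store/sro/did:ng:i:xxx'));
console.log(parseWebId('https://mypod.store/sro'));
```

Both the current and the proposed WebID forms yield the same username slug, so the display logic would not need to change.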

I guess ActivityPods users, and also Solid users or ActivityPub users, do not carry with them their full https:// webId and only use their username slug in order to connect. So the underlying format of the webId URL, is hidden from them anyway.

Finally, some ways have been researched to bind a WebID to a DID, and an actor to a DID, in both the Solid community and the AP community. It isn't exactly what we want, I guess, because it isn't expressed inside one URI. But it would be interesting and good to provide compatibility with that, so that the DIDs of NextGraph can play well with other DIDs and with the whole ecosystem of Solid and AP.

here are some links:
AP
https://socialhub.activitypub.rocks/t/nomadic-identity-for-the-fediverse/2101/65
https://arcanican.is/primer/ap-decentralization.php
https://socialhub.activitypub.rocks/t/autonomous-identity-for-the-pluriverse-based-on-oauth-oidc/3675
Solid
https://github.com/interop-alliance/life-server is an example of a Solid server that was using DID, we could have a look at what they did in regards to the WebId.
solid/specification#217

Anyway, we cannot solve this question right now. But we will work on it later, when we start implementing the integration with NextGraph.

@nikoPLP
Contributor

nikoPLP commented Jun 6, 2024

And we would definitely have to have a look at :
https://codeberg.org/fediverse/fep/src/branch/main/fep/ef61/fep-ef61.md

@Laurin-W
Contributor

Laurin-W commented Jun 6, 2024

Thanks for your thorough answer! Since we will be talking about it again later, I'll keep my comment short and will ask my open questions later. I'm very happy to keep in discussion about this :)

I wonder when exactly in your apps and workflows do you present the webId in some GUIs?

Right now, and this is a general convention in ActivityPub, users are usually identified with a handle like @laurin@mypod.store, which is resolved using WebFinger: https://mypod.store/.well-known/webfinger?resource=acct:laurin@mypod.store.
The other method we have in APods specifically are invite links with a capability URI (a basic version, no zcaps) that you can send to someone so that they can connect immediately.
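
As a sketch of the handle lookup mentioned above: the first function builds the WebFinger URL for a handle, and the second picks the actor URL out of the JSON document returned by the server. The function names are made up for this example, and the `rel: 'self'` / `application/activity+json` link is the convention used by common Fediverse servers rather than something stated in this thread.

```javascript
// Hypothetical helpers; only the URL shape comes from the example above.
function webfingerUrl(handle) {
  const match = /^@?([^@\s]+)@([^@\s]+)$/.exec(handle);
  if (!match) throw new Error(`Invalid handle: ${handle}`);
  const [, user, host] = match;
  return `https://${host}/.well-known/webfinger?resource=acct:${user}@${host}`;
}

// Find the actor URL in a WebFinger response (a JSON Resource Descriptor).
function actorUrlFromJrd(jrd) {
  const link = (jrd.links ?? []).find(
    l => l.rel === 'self' && l.type === 'application/activity+json'
  );
  return link ? link.href : null;
}

console.log(webfingerUrl('@laurin@mypod.store'));
```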

Also, I wonder if a 30X redirect from the basic webId to the webId with a did: appended would do the trick for most cases too..

@nikoPLP
Contributor

nikoPLP commented Jun 6, 2024

I would be glad to read your longer version and open questions !
I thought AP was using "follow your nose" (dereference URLs) and not so much WebFinger
mastodon/mastodon#1557
But that answers my question about when do you display username slugs to end-users in the GUIs.

Another question I have: When you receive an Activity in APods, coming from an actor that is not on your server, do you fetch the resource describing the actor (that contains the inbox URL etc)? do you store it locally?

@simonLouvet
Contributor

  • Remove 'X-SemappsUser' from Fuseki calls

Why remove 'X-SemappsUser'? How will Fuseki check WAC permissions without a user?

@nikoPLP
Contributor

nikoPLP commented Jun 6, 2024

Hello Simon,
We are going to seriously refactor SemApps and ActivityPods to prepare for integration with NextGraph and with Fuseki 5.
Many issues have been created recently about this, in particular one explaining that WAC will soon be managed outside Fuseki, directly in the middleware, without the plugin I wrote 4 years ago for Fuseki. That's the only way we can move to Fuseki/Jena 5.
As a consequence, for the time being WAC will no longer be handled by Jena, which means SPARQL queries will no longer be protected by the permission mechanism I built 4 years ago. All that work will be lost.
I know you still use SPARQL queries with WAC, and that's to your credit!
Rest assured: NextGraph can manage permissions and enforce them in SPARQL, so eventually you will have a solution for doing SPARQL with permissions, as SemApps will be based on NextGraph (or Jena 5, at your option).
If you prefer Jena 5, then you will have to reimplement WAC in Java inside Fuseki. I won't redo my plugin from 4 years ago; it's up to you whether you want to take that on.
If you choose the NextGraph backend instead, then permissions will be enforced automatically in SPARQL. The difference is that permissions in NextGraph are based on capabilities, not on ACLs.
We are still studying how the transition from WAC to capabilities will work. There may be a way to make WAC work in NextGraph, but that is not certain yet.
But capabilities are not very different: there are Read and Write permissions on each resource.

In short, these are big changes, which will be implemented towards the end of the year if all goes well with our grant applications.

It seems you were not aware that we were working on them.
I will make sure you are impacted as little as possible. You can also stay on the current version of SemApps, in which case you won't need to change anything.
But the current situation is no longer tenable: the WAC code in Jena pins us to version 3.5, with all the compaction and performance problems that entails.
So moving to Jena 5 is a good idea.
Offering NextGraph as a triple store in SemApps is also a good idea.
You will have the choice.

@nikoPLP
Contributor

nikoPLP commented Jun 8, 2024

Good news @simonLouvet: I think I have found a way to map WAC permissions to NextGraph capabilities! So, at no extra cost, you will get ACL-protected SPARQL queries if you use the NextGraph backend. It will be easy to maintain the WACs in the middleware and pass some capabilities in the SPARQL requests in order to enforce permissions in the NextGraph quad store!

@srosset81
Contributor Author

That's great to hear @nikoPLP !!!

Regarding the DID-HTTPS mapping, I understand that no HTTP fetch should be done by NextGraph API, but I'm thinking this could be done by async functions that are passed to NextGraph API on start. Something like:

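nextgraph({
  // Both callbacks would be supplied by ActivityPods; bodies left empty on purpose.
  transformDidToHttps: async didUri => { /* return the HTTPS URI for this DID */ },
  transformHttpsToDid: async httpsUri => { /* return the DID for this HTTPS URI */ }
})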

This way it would be the responsibility of ActivityPods to do the conversion and, if needed, to fetch a remote resource to find its DID. It would be much easier than having to do it on the SPARQL query or the SPARQL results.
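
To give an idea of what ActivityPods could pass in, here is a sketch of the two callbacks backed by a local mapping, with a fallback lookup for unknown HTTPS URIs. All names are hypothetical: `createUriTransformers`, `remember`, and especially `lookupRemoteDid`, which stands in for whatever mechanism would discover the DID of a remote resource (this thread has not settled on one).

```javascript
// Hypothetical sketch; `lookupRemoteDid` is an injected placeholder, not a real API.
function createUriTransformers(lookupRemoteDid) {
  const didByHttps = new Map();
  const httpsByDid = new Map();

  // Record a known pair, e.g. when a local document is created.
  const remember = (httpsUri, didUri) => {
    didByHttps.set(httpsUri, didUri);
    httpsByDid.set(didUri, httpsUri);
  };

  return {
    remember,

    // Unknown DIDs are returned unchanged.
    transformDidToHttps: async didUri => httpsByDid.get(didUri) ?? didUri,

    transformHttpsToDid: async httpsUri => {
      if (didByHttps.has(httpsUri)) return didByHttps.get(httpsUri);
      const didUri = await lookupRemoteDid(httpsUri); // may perform an HTTP fetch
      if (!didUri) return httpsUri; // no DID found: keep the HTTPS URI as-is
      remember(httpsUri, didUri);
      return didUri;
    }
  };
}
```

The returned object has the same two function names as the snippet above, so it could be handed over as-is if such an option were ever added.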

Apart from WebIDs, we will also have a similar problem with the Pods' root containers (e.g. https://mypod.store/sro/data).

If this transformation worked well, then we could even consider allowing LDP containers with custom paths (because I think if we don't allow this, it's going to be difficult to be really compatible with Solid apps).

@nikoPLP
Contributor

nikoPLP commented Jun 10, 2024

For the paths, we will see, I agree it would be cool to have them. But we have to work on it.

About the problem of the https fetches, I still need to understand better the cases.

I want to take again the example of an activity that arrives on your server from another server (a Mastodon instance, for example), when it is the first time you receive an activity from that actor.
We are asking ourselves what to do with "unknown" actors/webids.
In this example, if we receive an activity from this actor, it is because we subscribed to it. So we already "know" this actor, somehow.
So I guess we are talking about 2 other cases:

  • when a webid appears somewhere in the content of an activity or resource, and we have never seen this webid before.
  • an activity is "pushed" to us, without having subscribed first. is that possible in ActivityPub protocol ?

What do you do for now in APods, when you have to deal with "unknown" or "foreign" actors/webids ?

I guess you fetch their URL first (dereference, follow your nose) and you store locally the triples about this webid (the public profile). Am I right? This way, you know all the "system" info about this actor (inbox, outbox, etc...).

How can it be then, that a webid or actor appears somewhere in a resource/activity, without the server having known about it before?

Please explain the cases when this happens, and how you deal with them for now in APods.

@srosset81
Contributor Author

srosset81 commented Jun 11, 2024

In this example, if we receive an activity from this actor, it is because we subscribed to it. So we already "know" this actor, somehow.

No, anyone can send you a direct message through ActivityPub. It can also be a first request to follow you.

So I guess we are talking about 2 other cases:

  • when a webid appears somewhere in the content of an activity or resource, and we have never seen this webid before.
  • an activity is "pushed" to us, without having subscribed first. is that possible in ActivityPub protocol ?

There is an infinite number of cases when we can see an unknown WebID or activity.

What do you do for now in APods, when you have to deal with "unknown" or "foreign" actors/webids ?

I guess you fetch their URL first (dereference, follow your nose) and you store locally the triples about this webid (the public profile). Am I right? This way, you know all the "system" info about this actor (inbox, outbox, etc...).

No, we don't store the WebID locally. Why would we need to do that? We store contact profiles because it makes it easier to list (and filter) them, but this is only for performance reasons; we could also do without it. When we want to post an activity to someone, we fetch the WebID and find the inbox. We used to have some (Redis) caching for that, and we will probably add it back in the future, but it's not a real performance issue at the moment.
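
For readers who want to picture the "fetch the WebID and find the inbox" step, here is a simplified sketch. It is not the SemApps implementation: `createInboxResolver` is a hypothetical name, an in-memory Map stands in for the Redis cache mentioned above, and the actor is assumed to be served as ActivityStreams JSON with an `inbox` property.

```javascript
// Hypothetical sketch, not SemApps code. `fetchFn` is injectable so it can be stubbed.
function createInboxResolver(fetchFn = fetch, ttlMs = 60000, now = Date.now) {
  const cache = new Map(); // webId -> { inbox, expiresAt }

  return async function resolveInbox(webId) {
    const cached = cache.get(webId);
    if (cached && cached.expiresAt > now()) return cached.inbox;

    // Dereference the WebID / actor URI.
    const response = await fetchFn(webId, {
      headers: {
        Accept: 'application/ld+json; profile="https://www.w3.org/ns/activitystreams"'
      }
    });
    if (!response.ok) throw new Error(`Could not fetch ${webId}: ${response.status}`);

    const actor = await response.json();
    if (!actor.inbox) throw new Error(`No inbox found for ${webId}`);

    cache.set(webId, { inbox: actor.inbox, expiresAt: now() + ttlMs });
    return actor.inbox;
  };
}
```

Nothing is written to the triple store here, which matches the point above: the remote WebID is only read when needed.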

How can it be then, that a webid or actor appears somewhere in a resource/activity, without the server having known about it before?

I think you are reducing the usage of ActivityPub too much. There is really no way that we can know about all WebIDs or activities beforehand. That's why we need some way to find the DID of remote resources. My suggestion was one way to do it. Romain suggested having some kind of "mapping servers" (like IPFS). I think both solutions could work and would allow a real bridge between NextGraph and the Fediverse.

@nikoPLP
Contributor

nikoPLP commented Jun 11, 2024

Thanks for the answers. I just asked questions in order to understand the use cases, not to "reduce" anything.

About the solution to this issue, as we have said before, nextgraph needs to have all the information about the webid and the DID in the same URI.

There were several solutions that popped up recently, but few are really addressing the issue.

We cannot "scan" all the activities and replace the webid URIs in them, with a DID or append a DID to it, because we cannot know what is a webid and what is not.
So any solution based on "caching" or "fetching on demand" is not applicable.

WebIds can come from anywhere, and can have any format. Also, they can appear anywhere in the triples. There is no way to detect them.

What we are trying to deal with is a specific edge case, where:

  • the WebID is an ActivityPods WebID that has a NextGraph backend, and
  • this WebID was not created on the same ActivityPods server that is receiving it and has to deal with it.

The only way I see to deal with that case, is to have the DID be a part of all webIds that are coming from a server with NextGraph backend.

Exactly which form this URI containing both a "classical" WebID and a DID should have, I don't really care. But both pieces of information have to be included in the same URI.

I don't see yet why having a DID appended at the end of a webId is a problem for the normal functioning of both ActivityPods or any other ActivityPub server.
WebIds (actor URIs) can have any format, as far as I understood.

I don't know all the internals and specs of ActivityPods nor ActivityPub in general. So I am just asking where would there be any problem with appending the DID at the end of the WebId.

Let's consider this option please, and analyze together if it is doable or not. If it brings more problems to you, we will think again. But so far I didn't understand what exactly prevents us from appending a DID at the end of the WebId URI.

BTW, adding the DID to the URI is exactly what this spec from ActivityPub is doing :
https://codeberg.org/fediverse/fep/src/branch/main/fep/ef61/fep-ef61.md

So in any case, we can always default to that format. But I am not sure that using such scheme (ap://) would be good for all the Solid side of the problem.

For the Solid side, we can always add a foaf:name triple to help applications find the username, as already quoted:

An app SHOULD look in the user's WebID Profile for the foaf:name predicate, and use that as the name, if it's available.
If an app does not find a name in the user profile, it MAY fall back to using the WebID URL, or a part of it, as the username.

@Laurin-W
Contributor

I'm not 100% sure I understand the topic under discussion here.
So you are both saying that foreign WebIDs are not a technical issue, whether we use NextGraph or not: we just fetch them (and maybe cache).
So we are left with local WebIDs. This is a readability/UX problem (long WebIDs with unreadable did stuff) which could be handled by (1) a 30X redirect to the 'full' WebID (which will require some settings dataset for the mapping), (2) something NextGraph-specific, or (3) something specific to the APods/SemApps middleware (or a combination thereof).
Am I missing something here?


My second question: @nikoPLP, when you store a triple like <did:uid-1> <ng:e> <https://mypod.store/laurin/data/did:uid-1> in every resource / named graph, why do all my pod's resources need to have the same prefix https://mypod.store/laurin/data, and why can't that prefix be something custom, so as to allow paths like https://mypod.store/laurin/data/sub-path/did:uid-2 (at least that's how I understood your comment about the technical difficulties)? :)

@nikoPLP

nikoPLP commented Jun 11, 2024

Hello @Laurin-W thanks for joining us again on this topic! It is a bit complex, but I hope we will all be able to describe the problem accurately and find a solution that fits everybody.

About the second question: <ng:e> is just a mapping tool that internally replaces one URI with another, in and out, every time the headless API is used for SPARQL queries and updates. This is needed because internally, NextGraph can only understand plain DID URIs, while ActivityPods only understands HTTP URLs.
The mechanism is transparent to you. All the URLs you manipulate are going to start with http.
And this is why all HTTP URIs that are in fact NG documents need to have a DID at the end, so that NextGraph can find them.
You can put whatever you want in the HTTP prefix (the left part of the URI). You are not constrained. You can include slugs, paths or whatever, as long as the whole URI ends with a DID.
This can be used for containers, and for WebIds.
I will provide you with a special API to set/change the ng:e meta predicate (because it is not updatable with SPARQL).
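
To make the in/out replacement concrete, here is a rough sketch of the idea. It is not NextGraph's implementation or API: `UriMapper`, `set_mapping`, `inbound` and `outbound` are hypothetical names, the in-memory dict stands in for the stored ng:e triples, and the rule "a DID is the last path segment and starts with `did:`" is an assumption made for the example.

```python
import re

# Assumption for this sketch: the DID is the last path segment of the
# HTTP URI and starts with "did:".
DID_TAIL = re.compile(r"/(did:[^/?#]+)$")


def trailing_did(http_uri):
    """Return the DID at the end of an HTTP URI, or None if there is none."""
    match = DID_TAIL.search(http_uri)
    return match.group(1) if match else None


class UriMapper:
    """Hypothetical stand-in for the ng:e mapping (DID <-> HTTP URI)."""

    def __init__(self):
        self._http_by_did = {}

    def set_mapping(self, http_uri):
        # Plays the role of setting <did> <ng:e> <http_uri>.
        did = trailing_did(http_uri)
        if did is None:
            raise ValueError(f"no trailing DID in {http_uri}")
        self._http_by_did[did] = http_uri
        return did

    def inbound(self, uri):
        # HTTP URI -> DID on the way in; unmapped URLs pass through unchanged.
        did = trailing_did(uri)
        return did if did in self._http_by_did else uri

    def outbound(self, uri):
        # DID -> HTTP URI on the way out; anything else is left as is.
        return self._http_by_did.get(uri, uri)


mapper = UriMapper()
url = "https://mypod.store/laurin/data/sub-path/did:uid-2"
mapper.set_mapping(url)
print(mapper.inbound(url))
print(mapper.outbound("did:uid-2"))
```

Note that the prefix (`/laurin/data/sub-path/` here) plays no role in the lookup, which is why it can be chosen freely.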

That's for one part.

Now the other problem, which we haven't solved yet, is about WebId/Actor URLs.
For me it is kind of clear that those URIs should be exactly the same as anything else: you can put what you want on the left part, and on the right part, there should be a DID.
If we do that, then everything is OK: there is no problem with foreign or local WebIds, they are all treated the same. If the WebId was generated by NextGraph, it has a DID at the end. If it is a WebId that comes from another AP server that isn't using NextGraph, it doesn't have a DID part and is treated internally by NextGraph as a "web2.0 URL". In this case, it is not a resource that can be updated in the triple store (which is normal, because we shouldn't have a document about this foreign WebId anyway, except if we need to cache it, but as Seb explained, that is a separate subject from the problem we are dealing with right now). NextGraph can have both URLs and DIDs inside its quadstore. But all triples must have a named graph, and the named graph must be the DID of a document. So when you add triples, it always has to be within a DID/document.
The local WebIds (those of the local users of APods+NG) will have a document where they can store some triples about the user (inbox, outbox, etc.; I think you call this document the public or system profile). This document has a DID. It can be fetched from anywhere on the internet with the URL https://mypod.store/sro/did:ng:i:xxx that APods exposes, and that NG internally deals with properly thanks to the ng:e mechanism.
Foreign WebIds are stored in the quadstore of NG "as is". But of course, a unique WebId should not have 2 URIs. So if it is a NextGraph-backed WebId, it should have the format https://mypod.store/sro/did:ng:i:xxx, regardless of whether it is foreign or local.
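
The two rules above (a WebId either ends with a DID or is a plain "web2.0 URL"; every triple lives in a named graph that is a DID) can be sketched as follows. This is only an illustration under stated assumptions, not NextGraph code: `classify_webid` and `make_quad` are hypothetical helpers, and "starts with `did:`" is a simplification of what a valid DID is.

```python
def is_did(uri):
    """Simplified check, assumed for this sketch: a DID starts with "did:"."""
    return uri.startswith("did:")


def classify_webid(webid):
    """Return "nextgraph" if the WebId ends with a DID, else "web2.0"."""
    last_segment = webid.rstrip("/").rsplit("/", 1)[-1]
    return "nextgraph" if is_did(last_segment) else "web2.0"


def make_quad(subject, predicate, obj, graph):
    """Build a quad; subject and object may be URLs, the graph must be a DID."""
    if not is_did(graph):
        raise ValueError(f"named graph must be a DID, got {graph}")
    return (subject, predicate, obj, graph)


print(classify_webid("https://mypod.store/sro/did:ng:i:xxx"))
print(classify_webid("https://other.example/users/alice"))

# A foreign WebId is stored "as is" as an object, inside the local
# user's document (whose named graph is a DID).
quad = make_quad(
    "did:ng:i:xxx",
    "https://www.w3.org/ns/activitystreams#following",
    "https://other.example/users/alice",
    "did:ng:i:xxx",
)
print(quad)
```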
I don't think we need redirects either (except maybe if some WebIds already exist and are used out there on the fediverse without the DID, in which case we need to create redirects for them). But Seb was saying that for now, the usage of APods was mostly "internal", meaning all users stay within the same APods instance.

I find this solution simple. Please let me know what you see as problems that could prevent us from using this solution.

About the readability of the WebId, I asked several times where this could be a problem. And the answer was: we use WebFinger for all GUI-related usernames, so the WebId does not appear in GUIs. So I don't know where the problem persists.

Again, I don't know all the details of your protocols and use cases, so if I make a mistake or misunderstand something, please just explain it to me.

Also, when I was researching what others are doing regarding DID, Solid and AP, I found several specs, and we could try to implement those later on. But that's just optional. It might provide some compatibility with other DID mechanisms, but it isn't clear what the immediate benefit would be. We will have to see who is using DIDs in the Solid and AP world, and whether we want cross-compatibility with that or not.
For now, we just try to have NG work well with APods.
