Refactor core data model section #67
May have more comments later, just getting these in to get things rolling.
index.html
Outdated
<p>
A <a>subject</a> is an <a>entity</a> about which <a>claims</a> may be made.
A <a>claim</a> is statement made by an <a>entity</a> about a <a>subject</a>.
"made by an entity" is not in the model for a claim (see the diagram) so we should omit it, i.e. "A claim is a statement about a subject." This has been one of the challenges in mapping the "claim" language to implementations. In implementations, "claim" is used as a specific property of the vocabulary. It is used in a statement about a credential:
The credential C "includes the claim" that subject S is related to object O via property P.
So that looks like this as a graph:
[ C ] --claim--> [ S ] --P--> [ O ]
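As a rough illustration of the graph above, the two edges can be written as plain labelled triples. This is only a sketch: the `Edge` type and the `claim_edges` helper are hypothetical names introduced here, not part of the spec or of any implementation mentioned in this thread.

```python
from typing import List, NamedTuple


class Edge(NamedTuple):
    """One labelled edge in the graph: source --label--> target."""
    source: str
    label: str
    target: str


def claim_edges(credential: str, subject: str, prop: str, obj: str) -> List[Edge]:
    """Return the edges for: credential 'includes the claim' that subject --prop--> obj."""
    return [
        Edge(credential, "claim", subject),  # [ C ] --claim--> [ S ]
        Edge(subject, prop, obj),            # [ S ] --P--> [ O ]
    ]


edges = claim_edges("C", "S", "P", "O")
for edge in edges:
    print(f"[ {edge.source} ] --{edge.label}--> [ {edge.target} ]")
```

Note that no "made by" edge appears anywhere: the issuing entity is not part of the claim itself in this picture.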
Fixed in e1f1b82
index.html
Outdated
<img style="margin: auto; display: block;"
width="75%" src="diagrams/claim-extended.svg">
<figcaption style="text-align: center;">
Multiple claims may be combined to express a more complex mathematical graph.
"mathematical" is unnecessary
What about mathemagic instead?
Fixed in e1f1b82
index.html
Outdated
A <a>credential</a> is set of one or more <a>claims</a> about a <a>subject</a>.
It typically includes am identifier to uniquely identify the
credential. Credential metadata may also be included to express things like
when the credential expries. A digital signature is almost always appended by
expries => expires
Fixed in e1f1b82
index.html
Outdated
<p>
A <a>credential</a> is set of one or more <a>claims</a> about a <a>subject</a>.
It typically includes am identifier to uniquely identify the
am => an
Fixed in e1f1b82
Please provide a review of this PR before next week's VCWG meeting so we can discuss on the call. You can preview the changes to section 3: Core Data Model here: http://manu.sporny.org/tmp/vc-data-model/ The old Data Model section can be found here: https://www.w3.org/TR/verifiable-claims-data-model/#data-model /cc @ChristopherA @kimdhamilton @stonematt @jandrieu @burnburn @dlongley
Related to the conversation with @dlongley on issue #66, we still have ambiguity about credentials, largely from language relying on the assumption that credentials contain a single claim. My understanding is that credentials as issued are not what is presented to inspectors-verifiers in a profile. In particular, there is a strong desire that holders can selectively disclose claims without revealing entire credentials as issued. This is captured in these requirements:
Defining a profile as a set of credentials does not clearly provide the affordance required for minimal disclosure. Yes, a profile may contain credentials from multiple issuers, but there is no language describing how a single claim from a multi-claim credential is included in a profile. I realize this may be a transitional artifact from the shift from "verifiable claims" as the core focus of the spec to include "verifiable credentials" (a new term). However, there remains some work to align how claims are extracted from credentials and packaged into profiles. As I understand the rewrite, it may make sense to define credentials as "a set of one or more claims which are collectively verifiable". In other words, I think what we are trying to say is that what makes a claim verifiable is the meta-data in the credential in which it is issued. This is a shift from earlier language where each claim could have its own signature, but it's not an unreasonable evolution of the data model. However, what is missing is the notion of slicing & dicing a given credential into something that can be combined into a profile. Previously, we had:
Now we have
Yes, every issuer could break their credentials into multiple, independent subsets of claims, but this is counter to the approach of using zero-knowledge proofs to generate independently verifiable claims from a credential containing many claims. These independent claims would then be packaged into a profile for presentation to an inspector-verifier. My biggest concern is not that we define how zero-knowledge proofs work or require them for credentials. Rather, I think we need to find terminology and a data model that allows the kind of remixing of claims that such technologies enable. The current language--as I attempt to follow the gestalt of the proposed changes--suggests that a credential is the only independently verifiable payload and that any given claim is only verifiable when the credential is verifiable. We need work to clarify the data model if that's correct. We also still need to describe how we handle selective disclosure.
Example 6: A simple verifiable profile is clearly NOT consistent with the previous definition:
This does not appear to be a collection of credentials. Neither does Example 11: A simple verifiable profile
Finally, the section 7.2.2 Expressing an Verifiable Credential in JSON-LD actually describes a Verifiable Claim:
This still needs some work. The last item feels mostly like simple editorial (completing the shift from verifiable claims through to verifiable credentials). However, the Verifiable Profiles are a bigger issue. Hopefully illustrating a profile that is actually a collection of credentials will illuminate some of my concerns about selective disclosure in my previous comment.
Yes, we need to add language and examples showing how this could be done. We should show an example with atomized credentials where there is no "editing" to be done and an example where properties from the claim section are omitted -- with a note indicating that the signature type on the credential must support this method of selective disclosure.
Hey @jandrieu, you wrote a lot :) - but in general, I agree w/ what you're asking for. We need to clean up the prose to align it. To be clear, your new view on the spec has always been our view of the spec... we have just failed to explain it properly in the spec text (you'll see the implementations do track what you describe as your "Now we have" view of things). I'm also noting that there is some miscommunication based on a gap in understanding the technical implementation specifics - that is, things like: you don't need zero knowledge proofs to do selective disclosure, and you shouldn't allow every credential to be selectively disclosed (sometimes two attributes MUST be bound together by the issuer). So, there is much more to talk about, but I do think we can clean up many of the misleading examples you pointed out above. Thanks for the comments, I'll make this my next priority after we get the data model section sorted out. If any other editor or WG member wants to jump in and take a crack at it, that'd speed things up.
Responding to specifics raised by @jandrieu below...
I think this is an unfortunate misunderstanding. The data model was NEVER ONLY "credentials contain a single claim"... it was always credentials contain one or more claims.
This is correct.
There is a desire to support it. The "strongness" of the desire is dependent on the use case. I should also point out that I am very skeptical that we can implement this at scale in a way that is easy to integrate into most developer workflows. I know others in the group are claiming that it's possible, but I have yet to see a simple zero knowledge proof approach to address this problem. There is too much hand waving going on related to how this will work in the ecosystem, so we really need to get this sorted sooner rather than later.
Yes, and this is going to be a point of contention in the next several months as there are several ways to do this, ranging from selective disclosure of credentials, to redacted signatures, to zero knowledge proofs.
In reality, we do not have a single technical proposal where one extracts claims from credentials and packages them into a profile. I expect Evernym to submit something at some point related to this.
Yes, that is a correct definition.
Yes.
Hmm, you can have a credential with one claim that has a signature. So, in that case, does the claim have its own signature? I'm picking at the edges of your definition to see if you've thought those edges through.
I'd argue that slicing and dicing probably belongs more in a particular signature suite than it does in the main spec. I admit that this should probably be a topic of debate for the next several weeks/months.
This was always the case, but I can understand why others may have come to different conclusions since the ground has been shifting under us for a while now. My hope is that the ground is solidifying a bit and we can now focus on what you wrote above as the central message in the spec.
This is where I start to get very concerned about what is meant by "zero-knowledge proof" and "independent claims packaged into a profile". We need to be very specific about how we're achieving this from a technical perspective, as there are many scenarios where this technology just does not exist, or if it's built incorrectly, it completely violates one's privacy by just flat out not working when deployed at scale. That said, we do need to start teasing out the appropriate language and specific way we're modeling this. Digital Bazaar has a few proposals on how one can support selective disclosure w/o using ZKPs, but do admit that we can't do it in a way that's fully anonymous (signature regenerated on every transmission) (due to the RSA/Ed25519 signature mechanisms we're proposing).
+1, I think we have that right now, but clearly we're not explaining that to an acceptable degree. I've also heard the Evernym folks state that they don't think we're there yet, so this is a discussion all of us have to have.
What you say above is /mostly/ correct. One minor clarification:
Thanks for the feedback @jandrieu, that you're able to tease all of this out from the mess of prose in the current spec is impressive. :)
I see where I missed the detail. Credentials were defined as one or more claims and then in the Verifiable Claims Model, credentials could be verified by adding a signature. I totally read that whole section as adding a signature to a claim since it was in the Verifiable Claims Model section. Yet, clearly stated both there and elsewhere, was that the signature is not added to the claim--as I expected--but added to the Credential. That was the core of the disconnect. If I understand your point, allowing each claim to have its own signature is computationally/size-wise untenable. However, it would have been the simplest slice & dice solution. Since it seemed like such a reasonable interpretation, that's how I've always thought it was intended. My mistake. FWIW, I definitely don't see ZKPs as the only way to atomize claims and I definitely agree not all claims are appropriate for atomization. Also FWIW, the Verifiable Profile example and definition is woefully inadequate. I like the scope of use and the signature by the holder, but that is not yet present in either the prose definition or the data model example. As long as it is optional for use cases that don't need that level of assurance, we're good. I talk about this more in issue #68. Clearly the profile has quite a bit more than just a set of credentials. Glad to see where I was wrong on this one. I had totally misread that and ran off with a completely different mental model.
I think we're going to continue to be confused about issues like this without examples. We'll need to get some examples into the spec to demonstrate what is presented in order to be sure we aren't miscommunicating.
Yes -- and this disconnect and confusion in the spec is what I'm referring to in #66. @jandrieu -- please reread the OP for #66 and see if it makes more sense now.
A couple of naming issues we need to address: First off we have to be very careful about the use of the term selective disclosure. It should only refer to the cryptographic methods that are used by tools like U-Prove and Identity Mixer and not for other uses. Instead, we should be using the term data minimization. Of the various signature standards that we have been talking about I personally believe the low hanging fruit is the hash tree signature method which is a form of data minimization rather than a form of selective disclosure. In fact we may want in many cases to require it for multi-claims in entity profiles. I was just looking in the w3c-dvcg repo README and I don't see that spec proposal listed.
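To make the "hash tree" idea concrete, here is a minimal sketch assuming a plain binary Merkle tree over the claims, where the issuer would sign only the root. The thread does not specify the actual hash tree signature method, so the function names, the byte encoding of claims, and the odd-level padding rule are all assumptions made for illustration. At least one claim is assumed.

```python
import hashlib
from typing import List, Tuple


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: List[bytes]) -> List[List[bytes]]:
    """Build every level of the tree, from leaf hashes up to the single root."""
    levels = [[h(b"\x00" + leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        current = levels[-1]
        if len(current) % 2:
            current = current + [current[-1]]  # assumption: pad by repeating the last node
        levels.append([h(b"\x01" + current[i] + current[i + 1])
                       for i in range(0, len(current), 2)])
    return levels


def prove(levels: List[List[bytes]], index: int) -> List[Tuple[bytes, bool]]:
    """Collect (sibling hash, node_is_left) pairs on the path from one leaf to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # last node of an odd level is paired with itself
        proof.append((level[sibling], index % 2 == 0))
        index //= 2
    return proof


def verify_leaf(leaf: bytes, proof: List[Tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from one disclosed claim plus its proof."""
    node = h(b"\x00" + leaf)
    for sibling, node_is_left in proof:
        node = h(b"\x01" + node + sibling) if node_is_left else h(b"\x01" + sibling + node)
    return node == root


claims = [b"name=Alice", b"age=30", b"degree=BSc"]  # hypothetical claim encoding
levels = build_levels(claims)
root = levels[-1][0]        # the value an issuer would sign
proof = prove(levels, 1)    # the holder discloses only the "age" claim
print(verify_leaf(claims[1], proof, root))
```

In this sketch the holder reveals one claim and a handful of sibling hashes rather than the whole credential. Low-entropy claims would still need per-claim salts to resist guessing, and the signed root stays constant across presentations, so this minimizes data without preventing correlation.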
This is the one we have right now to do data minimization via redaction: https://w3c-dvcg.github.io/lds-redaction2016/ It doesn't use hash trees, but rather uses hash lists to redact specific claims. This particular signature mechanism is not resistant to tracking, as the signature on the credential doesn't change (thus providing a unique, trackable identifier), even though the minimized data transmitted does change (attributes are replaced w/ salts+hashes).
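The general shape of hash-list redaction can be sketched as follows. This is not the lds-redaction2016 algorithm, only an illustration of the idea described above: every name here is hypothetical, the JSON-based hashing is an assumption, and an HMAC stands in for the issuer's real digital signature purely to keep the example self-contained.

```python
import hashlib
import hmac
import json
import os


def claim_hash(name: str, value, salt: str) -> str:
    """Salted hash of one claim; the salt keeps hidden values from being guessed."""
    return hashlib.sha256(json.dumps([salt, name, value]).encode("utf-8")).hexdigest()


def sign(hash_list, key: bytes) -> str:
    """Stand-in for a signature: covers only the list of claim hashes."""
    return hmac.new(key, "\n".join(hash_list).encode("utf-8"), hashlib.sha256).hexdigest()


def issue(claims: dict, key: bytes) -> dict:
    disclosed = {name: {"value": value, "salt": os.urandom(16).hex()}
                 for name, value in claims.items()}
    hash_list = sorted(claim_hash(n, e["value"], e["salt"]) for n, e in disclosed.items())
    return {"disclosed": disclosed, "hashList": hash_list, "signature": sign(hash_list, key)}


def redact(credential: dict, hidden: set) -> dict:
    """Drop the value and salt of hidden claims; their hashes stay in the signed list."""
    kept = {n: e for n, e in credential["disclosed"].items() if n not in hidden}
    return {**credential, "disclosed": kept}


def verify(credential: dict, key: bytes) -> bool:
    if not hmac.compare_digest(credential["signature"], sign(credential["hashList"], key)):
        return False
    return all(claim_hash(n, e["value"], e["salt"]) in credential["hashList"]
               for n, e in credential["disclosed"].items())


issuer_key = b"demo-key"  # hypothetical; a real issuer would use an asymmetric key pair
credential = issue({"name": "Alice", "age": 30}, issuer_key)
redacted = redact(credential, {"age"})
print(verify(redacted, issuer_key))
print(redacted["signature"] == credential["signature"])
```

The last line shows the tracking concern raised above: the redacted credential still verifies, but its signature is identical to the original, so every presentation carries the same identifier.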
In general, looks excellent. I had only 1 issue, which I think is covered by @jandrieu's and @msporny's discussion. Here is the verifiable profile definition:
Example 6 doesn't have a signature, which is allowed by the above definition (i.e. "typically" not always counter-signed). But the definition says it is tamper-resistant. I can see how this is achievable, i.e., if this is in a content-addressable store, but we may need some more clarity about the expectations for tamper-resistance?
I'm still getting up to speed in the Verifiable Claims data model, so I apologize if this is off course.
No, you're right, that example is bad. Perhaps I should align the examples in the spec for this PR (I was going to do it later) to match the new data model section as those examples are CRAZY old and are clearly causing problems.
On Mon, Aug 7, 2017 at 2:58 PM, Christopher Allen ***@***.***> wrote:
A couple of naming issues we need to address:
First off we have to be very careful about the use of the term selective
disclosure. It should only refer to the cryptographic methods that are used
by tools like U-Prove and Identity Mixer and not for other uses.
Why? If I only want to disclose a subset, the term "selective disclosure"
intuitively sounds just right. The term "data minimization" sounds like we
are creating a binary serialization, a "compiled" version.
… Instead, we should be using the term data minimization.
Of the various signature standards that we have been talking about I
personally believe the low hanging fruit is the hash tree signature method
which is a form of data minimization rather than a form of selective
disclosure. In fact we may want in many cases to require it for
multi-claims in entity profiles.
I was just looking in the w3c-dvcg repo README and I don't see that spec
proposal listed.
Thanks for the review all, merging. @jandrieu and @ChristopherA - could you please raise issues related to the examples and graphics and I'll try to work through those in the next PRs.
This is an initial refactoring of the core data model section in an attempt to clarify a number of confusions related to the data model.
You can preview the changes to section 3: Core Data Model here:
http://manu.sporny.org/tmp/vc-data-model/