Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unifying Capsule and ReconstructedCapsule into a single Capsule class. #38

Merged
merged 41 commits into from
Jan 31, 2018

Conversation

jMyles
Copy link
Contributor

@jMyles jMyles commented Jan 27, 2018

What this does:

  1. Unifies both capsule concepts ("original" and "reconstructed") under one class (Capsule)
  2. Puts Capsule back in umbral.py where @cygnusv prefers it.
  3. Fixes de/serialization methods - Capsules now properly include both sets of components when necessary.
  4. Makes most attrs and methods private.
    (and also now)
  5. Moves private PRE._decapsulate_reencrypted to public PRE.decapsulate_reencyrpted
  6. Makes PRE.decrypt accept Capsules without CFrags (openable only by Alice), or with CFrags (also openable by Bob).
  7. Updates tests to reflect the above.

@jMyles jMyles changed the title [WIP] Unifying Capsule and ReconstructedCapsule into a single Capsule class. Unifying Capsule and ReconstructedCapsule into a single Capsule class. Jan 27, 2018
@jMyles jMyles requested review from tuxxy and cygnusv January 28, 2018 02:30
Copy link
Contributor

@tuxxy tuxxy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks good, I just want some clarification on the above commented and also @cygnusv approval.

Great work!

@@ -1,9 +1,13 @@
import pytest

from umbral import umbral, keys

import random
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you not using random anywhere? Can't seem to find it used here in this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, yeah - at first I had forgotten that gen_rand existed on Point and BigNum so I was going to whip together one of my own. Good catch.

umbral/umbral.py Outdated
def reconstruct(self):
def open(self, pub_bob, priv_bob, pub_alice, force_reopen=False, pre=None):
# TODO: Raise an error here if Bob has gathered enough cFrags? See #22.
if self.contents and not force_reopen:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't see the point of the force_reopen logic here. What functionality is it adding or what problem is it solving?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we can probably nix it.

My thinking here was that, if we don't provide a place to do it, then users will invent a workaround using the private methods / attrs. However, if we expressly don't want people to be able to do this, then yeah, let's kill it.

umbral/umbral.py Outdated
self._reconstruct()
if not pre:
pre = PRE(UmbralParameters())
self._contents = pre._decapsulate_reencrypted(pub_bob, priv_bob, pub_alice, self)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we want to store the decrypted key here. Ideally, it isn't stored except when it's needed due to how Python handles memory.

So, if we store it in the object and the object is just sorta kept around for any reason, so is the key. I like having the user declared variable be the only place where the key is stored. Not sure if @cygnusv feels the same way, though.

Copy link
Contributor Author

@jMyles jMyles Jan 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, it isn't stored except when it's needed due to how Python handles memory.

To which part of python memory handling are you referring, specifically?

So, if we store it in the object and the object is just sorta kept around for any reason, so is the key. I like having the user declared variable be the only place where the key is stored.

For most attacks I can think of, this is a false sense of security and not a genuine security measure.

You don't really know when garbage collection is occurring in either case, although yes, it's probably true in most cases that a class instance with a attribute will persist as a reference longer than a simple name.

You'll find that "accept it and move on" is a popular attitude on this topic, and that even solutions which purport to scrub object in memory rely on ctypes, are implementation specific, and still end up being vulnerable to an attacker who can read process memory.

However, if this vector is your concern - and thus your threat model includes access to process memory - then we need a wholesale re-evaluation of a lot of the logic of this project. I think that, for now, we need proceed to launch with an understanding that we do not provide robustness against a compromised system.

I understand that in this age of VPS's and Spectres that it's tempting to make something that is perfectly secure against the apparent power of an attacker who can work your own hardware against you, but I think that's outside the scope of this project. Especially because this is a reference implementation, I think it's sensible to avoid performing gymnastics against more generalized threats than are specific to the tooling we're providing.

I do, however, like what I understand as your general thought process on the matter - for example, having a separate KeyStore process, with narrower entry and exit points, and having Bob store a Capsule with m-1 CFrags attached.

So, first: let's work through which part of python memory handling concerns you - this is an area over which I have a reasonably thorough understanding. Which aspect of the way python handles memory do you want to discuss?

Some resources that might help you think about this:

Copy link
Contributor

@tuxxy tuxxy Jan 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not speaking on Python specifically. It's just generally considered a good idea to keep plaintext key material out of memory as long as possible.

Considering that the reconstructed capsule is there, it should be possible for the user to provide their private key whenever they need to actually access it. This is already a "secure cache" (sorta) because the key will not be stored in the clear.

This keeps the plaintext key out of the memory as much as possible and leaves it up to the user to maintain that themselves. My ideal method would be to not cache the key in plaintext, but to cache it as ciphertext -- if we have to.

Since that is a different engineering challenge and out of the scope of Umbral itself, I don't think it should be implemented now or even here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I suppose that makes sense.

umbral/umbral.py Outdated
h = hash_to_bn([e, v], params)

return s * params.g == v + (h * e)

def attach_cfrag(self, cfrag: CapsuleFrag):
self.cfrags[cfrag.bn_kfrag_id] = cfrag

def reconstruct(self):
def open(self, pub_bob, priv_bob, pub_alice, force_reopen=False, pre=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure we want to provide such a method? I like the name (open), but not sure if I like what's happening here.

I do prefer using the simple API to do these sorts of things now and would prefer to keep functions that handle these kinds of things there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a good question. Ideally, the "simple API" calls the one-and-only public read method on this class, right? I was trying to make open that function. Then additionally, I also like the idea of caching the result so that the API doesn't have to concern itself with that.

So the question is: is this the one obvious way to do this? Or is there a better one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if we provide an open does that mean there should be a close?

Because we can't reliably provide a closing method to clear memory and handle all the sensitive components in Python. My thinking is that we should just urge the user to use the API as provided. If we see evidence of users doing otherwise in the future, maybe make an adjustment then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you are thinking of removing this and just promoting _decapsulate_reencryoted to public?

I'm not entirely opposed to that, but there are a couple of ways in which I think this approach is superior:

  1. it allows the user (or perhaps us, in a future API revision) to separate the decapsulation from accessing the contents. This seems like kind of a big deal to me, because they may want to fail fast on the decapsulation, but not access the contents until the end of the loop or whatever.
  2. it keeps the mechanics of decapsulation in a private layer, while putting the access in a public function. I think this is just good engineering generally.

As for the "close" thing: that occurred to me too, but I actually think that it serves to obfuscate, rather than clarify, the metaphor. Once you've dug up a time capsule and removed the contents, do you expect the capsule to provide a facility for closing it again?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually really like the idea of using decapsulate_reencrypted as a public method.

As for your first point, decapsulation is the access of the contents. We should not confuse this to the user who might think, "Oh, I haven't decrypted it, I have just decapsulated it."

For your second, it is a good point, but I don't think hiding decapsulating from the user is going to do us any good. Users are already comfortable with decryption, decapsulating is pretty similar and I don't think is too hard to grasp the concept of.

Again, I really want to stress to users that decapsulating is decryption and they should use this process similarly. Accessing the key material is a big deal like that.

Copy link
Contributor Author

@jMyles jMyles Jan 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, this is all making sense to me.

Here's what is still weird:

  • Why do we have both PRE._decapsulate_reencrypted and PRE.decrypt_reencrypted ?

I understand that the former just opens the capsule and peers inside, while the latter also decrypts whatever is inside, but it feels like a mixed metaphor to me.

I think it makes sense to:

  • Move the latter to Capsule
  • Rename it to something more straightforward, like, get_contents
  • Rename PRE._decapsulate_reencrypted to PRE.decapsulate_reencrypted

Then we have a clean, simple API in PRE, and a good clean metaphor for Capsule.

How's that sound?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to see what that looks like, I just pushed it in f3be38b. For my money, I think it's getting pretty close to being a go.

umbral/umbral.py Outdated


class ReconstructedCapsule(object):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

Copy link
Member

@cygnusv cygnusv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, great work! My only issue is with the Capsule.decrypt method

raise ValueError(
"Can't make a Capsule from nothing. Pass either Alice's data (ie, point_eph_e) or Bob's (e_prime). \
Passing both is also fine.")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bn_sig is also necessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so you want the test to be for all three of either the original components or reconstructed components?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's perfect :)

umbral/umbral.py Outdated
Reconstructs the Capsule contents from the attached CFrags,
opens the Capsule and returns what is inside.

This will often by a symmetric key.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"by" --> "be"

umbral/umbral.py Outdated

def decrypt(self, recp_priv_key: UmbralPrivateKey,
sender_pub_key: UmbralPublicKey,
ciphertext: bytes, pre):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The concept of "sender" is not right here. Anyone that knows the public key could be a potential sender, without being possible to know who he/she was. What we need here is the public key of the original owner (i.e., Alice). I'd propose something like original_pub_key, but since we seem to have a convention with names, it could be alice_pub_key .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I absolutely agree. I added "naming conventions" to homecoming discussions. There are a lot of names all over this project that drive me crazy.

Rather than try to fix them all (I did fix a few), I was trying to get this PR with the current names and then fix them later. But maybe it does make sense to fix these now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine. As long as we do it at some moment, it's OK.
I personally liked the "Alice" name. It's consistent with PRE literature, where re-encryption is usually illustrated from Alice to Bob. Other more technical names (e.g., owner, delegator, etc.) are much uglier, IMHO.

umbral/umbral.py Outdated
Opens the capsule and gets what's inside.

We hope that's a symmetric key, which we use to decrypt the ciphertext
and return the resulting cleartext.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it seems that now the decrypt operation is in the Capsule, and this class is used for both original and reconstructed capsules, then this method should work for both cases. Right now it's really strange that if you want to decrypt a re-encrypted ciphertext you go through Capsule.decrypt, but to decrypt an original ciphertext you use pre.decrypt.

Having said that, I really don't like that the decrypt method is offered by the capsule, it's really weird for several reasons, but at this point, I don't want to halt development for an issue like that. If @michwill and @tuxxy are OK with this, then it's fine by me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to add something to the last paragraph: note the asymmetry between encrypting through the PRE.encrypt interface, but decrypting through Capsule.decrypt. Of course we can do it like that, but boy it's weird.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm somewhat in agreement here.

But understand: it was the original asymmetry between PRE.encrypt and PRE.decrypt which led me to do this. Notice that, if you go all the way through the lifecycle of PRE'd content using the API, and try to use PRE.decrypt at the very end, you'll get "Generic Umbral Error," because PRE.decrypt expects Alice's private key, not Bob's.

In fact, for my part, I find PRE.encrypt more suspect than Capsule.decrypt. If I had to choose, I'd opt for moving encrypt to Capsule and making it a classmethod.

If we do decide to move Capsule.decrypt back to PRE, perhaps it's a good idea to combine it with the current decrypt method; making it so that it can take either a capsule with original components or one with reconstructed components.

However, we have the odd situation where we sometimes have both - in fact, so far, you've suggested that we'll always have both (ie, #37).

But I don't really understand how that works with the from_bytes method - if you pass the bytestring with reconstruct=True (just as previously with ReconstructedCapsule.from_bytes), you end up with a Capsule with only reconstructed components. What then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do decide to move Capsule.decrypt back to PRE, perhaps it's a good idea to combine it with the current decrypt method; making it so that it can take either a capsule with original components or one with reconstructed components.

I think that's completely fine. But as you mention in your last paragraph, that only works when either you decrypt an original capsule, or a reconstructed capsule with the original components. I think we should only allow these 2 options. The other possibility (i.e., reconstructed capsule without original components) doesn't allow for validity checking, and with the small change I have in mind, decryption wouldn't even work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so does that mean we need to rethink serialization?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, didn't think about that. Yes, I think we should. Previous approach to serialization kind of assumed that we'd have two different objects: a capsule and a reconstructed capsule. With the current approach, which is only one object that it's "enriched" with cFrags until it can "reconstruct" the additional elements (note: I'm starting to think that "reconstruct" it's not a good metaphor here, but I don't want to open that door again!), then obviously serialization/deserialization should handle the original and new components.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, "reconstruct" is starting to wither, fortunately.

@michwill had a discussion yesterday about this: I'm now picturing a Capsule with a bunch of little slots, and when you plug enough CFrags into them, it kind of glows an orange-ish color and a little pedestal rises up, with a little fingerprint scanner on it for Bob to use (ie, transmit his private key).

How about we start to try the word "activated" instead of "reconstructed"?

So, we now want to serialize all six components (three original, three activated) right? Do you have any opinions about how to do that or shall I just give it a crack?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we start to try the word "activated" instead of "reconstructed"?

That could work, although for the moment, and given that we are currently used to "reconstruct", I suggest to stick to it for a while, and change it when we agree on new naming conventions (which it seems you will discuss at the homecoming). I simply don't want us to get distracted with semantic discussions right now.

@michwill had a discussion yesterday about this: I'm now picturing a Capsule with a bunch of little slots, and when you plug enough CFrags into them, it kind of glows an orange-ish color and a little pedestal rises up, with a little fingerprint scanner on it for Bob to use (ie, transmit his private key).

That's a cool idea for a NuCypher KMS video :-)

@tuxxy
Copy link
Contributor

tuxxy commented Jan 30, 2018

What's the status on this? Some inconsistencies were pointed out by @cygnusv. Mostly dealing with the Capsule.decrypt method.

He is bringing up some of the points that I brought up as well, but I didn't even think about the asymmetry between PRE.encrypt and Capsule.decrypt.

How shall we proceed?

@jMyles
Copy link
Contributor Author

jMyles commented Jan 30, 2018

@tuxxy: I've been working on it this morning. Stand by.

Copy link
Contributor

@tuxxy tuxxy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐧

capsule: Capsule,
opener_private_key: UmbralPrivateKey,
capsule_maker_pub_key: UmbralPublicKey,
pre):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opener_private_key is OK, but capsule_maker_pub_key it's not, since in principle anyone (even if they don't have any keys) can be a capsule maker. Anyway, this is just a comment on naming. Let's simply put it in our TODO list.

Copy link
Member

@cygnusv cygnusv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

@tuxxy tuxxy merged commit 6bd90f4 into nucypher:master Jan 31, 2018
@tuxxy tuxxy mentioned this pull request Feb 1, 2018
@fjarri fjarri mentioned this pull request Sep 11, 2020
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants