|
| 1 | +Private Data |
| 2 | +============ |
| 3 | + |
| 4 | +.. note:: This topic assumes an understanding of the conceptual material in the |
| 5 | + `documentation on private data <private-data.html>`_. |
| 6 | + |
| 7 | +Private data collection definition |
| 8 | +---------------------------------- |
| 9 | + |
| 10 | +A collection definition contains one or more collections, each having a policy |
| 11 | +definition listing the organizations in the collection, as well as properties |
| 12 | +used to control endorsement and, optionally, whether the data will be purged. |
| 13 | + |
| 14 | +The collection definition gets deployed to the channel at the time of chaincode |
| 15 | +instantiation. If using the peer CLI to instantiate the chaincode, the |
| 16 | +collection definition file is passed to the chaincode instantiation |
| 17 | +using the ``--collections-config`` flag. If using a client SDK, check the `SDK |
| 18 | +documentation <https://fabric-sdk-node.github.io/>`_ for information on providing the collection |
| 19 | +definition. |
| 20 | + |
| 21 | +Collection definitions are composed of five properties: |
| 22 | + |
| 23 | +* ``name``: Name of the collection. |
| 24 | + |
| 25 | +* ``policy``: Defines the organization peers allowed to persist the collection |
| 26 | + data, expressed using the ``Signature`` policy syntax, with each member being |
| 27 | + included in an ``OR`` signature policy list. |
| 28 | + |
| 29 | +* ``requiredPeerCount``: Minimum number of peers that the endorsing peer must |
| 30 | + successfully disseminate private data to before the peer signs the |
| 31 | + endorsement and returns the proposal response back to the client. When |
| 32 | + ``requiredPeerCount`` is ``0``, it means that no distribution is **required**, |
| 33 | + but there may be some distribution if ``maxPeerCount`` is greater than zero. A |
| 34 | + ``requiredPeerCount`` of ``0`` would typically not be recommended, as it could |
| 35 | + lead to loss of private data. Typically you would want to require at least some |
| 36 | + distribution of the private data at endorsement time to ensure redundancy of the |
| 37 | + private data on multiple peers in the network. |
| 38 | + |
| 39 | +* ``maxPeerCount``: For data redundancy purposes, the number of other peers |
| 40 | + that the current endorsing peer will attempt to distribute the data to. If an |
| 41 | + endorsing peer becomes unavailable between endorsement time and commit time, |
| 42 | + other peers that are collection members but have not yet received the private |
| 43 | + data will be able to pull the private data from the peers the private data was |
| 44 | + disseminated to. If this value is set to ``0``, the private data is not |
| 45 | + disseminated at endorsement time, forcing private data pulls on all authorized |
| 46 | + peers. |
| 47 | + |
| 48 | +* ``blockToLive``: Represents how long the data should live on the private |
| 49 | + database in terms of blocks. The data will live for this specified number of |
| 50 | + blocks on the private database and after that it will get purged, making this |
| 51 | + data obsolete from the network. To keep private data indefinitely, that is, to |
| 52 | + never purge private data, set the ``blockToLive`` property to ``0``. |
| 53 | + |
| 54 | +Here is a sample collection definition JSON file, containing an array of two |
| 55 | +collection definitions: |
| 56 | + |
| 57 | +.. code:: json |
| 58 | +
|
| 59 | +  [ |
| 60 | +    { |
| 61 | +      "name": "collectionMarbles", |
| 62 | +      "policy": "OR('Org1MSP.member', 'Org2MSP.member')", |
| 63 | +      "requiredPeerCount": 0, |
| 64 | +      "maxPeerCount": 3, |
| 65 | +      "blockToLive": 1000000 |
| 66 | +    }, |
| 67 | +    { |
| 68 | +      "name": "collectionMarblePrivateDetails", |
| 69 | +      "policy": "OR('Org1MSP.member')", |
| 70 | +      "requiredPeerCount": 0, |
| 71 | +      "maxPeerCount": 3, |
| 72 | +      "blockToLive": 3 |
| 73 | +    } |
| 74 | +  ] |
| 75 | +
|
| 76 | +This example uses the organizations from the BYFN sample network, ``Org1`` and |
| 77 | +``Org2``. The policy in the ``collectionMarbles`` definition authorizes both |
| 78 | +organizations to access the private data. This is a typical configuration when the |
| 79 | +chaincode data needs to remain private from the ordering service nodes. However, |
| 80 | +the policy in the ``collectionMarblePrivateDetails`` definition restricts access |
| 81 | +to a subset of organizations in the channel (in this case ``Org1``). In a real |
| 82 | +scenario, there would be many organizations in the channel, with two or more |
| 83 | +organizations in each collection sharing private data between them. |
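As a rough illustration of the five properties, the following Go sketch parses a collection definition file and flags settings that the text above warns about. The ``CollectionConfig`` type and the ``checkCollections`` helper are hypothetical names introduced here, not part of Fabric, and the rule that ``maxPeerCount`` must not be smaller than ``requiredPeerCount`` is treated as a sanity check assumed for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CollectionConfig mirrors the five properties of a collection definition.
// The type itself is hypothetical; only the JSON field names come from the text.
type CollectionConfig struct {
	Name              string `json:"name"`
	Policy            string `json:"policy"`
	RequiredPeerCount int    `json:"requiredPeerCount"`
	MaxPeerCount      int    `json:"maxPeerCount"`
	BlockToLive       uint64 `json:"blockToLive"`
}

// checkCollections (hypothetical helper) parses the JSON array and returns
// human-readable warnings. It returns an error for definitions that are
// malformed or fail the assumed sanity checks.
func checkCollections(raw []byte) ([]string, error) {
	var configs []CollectionConfig
	if err := json.Unmarshal(raw, &configs); err != nil {
		return nil, fmt.Errorf("parsing collection definitions: %w", err)
	}
	var warnings []string
	for _, c := range configs {
		if c.Name == "" {
			return nil, fmt.Errorf("collection definition without a name")
		}
		if c.RequiredPeerCount < 0 || c.MaxPeerCount < c.RequiredPeerCount {
			return nil, fmt.Errorf("%s: maxPeerCount (%d) is below requiredPeerCount (%d)",
				c.Name, c.MaxPeerCount, c.RequiredPeerCount)
		}
		if c.RequiredPeerCount == 0 {
			// The text advises against this because private data could be lost.
			warnings = append(warnings,
				fmt.Sprintf("%s: requiredPeerCount is 0, no dissemination is required", c.Name))
		}
	}
	return warnings, nil
}

func main() {
	sample := []byte(`[
	  {"name": "collectionMarbles", "policy": "OR('Org1MSP.member', 'Org2MSP.member')",
	   "requiredPeerCount": 0, "maxPeerCount": 3, "blockToLive": 1000000},
	  {"name": "collectionMarblePrivateDetails", "policy": "OR('Org1MSP.member')",
	   "requiredPeerCount": 0, "maxPeerCount": 3, "blockToLive": 3}
	]`)
	warnings, err := checkCollections(sample)
	if err != nil {
		fmt.Println("invalid:", err)
		return
	}
	for _, w := range warnings {
		fmt.Println("warning:", w)
	}
}
```

Run against the sample file, this sketch would warn about both collections, since each sets ``requiredPeerCount`` to ``0``.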
| 84 | + |
| 85 | +How private data is committed |
| 86 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 87 | + |
| 88 | +When authorized peers do not have a copy of the private data in their transient |
| 89 | +data store, they will attempt to pull the private data from another authorized |
| 90 | +peer, *for a configurable amount of time* based on the peer property |
| 91 | +``peer.gossip.pvtData.pullRetryThreshold`` in the peer configuration ``core.yaml`` |
| 92 | +file. |
| 93 | + |
| 94 | +.. note:: The peers being asked for private data will only return the private data |
| 95 | + if the requesting peer is a member of the collection as defined by the |
| 96 | + policy. |
| 97 | + |
| 98 | +Considerations when using ``pullRetryThreshold``: |
| 99 | + |
| 100 | +* If the requesting peer is able to retrieve the private data within the |
| 101 | + ``pullRetryThreshold``, it will commit the transaction to its ledger |
| 102 | + (including the private data hash), and store the private data in its |
| 103 | + state database, logically separated from other channel state data. |
| 104 | + |
| 105 | +* If the requesting peer is not able to retrieve the private data within |
| 106 | + the ``pullRetryThreshold``, it will commit the transaction to its blockchain |
| 107 | + (including the private data hash), without the private data. |
| 108 | + |
| 109 | +* If the peer was entitled to the private data but it is missing, then |
| 110 | + the peer will not be able to endorse future transactions that reference |
| 111 | + the missing private data. A chaincode query for a key that is missing will |
| 112 | + be detected (based on the presence of the key’s hash in the state database), |
| 113 | + and the chaincode will receive an error. |
| 114 | + |
| 115 | +Therefore, it is important to set the ``requiredPeerCount`` and ``maxPeerCount`` |
| 116 | +properties large enough to ensure the availability of private data in your |
| 117 | +channel. For example, if each of the endorsing peers becomes unavailable |
| 118 | +before the transaction commits, the ``requiredPeerCount`` and ``maxPeerCount`` |
| 119 | +properties will have ensured the private data is available on other peers. |
| 120 | + |
| 121 | +.. note:: For collections to work, it is important to have cross-organizational |
| 122 | + gossip configured correctly. Refer to our documentation on :doc:`gossip`, |
| 123 | + paying particular attention to the section on "anchor peers". |
| 124 | + |
| 125 | +Endorsement |
| 126 | +~~~~~~~~~~~ |
| 127 | + |
| 128 | +The endorsing peer plays an important role in disseminating private data to |
| 129 | +other authorized peers, ensuring the availability of private data on the |
| 130 | +channel. The ``maxPeerCount`` and ``requiredPeerCount`` properties in the |
| 131 | +collection definition control how the endorsing peer carries out this |
| 132 | +dissemination. |
| 133 | + |
| 134 | +If the endorsing peer cannot successfully disseminate the private data to at least |
| 135 | +``requiredPeerCount`` peers, it will return an error back to the client. The endorsing |
| 136 | +peer will attempt to disseminate the private data to peers of different organizations, |
| 137 | +in an effort to ensure that each authorized organization has a copy of the private |
| 138 | +data. Since transactions are not committed at chaincode execution time, the endorsing |
| 139 | +peer and recipient peers store a copy of the private data in a local ``transient store`` |
| 140 | +alongside their blockchain until the transaction is committed. |
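The rule described above can be pictured with a deliberately simplified Go model. It is not how the peer selects or contacts other peers; ``disseminate`` and ``sendFunc`` are hypothetical names, and the only behavior taken from the text is that dissemination is attempted toward up to ``maxPeerCount`` peers and that endorsement fails when fewer than ``requiredPeerCount`` peers received the data.

```go
package main

import (
	"errors"
	"fmt"
)

// sendFunc is a hypothetical stand-in for pushing private data to one peer.
type sendFunc func(peer string) error

// disseminate walks the eligible peers in order, stops once maxPeerCount
// peers have accepted the data, and returns an error when fewer than
// requiredPeerCount peers accepted it.
func disseminate(peers []string, requiredPeerCount, maxPeerCount int, send sendFunc) (int, error) {
	acked := 0
	for _, p := range peers {
		if acked >= maxPeerCount {
			break
		}
		if err := send(p); err == nil {
			acked++
		}
	}
	if acked < requiredPeerCount {
		return acked, fmt.Errorf("disseminated to %d peers, need at least %d", acked, requiredPeerCount)
	}
	return acked, nil
}

func main() {
	peers := []string{"peer0.org1", "peer1.org1", "peer0.org2"}
	// Simulate one unreachable peer.
	send := func(peer string) error {
		if peer == "peer1.org1" {
			return errors.New("unreachable")
		}
		return nil
	}
	n, err := disseminate(peers, 1, 3, send)
	fmt.Println("peers holding the data:", n, "error:", err)
}
```

In this model, a ``requiredPeerCount`` of ``0`` never causes an endorsement failure, which matches the earlier warning that such a setting provides no redundancy guarantee.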
| 141 | + |
| 142 | +Referencing collections from chaincode |
| 143 | +-------------------------------------- |
| 144 | + |
| 145 | +A set of `shim APIs <https://godoc.org/github.com/hyperledger/fabric/core/chaincode/shim>`_ |
| 146 | +is available for setting and retrieving private data. |
| 147 | + |
| 148 | +The same chaincode data operations can be applied to channel state data and |
| 149 | +private data, but in the case of private data, a collection name is specified |
| 150 | +along with the data in the chaincode APIs, for example |
| 151 | +``PutPrivateData(collection,key,value)`` and ``GetPrivateData(collection,key)``. |
| 152 | + |
| 153 | +A single chaincode can reference multiple collections. |
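The following Go sketch shows one chaincode function writing to two collections. To keep it runnable without the Fabric packages, it declares a small ``privateDataStub`` interface containing only the two shim methods named above, plus an in-memory ``fakeStub``. Both of these, and the ``storeMarble`` helper, are hypothetical names for illustration; real chaincode would receive the stub from the shim.

```go
package main

import (
	"errors"
	"fmt"
)

// privateDataStub lists only the two shim methods discussed above.
type privateDataStub interface {
	GetPrivateData(collection, key string) ([]byte, error)
	PutPrivateData(collection, key string, value []byte) error
}

// fakeStub is a hypothetical in-memory stand-in for the chaincode stub.
type fakeStub struct {
	data map[string]map[string][]byte // collection -> key -> value
}

func newFakeStub() *fakeStub {
	return &fakeStub{data: map[string]map[string][]byte{}}
}

func (s *fakeStub) PutPrivateData(collection, key string, value []byte) error {
	if key == "" {
		return errors.New("key must not be empty")
	}
	if s.data[collection] == nil {
		s.data[collection] = map[string][]byte{}
	}
	s.data[collection][key] = value
	return nil
}

func (s *fakeStub) GetPrivateData(collection, key string) ([]byte, error) {
	return s.data[collection][key], nil
}

// storeMarble writes the shared details and the restricted price of one
// marble under the same key in two different collections.
func storeMarble(stub privateDataStub, id string, details, price []byte) error {
	if err := stub.PutPrivateData("collectionMarbles", id, details); err != nil {
		return fmt.Errorf("storing marble details: %w", err)
	}
	if err := stub.PutPrivateData("collectionMarblePrivateDetails", id, price); err != nil {
		return fmt.Errorf("storing marble price: %w", err)
	}
	return nil
}

func main() {
	stub := newFakeStub()
	if err := storeMarble(stub, "marble1", []byte(`{"color":"blue"}`), []byte(`{"price":99}`)); err != nil {
		fmt.Println("error:", err)
		return
	}
	price, _ := stub.GetPrivateData("collectionMarblePrivateDetails", "marble1")
	fmt.Println(string(price))
}
```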
| 154 | + |
| 155 | +How to pass private data in a chaincode proposal |
| 156 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 157 | + |
| 158 | +Since the chaincode proposal gets stored on the blockchain, it is important |
| 159 | +not to include private data in the main part of the chaincode proposal. A special |
| 160 | +field in the chaincode proposal called the ``transient`` field can be used to pass |
| 161 | +private data from the client (or data that chaincode will use to generate private |
| 162 | +data), to chaincode invocation on the peer. The chaincode can retrieve the |
| 163 | +``transient`` field by calling the `GetTransient() API <https://github.com/hyperledger/fabric/blob/13447bf5ead693f07285ce63a1903c5d0d25f096/core/chaincode/shim/interfaces_stable.go>`_. |
| 164 | +This ``transient`` field gets excluded from the channel transaction. |
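A minimal Go sketch of this flow follows. It assumes the client placed the private value under a transient key named ``price``; that key name, the ``transientStub`` interface, the ``fakeStub`` type, and the ``savePrice`` helper are all hypothetical. The interface declares only the shim methods it needs so the example runs without the Fabric packages.

```go
package main

import (
	"errors"
	"fmt"
)

// transientStub lists only the shim methods this sketch relies on.
type transientStub interface {
	GetTransient() (map[string][]byte, error)
	PutPrivateData(collection, key string, value []byte) error
}

// fakeStub is a hypothetical in-memory stand-in for the chaincode stub.
type fakeStub struct {
	transient map[string][]byte
	private   map[string][]byte // keyed by collection + "/" + key
}

func (s *fakeStub) GetTransient() (map[string][]byte, error) {
	return s.transient, nil
}

func (s *fakeStub) PutPrivateData(collection, key string, value []byte) error {
	s.private[collection+"/"+key] = value
	return nil
}

// savePrice reads the private value from the transient field instead of the
// regular chaincode arguments, then stores it in a private data collection.
func savePrice(stub transientStub, marbleID string) error {
	transient, err := stub.GetTransient()
	if err != nil {
		return fmt.Errorf("reading transient field: %w", err)
	}
	price, ok := transient["price"] // "price" is an assumed key name
	if !ok || len(price) == 0 {
		return errors.New(`transient field has no "price" entry`)
	}
	return stub.PutPrivateData("collectionMarblePrivateDetails", marbleID, price)
}

func main() {
	stub := &fakeStub{
		transient: map[string][]byte{"price": []byte("99")},
		private:   map[string][]byte{},
	}
	if err := savePrice(stub, "marble1"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("stored entries:", len(stub.private))
}
```

Only the marble ID travels in the regular arguments here; the price never appears in the part of the proposal that is recorded on the channel.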
| 165 | + |
| 166 | +Considerations when using private data |
| 167 | +-------------------------------------- |
| 168 | + |
| 169 | +Querying Private Data |
| 170 | +~~~~~~~~~~~~~~~~~~~~~ |
| 171 | + |
| 172 | +Private collection data can be queried just like normal channel data, using |
| 173 | +shim APIs: |
| 174 | + |
| 175 | +* ``GetPrivateDataByRange(collection, startKey, endKey string)`` |
| 176 | +* ``GetPrivateDataByPartialCompositeKey(collection, objectType string, keys []string)`` |
| 177 | + |
| 178 | +And for the CouchDB state database, JSON content queries can be passed using the |
| 179 | +shim API: |
| 180 | + |
| 181 | +* ``GetPrivateDataQueryResult(collection, query string)`` |
| 182 | + |
| 183 | +Limitations: |
| 184 | + |
| 185 | +* Clients that call chaincode that executes queries should be aware that they |
| 186 | + may receive a subset of the result set, if the peer they query has missing |
| 187 | + private data, as explained in the section on how private data is committed |
| 188 | + above. Clients can query multiple peers and compare the results to |
| 189 | + determine if a peer may be missing some of the result set. |
| 190 | +* Chaincode that executes queries and updates data in a single transaction |
| 191 | + is not supported, as the query results cannot be validated on the peers |
| 192 | + that don’t have access to the private data, or on peers that are missing the |
| 193 | + private data that they have access to. If a chaincode invocation both queries |
| 194 | + and updates private data, the proposal request will return an error. |
| 195 | +* Note that private data collections only define which organizations' peers |
| 196 | + are authorized to receive and store private data, and consequently imply |
| 197 | + which peers can be used to query private data. Private data collections do not |
| 198 | + by themselves limit access control within chaincode. For example, if |
| 199 | + non-authorized clients are able to invoke chaincode on peers that have access |
| 200 | + to the private data, the chaincode logic still needs a means to enforce access |
| 201 | + control as usual, for example by calling the ``GetCreator()`` chaincode API or |
| 202 | + using the client identity `chaincode library <https://github.com/hyperledger/fabric/tree/master/core/chaincode/lib/cid>`__. |
| 203 | + |
| 204 | +Using Indexes with collections |
| 205 | +------------------------------ |
| 206 | + |
| 207 | +The topic :doc:`couchdb_as_state_database` describes indexes that can be |
| 208 | +applied to the channel’s state database to enable JSON content queries, by |
| 209 | +packaging indexes in a ``META-INF/statedb/couchdb/indexes`` directory at chaincode |
| 210 | +installation time. Similarly, indexes can also be applied to private data |
| 211 | +collections, by packaging indexes in a ``META-INF/statedb/couchdb/collections/<collection_name>/indexes`` |
| 212 | +directory. An example index is available `here <https://github.com/hyperledger/fabric-samples/blob/master/chaincode/marbles02_private/go/META-INF/statedb/couchdb/collections/collectionMarbles/indexes/indexOwner.json>`_. |
| 213 | + |
| 214 | +Private Data Purging |
| 215 | +~~~~~~~~~~~~~~~~~~~~ |
| 216 | + |
| 217 | +To keep private data indefinitely, that is, to never purge private data, |
| 218 | +set the ``blockToLive`` property to ``0``. |
| 219 | + |
| 220 | +Recall that prior to commit, peers store private data in a local |
| 221 | +transient data store. This data automatically gets purged when the transaction |
| 222 | +commits. But if a transaction was never submitted to the channel and |
| 223 | +therefore never committed, the private data would remain in each peer’s |
| 224 | +transient store. This data is purged from the transient store after a |
| 225 | +configurable number of blocks, set by the peer’s |
| 226 | +``peer.gossip.pvtData.transientstoreMaxBlockRetention`` property in the peer |
| 227 | +``core.yaml`` file. |
| 228 | + |
| 229 | +Upgrading a collection definition |
| 230 | +--------------------------------- |
| 231 | + |
| 232 | +If a collection is referenced by a chaincode, the chaincode will use the prior |
| 233 | +collection definition unless a new collection definition is specified at upgrade |
| 234 | +time. If a collection configuration is specified during the upgrade, a definition |
| 235 | +for each of the existing collections must be included, and you can add new |
| 236 | +collection definitions. |
| 237 | + |
| 238 | +Collection updates become effective when a peer commits the block that |
| 239 | +contains the chaincode upgrade transaction. Note that collections cannot be |
| 240 | +deleted, as there may be prior private data hashes on the channel’s blockchain |
| 241 | +that cannot be removed. |
| 242 | + |
| 243 | +.. Licensed under Creative Commons Attribution 4.0 International License |
| 244 | + https://creativecommons.org/licenses/by/4.0/ |