@@ -24,8 +24,8 @@ These changes have two subcomponents:

  * Changes to the currently unstandardized PyPI upload API, allowing clients
    to upload digital attestations as :ref:`attestation objects <attestation-object>`;
- * Changes to the :pep:`503` and :pep:`691` "simple" APIs, allowing clients
-   to retrieve both digital attestations and
+ * Changes to the :ref:`HTML and JSON "simple" APIs <packaging:simple-repository-api>`,
+   allowing clients to retrieve both digital attestations and
    `Trusted Publishing <https://docs.pypi.org/trusted-publishers/>`_ metadata
    for individual release files as :ref:`provenance objects <provenance-object>`.

@@ -75,7 +75,7 @@ Additionally, this proposal identifies the following motivations:
    of the metadata needed by the index to verify an attestation's validity.

  This PEP proposes a generic attestation format, containing an
- :ref:`attestation payload for signature generation <payload-and-signature-generation>`,
+ :ref:`attestation statement for signature generation <payload-and-signature-generation>`,
  with the expectation that index providers adopt the
  format with a suitable source of identity for signature verification, such as
  Trusted Publishing.
@@ -116,8 +116,9 @@ areas of Python packaging:
    metadata within the cryptographic envelope.

    For example, to prevent domain separation between a distribution's name and
-   its contents, this PEP proposes that digital attestations be performed over
-   ``HASH(name || HASH(contents))`` rather than just ``HASH(contents)``.
+   its contents, this PEP uses '`Statements <https://github.com/in-toto/attestation/blob/v1.0/spec/v1.0/statement.md>`__'
+   from the `in-toto project <https://in-toto.io/>`__ to bind the distribution's
+   contents (via SHA-256 digest) to its filename.


  Previous Work
@@ -196,6 +197,9 @@ Index changes
  Simple Index
  ^^^^^^^^^^^^

+ The following changes are made to the
+ :ref:`simple repository API <packaging:simple-repository-api-base>`:
+
  * When an uploaded file has one or more attestations, the index **MAY**
    provide a ``.provenance`` file adjacent to the hosted distribution.
    The format of the ``.provenance`` file **SHALL** be a JSON-encoded
@@ -208,32 +212,34 @@ Simple Index

  * When a ``.provenance`` file is present, the index **MAY** include a
    ``data-provenance`` attribute on its file link. The value of the
-   ``data-provenance`` attribute **SHALL** be the SHA256 digest of the
+   ``data-provenance`` attribute **SHALL** be the SHA-256 digest of the
    associated ``.provenance`` file.

  * The index **MAY** choose to modify the ``.provenance`` file. For example,
    the index **MAY** permit adding additional attestations and verification
    materials, such as attestations from third-party auditors or other services.
    When the index modifies the ``.provenance`` file, it **MUST** also update the
-   ``data-provenance`` attribute's value to the new SHA256 digest.
+   ``data-provenance`` attribute's value to the new SHA-256 digest.

    See :ref:`changes-to-provenance-objects` for an additional discussion of
    reasons why a file's provenance may change.

  JSON-based Simple API
  ^^^^^^^^^^^^^^^^^^^^^

+ The following changes are made to the
+ :ref:`JSON simple API <packaging:simple-repository-api-json>`:
+
  * When an uploaded file has one or more attestations, the index **MAY**
-   include a ``provenance`` object in the ``file`` dictionary for that file.
-   The format of the ``provenance`` object **SHALL** be a JSON-encoded
-   :ref:`provenance object <provenance-object>`, which **SHALL** contain
-   the file's attestations.
+   include a ``provenance`` key in the ``file`` dictionary for that file.

- * The index **MAY** choose to modify the ``provenance`` object, under the same
-   conditions as the ``.provenance`` file specified above.
+   The value of the ``provenance`` key **SHALL** be a JSON string, which
+   **SHALL** be the SHA-256 digest of the associated ``.provenance`` file,
+   as in the Simple Index.

-   See :ref:`changes-to-provenance-objects` for an additional discussion of
-   reasons why a file's provenance may change.
+   See :ref:`appendix-3` for an explanation of the technical decision to
+   embed the SHA-256 digest in the JSON API, rather than the full
+   :ref:`provenance object <provenance-object>`.
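
As a rough illustration of the digest described in both API changes above, the sketch below shows how a client might compare a downloaded ``.provenance`` file against the SHA-256 digest advertised by the index. The function name and the placeholder file contents are hypothetical and do not come from the PEP; the sketch also assumes the digest is advertised as a hexadecimal string, which this section does not spell out.

```python
import hashlib
import hmac


def provenance_digest_matches(provenance_bytes: bytes, advertised_digest: str) -> bool:
    """Return True if the SHA-256 of the provenance file equals the advertised digest."""
    actual = hashlib.sha256(provenance_bytes).hexdigest()
    # Assumption: the index advertises the digest as a hex string.
    return hmac.compare_digest(actual, advertised_digest.strip().lower())


# Placeholder bytes standing in for a real .provenance file.
sample_provenance = b'{"placeholder": "provenance object"}'
advertised = hashlib.sha256(sample_provenance).hexdigest()

print(provenance_digest_matches(sample_provenance, advertised))
```

If the index later modifies the ``.provenance`` file, the same check against the old digest would fail, which is why the text requires the advertised value to be updated.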

  These changes require a version change to the JSON API:

@@ -260,13 +266,28 @@ object is provided as pseudocode below.

          verification_material: VerificationMaterial
          """
-         Cryptographic materials used to verify `message_signature`.
+         Cryptographic materials used to verify `envelope`.
+         """
+
+         envelope: Envelope
+         """
+         The enveloped attestation statement and signature.
+         """
+
+
+     @dataclass
+     class Envelope:
+         statement: bytes
+         """
+         The attestation statement.
+
+         This is represented as opaque bytes on the wire (encoded as base64),
+         but it MUST be an JSON in-toto v1 Statement.
          """

-         message_signature: str
+         signature: bytes
          """
-         The attestation's signature, as `base64(raw-sig)`, where `raw-sig`
-         is the raw bytes of the signing operation over the attestation payload.
+         A signature for the above statement, encoded as base64.
          """

      @dataclass
@@ -302,63 +323,36 @@ object) by selecting a new version number.

  .. _payload-and-signature-generation:

- Attestation payload and signature generation
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
- The *attestation payload* is the actual claim that is cryptographically signed
- over within the attestation object (as the ``message_signature``).
-
- The attestation payload is encoded as an :rfc:`8785` canonicalized JSON object,
- with the following pseudocode layout:
-
- .. code-block:: python
-
-     @dataclass
-     class AttestationPayload:
-         distribution: str
-         """
-         The file name of the Python package distribution.
-         """
-
-         digest: str
-         """
-         The SHA-256 digest of the distribution's contents, as a hexadecimal string.
-         """
-
- The value of ``distribution`` is the same distribution filename that appears
- in the :pep:`503` and :pep:`691` APIs. For example, ``distribution`` would be
- ``sampleproject-1.2.0-py2.py3-none-any.whl`` for the following simple index
- entry:
-
- .. code-block:: html
-
-     <a href="https://example.com/...">sampleproject-1.2.0-py2.py3-none-any.whl</a><br />
-
- In practice, this means that ``distribution`` is defined by the PyPA's
- living specifications for
- :ref:`binary distributions <packaging:binary-distribution-format>` and
- :ref:`source distributions <packaging:source-distribution-format>`, although
- non-conforming distributions may be hosted by the index.
+ Attestation statement and signature generation
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- The following pseudocode demonstrates the construction of an attestation
- payload and its signature:
+ The *attestation statement* is the actual claim that is cryptographically signed
+ over within the attestation object (i.e., the ``envelope.statement``).

- .. code-block:: python
+ The attestation statement is encoded as a
+ `v1 in-toto Statement object <https://github.com/in-toto/attestation/blob/v1.0/spec/v1.0/statement.md>`__,
+ in JSON form. When serialized the statement is treated as an opaque binary blob,
+ avoiding the need for canonicalization. An example JSON-encoded statement is
+ provided in :ref:`appendix-4`.

-     def build_payload(dist: Path) -> AttestationPayload:
-         return AttestationPayload(
-             distribution=dist.name,
-             digest=sha256(dist.read_bytes()).hexdigest,
-         )
+ In addition to being a v1 in-toto Statement, the attestation statement is constrained
+ in the following ways:

-     attestation_payload = build_payload("sampleproject-1.2.0-py2.py3-none-any.whl")
+ * The in-toto ``subject`` **MUST** contain only a single subject.
+ * ``subject[0].name`` is the distribution's filename, which **MUST** be
+   a valid :ref:`source distribution <packaging:source-distribution-format>` or
+   :ref:`wheel distribution <packaging:binary-distribution-format>` filename.
+ * ``subject[0].digest`` **MUST** contain a SHA-256 digest. Other digests
+   **MAY** be present. The digests **MUST** be represented as hexadecimal strings.
+ * The following ``predicateType`` values are supported:

-     # canonical_json is a fictitious module that performs RFC 8785 canonical
-     # JSON serialization.
-     encoded_payload = canonical_json.dumps(asdict(attestation_payload))
+   * `SLSA Provenance <https://slsa.dev/provenance/v1>`__: ``https://slsa.dev/provenance/v1``
+   * `PyPI Publish Attestation <https://docs.pypi.org/attestations/publish/v1>`__: ``https://docs.pypi.org/attestations/publish/v1``

-     raw_signature = signing_key.sign(encoded_payload, ECDSA(SHA2_256()))
-     message_signature = b64encode(raw_signature)
+ The signature over this statement is constructed using the
+ `v1 DSSE signature protocol <https://github.com/secure-systems-lab/dsse/blob/v1.0.0/protocol.md>`__,
+ with a ``PAYLOAD_TYPE`` of ``application/vnd.in-toto+json`` and a ``PAYLOAD_BODY`` of the JSON-encoded
+ statement above. No other ``PAYLOAD_TYPE`` is permitted.
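
To make the new flow concrete, here is a minimal sketch of building a statement and the byte string that DSSE v1 signs over (its "pre-authentication encoding"). The helper names ``build_statement`` and ``dsse_pae`` are hypothetical, the predicate values simply mirror the example in Appendix 4, and the actual signing step is left out because key and certificate handling are not covered here.

```python
import base64
import hashlib
import json

PAYLOAD_TYPE = "application/vnd.in-toto+json"


def build_statement(filename: str, contents: bytes, predicate_type: str, predicate: dict) -> bytes:
    """Serialize an in-toto v1 Statement with a single subject (hypothetical helper)."""
    statement = {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [
            {"name": filename, "digest": {"sha256": hashlib.sha256(contents).hexdigest()}}
        ],
        "predicateType": predicate_type,
        "predicate": predicate,
    }
    # No canonicalization: these exact bytes are what gets signed and transmitted.
    return json.dumps(statement).encode("utf-8")


def dsse_pae(payload_type: str, payload_body: bytes) -> bytes:
    """DSSE v1 pre-authentication encoding of (PAYLOAD_TYPE, PAYLOAD_BODY)."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload_body), payload_body)


statement_bytes = build_statement(
    "sampleproject-1.2.3.tar.gz",
    b"",  # placeholder contents; a real caller would read the distribution file
    "https://some-arbitrary-predicate.example.com/v1",
    {"something-else": "foo"},
)
to_be_signed = dsse_pae(PAYLOAD_TYPE, statement_bytes)

# On the wire, the statement travels as base64 inside the envelope.
wire_statement = base64.b64encode(statement_bytes).decode("ascii")
```

Because the signature covers the serialized bytes rather than a re-serialized object, a verifier can decode ``wire_statement`` and rebuild ``to_be_signed`` without any canonical JSON step.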

  .. _provenance-object:

@@ -368,9 +362,8 @@ Provenance objects
  The index will serve uploaded attestations along with metadata that can assist
  in verifying them in the form of JSON serialized objects.

- These *provenance objects* will be available via both the :pep:`503` Simple Index
- and :pep:`691` JSON-based Simple API as described above, and will have the
- following layout:
+ These *provenance objects* will be available via both the Simple Index
+ and JSON-based Simple API as described above, and will have the following layout:

  .. code-block:: json

@@ -488,7 +481,8 @@ for changes to the provenance object include but are not limited to:
  Attestation verification
  ------------------------

- Verifying an attestation object requires verification of each of the following:
+ Verifying an attestation object against a distribution file requires verification of each of the
+ following:

  * ``version`` is ``1``. The verifier **MUST** reject any other version.
  * ``verification_material.certificate`` is a valid signing certificate, as
@@ -497,9 +491,15 @@ Verifying an attestation object requires verification of each of the following:
  * ``verification_material.certificate`` identifies an appropriate signing
    subject, such as the machine identity of the Trusted Publisher that published
    the package.
- * ``message_signature`` can be verified by ``verification_material.certificate``,
-   using the reconstructed attestation payload as the cleartext input. The
-   verifier **MUST** reconstruct the attestation payload itself.
+ * ``envelope.statement`` is a valid in-toto v1 Statement, with a subject
+   and digest that **MUST** match the distribution's filename and contents.
+   For the distribution's filename, matching **MUST** be performed by parsing
+   using the appropriate source distribution or wheel filename format, as
+   the statement's subject may be equivalent but normalized.
+ * ``envelope.signature`` is a valid signature for ``envelope.statement``
+   corresponding to ``verification_material.certificate``,
+   as reconstituted via the
+   `v1 DSSE signature protocol <https://github.com/secure-systems-lab/dsse/blob/v1.0.0/protocol.md>`__.
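
The statement check in the list above could be sketched roughly as follows. This covers only the subject and digest comparison; certificate validation and signature verification are out of scope. The function ``check_statement_subject`` and its ``names_match`` parameter are hypothetical: the default exact string comparison is stricter than the text requires, and a real verifier would pass a comparison that parses both names with the source distribution or wheel filename format.

```python
import hashlib
import json
from typing import Callable


def check_statement_subject(
    statement_bytes: bytes,
    dist_filename: str,
    dist_contents: bytes,
    names_match: Callable[[str, str], bool] = lambda a, b: a == b,
) -> None:
    """Raise ValueError unless the statement's single subject matches the file."""
    statement = json.loads(statement_bytes)
    if statement.get("_type") != "https://in-toto.io/Statement/v1":
        raise ValueError("not an in-toto v1 Statement")

    subjects = statement.get("subject")
    if not isinstance(subjects, list) or len(subjects) != 1:
        raise ValueError("statement must contain exactly one subject")

    subject = subjects[0]
    if not names_match(subject.get("name", ""), dist_filename):
        raise ValueError("subject name does not match the distribution filename")

    # Compare against a digest computed locally from the file contents.
    expected = hashlib.sha256(dist_contents).hexdigest()
    if subject.get("digest", {}).get("sha256") != expected:
        raise ValueError("subject digest does not match the distribution contents")


example_statement = json.dumps(
    {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [
            {
                "name": "sampleproject-1.2.3.tar.gz",
                "digest": {"sha256": hashlib.sha256(b"").hexdigest()},
            }
        ],
        "predicateType": "https://some-arbitrary-predicate.example.com/v1",
        "predicate": {"something-else": "foo"},
    }
).encode("utf-8")

check_statement_subject(example_statement, "sampleproject-1.2.3.tar.gz", b"")
```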

  In addition to the above required steps, a verifier **MAY** additionally verify
  ``verification_material.transparency_entries`` on a policy basis, e.g. requiring
@@ -543,19 +543,6 @@ unstated presumption with earlier mechanisms, like PGP and wheel signatures.
  This PEP does not preclude or exclude future index trust mechanisms, such
  as :pep:`458` and/or :pep:`480`.

- Flexible attestations
- ---------------------
-
- This PEP specifies a fixed attestation payload (defined in
- :ref:`payload-and-signature-generation`), binding the contents of each uploaded
- file to its public name on the index. This payload format is fixed and
- inflexible to ease implementation, and to minimize additional mechanical
- changes to the index itself (e.g., needing to store and service detached
- attestation documents).
-
- This PEP does not preclude or exclude future more flexible attestation payload
- formats, such as ones built on `in-toto <https://in-toto.io/>`__.
-
  Recommendations
  ===============

@@ -628,7 +615,7 @@ of signed inclusion time, and can be verified either online or offline.

          inclusion_proof: InclusionProof
          """
-         The actual inclusion proof the the log entry.
+         The actual inclusion proof of the log entry.
          """


@@ -668,6 +655,58 @@ of signed inclusion time, and can be verified either online or offline.
          Cosigned checkpoints from zero or more log witnesses.
          """

+ .. _appendix-3:
+
+ Appendix 3: Simple JSON API size considerations
+ ===============================================
+
+ A previous draft of this PEP required embedding each
+ :ref:`provenance object <provenance-object>` directly into its appropriate part
+ of the JSON Simple API.
+
+ The current version of this PEP embeds the SHA-256 digest of the provenance
+ object instead. This is done for size and network bandwidth consideration
+ reasons:
+
+ 1. We estimate the typical size of an attestation object to be approximately
+    5.3 KB of JSON.
+ 2. We conservatively estimate that indices eventually host around 3 attestations
+    per release file, or approximately 15.9 KB of JSON per combined provenance
+    object.
+ 3. As of May 2024, the average project on PyPI has approximately 21 release
+    files. We conservatively expect this average to increase over time.
+ 4. Combined, these numbers imply that a typical project might expect to host
+    between 60 and 70 attestations, or approximately 339 KB of additional JSON
+    in its "project detail" endpoint.
+
+ These numbers are significantly worse in "pathological" cases, where projects
+ have hundreds or thousands of releases and/or dozens of files per release.
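
The arithmetic behind these estimates can be reproduced with the rounded inputs from the list above. With those inputs the sketch yields 63 attestations and roughly 334 KB per project, in line with the "between 60 and 70" range and close to the approximately 339 KB quoted; the small gap presumably comes from unrounded inputs, which is an assumption on our part. The constant names are illustrative only.

```python
# Rounded inputs taken from the estimates above.
ATTESTATION_KB = 5.3        # typical size of one attestation object
ATTESTATIONS_PER_FILE = 3   # conservative long-term estimate
FILES_PER_PROJECT = 21      # average release files per project, May 2024

per_file_kb = ATTESTATION_KB * ATTESTATIONS_PER_FILE
attestations_per_project = ATTESTATIONS_PER_FILE * FILES_PER_PROJECT
per_project_kb = per_file_kb * FILES_PER_PROJECT

print(f"{per_file_kb:.1f} KB per provenance object")
print(f"{attestations_per_project} attestations per project")
print(f"{per_project_kb:.1f} KB of additional JSON per project")
```

By comparison, a hex-encoded SHA-256 digest is 64 characters per file, which is the trade-off the current draft makes.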
+
+ .. _appendix-4:
+
+ Appendix 4: Example attestation statement
+ =========================================
+
+ Given a source distribution ``sampleproject-1.2.3.tar.gz`` with a SHA-256
+ digest of ``e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855``,
+ the following is an appropriate in-toto Statement, as a JSON object:
+
+ .. code-block:: json
+
+     {
+       "_type": "https://in-toto.io/Statement/v1",
+       "subject": [
+         {
+           "name": "sampleproject-1.2.3.tar.gz",
+           "digest": {"sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"}
+         }
+       ],
+       "predicateType": "https://some-arbitrary-predicate.example.com/v1",
+       "predicate": {
+         "something-else": "foo"
+       }
+     }
+

  Copyright
  =========
