Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EIP7594: Universal Verification in verify_cell_kzg_proof_batch #3812

Merged
merged 11 commits into from
Jun 28, 2024
211 changes: 178 additions & 33 deletions specs/_features/eip7594/polynomial-commitments-sampling.md
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
- [`_fft_field`](#_fft_field)
- [`fft_field`](#fft_field)
- [`coset_fft_field`](#coset_fft_field)
- [`compute_verify_cell_kzg_proof_batch_challenge`](#compute_verify_cell_kzg_proof_batch_challenge)
- [Polynomials in coefficient form](#polynomials-in-coefficient-form)
- [`polynomial_eval_to_coeff`](#polynomial_eval_to_coeff)
- [`add_polynomialcoeff`](#add_polynomialcoeff)
Expand All @@ -34,7 +35,9 @@
- [KZG multiproofs](#kzg-multiproofs)
- [`compute_kzg_proof_multi_impl`](#compute_kzg_proof_multi_impl)
- [`verify_kzg_proof_multi_impl`](#verify_kzg_proof_multi_impl)
- [`verify_cell_kzg_proof_batch_impl`](#verify_cell_kzg_proof_batch_impl)
- [Cell cosets](#cell-cosets)
- [`coset_shift_for_cell`](#coset_shift_for_cell)
- [`coset_for_cell`](#coset_for_cell)
- [Cells](#cells-1)
- [Cell computation](#cell-computation)
Expand Down Expand Up @@ -193,17 +196,17 @@ def coset_fft_field(vals: Sequence[BLSFieldElement],
roots_of_unity: Sequence[BLSFieldElement],
inv: bool=False) -> Sequence[BLSFieldElement]:
"""
Computes an FFT/IFFT over a coset of the roots of unity.
This is useful for when one wants to divide by a polynomial which
Computes an FFT/IFFT over a coset of the roots of unity.
This is useful for when one wants to divide by a polynomial which
vanishes on one or more elements in the domain.
"""
vals = vals.copy()

def shift_vals(vals: Sequence[BLSFieldElement], factor: BLSFieldElement) -> Sequence[BLSFieldElement]:
"""
Multiply each entry in `vals` by succeeding powers of `factor`
i.e., [vals[0] * factor^0, vals[1] * factor^1, ..., vals[n] * factor^n]
"""
"""
Multiply each entry in `vals` by succeeding powers of `factor`
i.e., [vals[0] * factor^0, vals[1] * factor^1, ..., vals[n] * factor^n]
"""
shift = 1
for i in range(len(vals)):
vals[i] = BLSFieldElement((int(vals[i]) * shift) % BLS_MODULUS)
Expand All @@ -222,6 +225,49 @@ def coset_fft_field(vals: Sequence[BLSFieldElement],
return fft_field(vals, roots_of_unity, inv)
```

#### `compute_verify_cell_kzg_proof_batch_challenge`

```python
def compute_verify_cell_kzg_proof_batch_challenge(row_commitments: Sequence[KZGCommitment],
row_indices: Sequence[RowIndex],
column_indices: Sequence[ColumnIndex],
cosets_evals: Sequence[CosetEvals],
proofs: Sequence[KZGProof]) -> BLSFieldElement:
"""
Compute a random challenge r used in the universal verification equation, see
https://ethresear.ch/t/a-universal-verification-equation-for-data-availability-sampling/13240

To compute the challenge, `RANDOM_CHALLENGE_KZG_CELL_BATCH_DOMAIN` should be used as a hash prefix
"""
# input the domain separator
hashinput = RANDOM_CHALLENGE_KZG_CELL_BATCH_DOMAIN

# input the field elements per cell
hashinput += int.to_bytes(FIELD_ELEMENTS_PER_CELL, 8, KZG_ENDIANNESS)
asn-d6 marked this conversation as resolved.
Show resolved Hide resolved

# input the number of commitments
num_commitments = len(row_commitments)
hashinput += int.to_bytes(num_commitments, 8, KZG_ENDIANNESS)

# input the number of cells
num_cells = len(row_indices)
hashinput += int.to_bytes(num_cells, 8, KZG_ENDIANNESS)

# input all commitments
for commitment in row_commitments:
hashinput += commitment

# input each cell with its indices and proof
for k in range(num_cells):
hashinput += int.to_bytes(row_indices[k], 8, KZG_ENDIANNESS)
hashinput += int.to_bytes(column_indices[k], 8, KZG_ENDIANNESS)
for eval in cosets_evals[k]:
hashinput += bls_field_to_bytes(eval)
hashinput += proofs[k]

return hash_to_bls_field(hashinput)
```

### Polynomials in coefficient form

#### `polynomial_eval_to_coeff`
Expand Down Expand Up @@ -362,12 +408,12 @@ def compute_kzg_proof_multi_impl(
"""
Compute a KZG multi-evaluation proof for a set of `k` points.

This is done by committing to the following quotient polynomial:
This is done by committing to the following quotient polynomial:
Q(X) = f(X) - I(X) / Z(X)
Where:
- I(X) is the degree `k-1` polynomial that agrees with f(x) at all `k` points
- Z(X) is the degree `k` polynomial that evaluates to zero on all `k` points

We further note that since the degree of I(X) is less than the degree of Z(X),
the computation can be simplified in monomial form to Q(X) = f(X) / Z(X)
"""
Expand Down Expand Up @@ -401,7 +447,7 @@ def verify_kzg_proof_multi_impl(commitment: KZGCommitment,
Q(X) is the quotient polynomial computed by the prover
I(X) is the degree k-1 polynomial that evaluates to `ys` at all `zs`` points
Z(X) is the polynomial that evaluates to zero on all `k` points

The verifier receives the commitments to Q(X) and f(X), so they check the equation
holds by using the following pairing equation:
e([Q(X)]_1, [Z(X)]_2) == e([f(X)]_1 - [I(X)]_1, [1]_2)
Expand All @@ -423,8 +469,118 @@ def verify_kzg_proof_multi_impl(commitment: KZGCommitment,
]))
```

#### `verify_cell_kzg_proof_batch_impl`

```python
def verify_cell_kzg_proof_batch_impl(row_commitments: Sequence[KZGCommitment],
row_indices: Sequence[RowIndex],
column_indices: Sequence[ColumnIndex],
cosets_evals: Sequence[CosetEvals],
proofs: Sequence[KZGProof]) -> bool:
"""
Verify a set of cells, given their corresponding proofs and their coordinates (row_index, column_index) in the blob
matrix. The i-th cell is in row row_indices[i] and in column column_indices[i].
The list of all commitments is provided in row_commitments_bytes.

This function implements the universal verification equation introduced here:
https://ethresear.ch/t/a-universal-verification-equation-for-data-availability-sampling/13240
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps let's link to the relevant func instead of the blog post here. The main func should link to the blog post.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by relevant func in this context?

Sure, I will add the link to the blog post to verify_cell_kzg_proof_batch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant to say that we are currently linking to the ethresearch post three times throughout the file. Perhaps we should only link once. Maybe this is the right function doc to do so, and we should remove it from the other function docs. (Sorry for the confusion, I meant to add a comment in a different function doc.)

"""

# The verification equation that we will check is pairing ( LL, LR ) = pairing (RL, [1]), where
jtraglia marked this conversation as resolved.
Show resolved Hide resolved
# LL = sum_k r^k proofs[k],
# LR = [s^n]
asn-d6 marked this conversation as resolved.
Show resolved Hide resolved
# RL = RLC - RLI + RLP, with
jtraglia marked this conversation as resolved.
Show resolved Hide resolved
# RLC = sum_i (sum_{k_i} r^{k_i}) commitments[i]
# RLI = [sum_k r^k * interpolation_poly_k(s)]
# RLP = sum_k r^k * h_k^n * proofs[k]
#
# Here, the variables have the following meaning:
# - k < len(row_indices) is an index iterating over all cells in the input
# - r is a random coefficient, derived from hashing all data provided by the prover
# - s is the secret embedded in the KZG setup
# - n = FIELD_ELEMENTS_PER_CELL is the size of the evaluation domain
# - i ranges over all rows that are touched
# - k_i < len(row_indices) ranges over all cells that are in row i
# - interpolation_poly_k is the interpolation polynomial for the kth cell
# - h_k is the coset shift specifying the evaluation domain of the kth cell

# Preparation
num_cells = len(row_indices)
n = FIELD_ELEMENTS_PER_CELL
num_rows = len(row_commitments)

# Step 1: Derive powers of r, i.e., r^0, ..., r^{num_cells-1}
r = int(
compute_verify_cell_kzg_proof_batch_challenge(
row_commitments,
row_indices,
column_indices,
cosets_evals,
proofs
)
)
current_power = 1
r_powers = []
for _ in range(num_cells):
r_powers.append(current_power)
current_power = (current_power * r) % BLS_MODULUS
jtraglia marked this conversation as resolved.
Show resolved Hide resolved

# Step 2: Compute LL = sum_k r^k proofs[k]
ll = bls.bytes48_to_G1(g1_lincomb(proofs, r_powers))

# Step 3: Compute LR = [s^n]
lr = bls.bytes96_to_G2(KZG_SETUP_G2_MONOMIAL[n])

# Step 4: Compute RL = RLC - RLI + RLP
# Step 4.1: Compute RLC = sum_i (sum_{k_i} r^{k_i}) commitments[i]
jtraglia marked this conversation as resolved.
Show resolved Hide resolved
commitment_weights = [0] * num_rows
for k in range(num_cells):
commitment_weights[row_indices[k]] = (commitment_weights[row_indices[k]] + r_powers[k]) % BLS_MODULUS
rlc = bls.bytes48_to_G1(g1_lincomb(row_commitments, commitment_weights))

# Step 4.2: Compute RLI = [sum_k r^k * interpolation_poly_k(s)]
jtraglia marked this conversation as resolved.
Show resolved Hide resolved
# Note: an efficient implementation would use the IDFT based method explained in the blog post
sum_interp_polys_coeff = [0]
for k in range(num_cells):
interp_poly_coeff = interpolate_polynomialcoeff(coset_for_cell(column_indices[k]), cosets_evals[k])
interp_poly_scaled_coeff = multiply_polynomialcoeff([r_powers[k]], interp_poly_coeff)
sum_interp_polys_coeff = add_polynomialcoeff(sum_interp_polys_coeff, interp_poly_scaled_coeff)
rli = bls.bytes48_to_G1(g1_lincomb(KZG_SETUP_G1_MONOMIAL[:n], sum_interp_polys_coeff))

# Step 4.3: Compute RLP = sum_k r^k * h_k^n * proofs[k]
jtraglia marked this conversation as resolved.
Show resolved Hide resolved
weighted_r_powers = []
for k in range(num_cells):
cosetshift = int(coset_shift_for_cell(column_indices[k]))
cosetshift_pow = pow(cosetshift, n, BLS_MODULUS)
wrp = (r_powers[k] * cosetshift_pow) % BLS_MODULUS
weighted_r_powers.append(wrp)
rlp = bls.bytes48_to_G1(g1_lincomb(proofs, weighted_r_powers))

# Step 4.4: Compute RL = RLC - RLI + RLP
rl = bls.add(rlc, bls.neg(rli))
rl = bls.add(rl, rlp)

# Step 5: Check pairing (LL, LR) = pairing (RL, [1])
return (bls.pairing_check([[ll, lr], [rl, bls.neg(bls.bytes96_to_G2(KZG_SETUP_G2_MONOMIAL[0])), ], ]))
asn-d6 marked this conversation as resolved.
Show resolved Hide resolved
```


### Cell cosets

#### `coset_shift_for_cell`

```python
def coset_shift_for_cell(cell_index: CellIndex) -> BLSFieldElement:
"""
Get the shift that determines the coset for a given ``cell_index``.
asn-d6 marked this conversation as resolved.
Show resolved Hide resolved
"""
assert cell_index < CELLS_PER_EXT_BLOB
roots_of_unity_brp = bit_reversal_permutation(
compute_roots_of_unity(FIELD_ELEMENTS_PER_EXT_BLOB)
)
return roots_of_unity_brp[FIELD_ELEMENTS_PER_CELL * cell_index]
```

#### `coset_for_cell`

```python
Expand Down Expand Up @@ -457,7 +613,7 @@ def compute_cells_and_kzg_proofs(blob: Blob) -> Tuple[
Public method.
"""
assert len(blob) == BYTES_PER_BLOB

polynomial = blob_to_polynomial(blob)
polynomial_coeff = polynomial_eval_to_coeff(polynomial)

Expand Down Expand Up @@ -491,7 +647,7 @@ def verify_cell_kzg_proof(commitment_bytes: Bytes48,
assert cell_index < CELLS_PER_EXT_BLOB
assert len(cell) == BYTES_PER_CELL
assert len(proof_bytes) == BYTES_PER_PROOF

coset = coset_for_cell(cell_index)

return verify_kzg_proof_multi_impl(
Expand All @@ -511,18 +667,12 @@ def verify_cell_kzg_proof_batch(row_commitments_bytes: Sequence[Bytes48],
proofs_bytes: Sequence[Bytes48]) -> bool:
"""
Verify a set of cells, given their corresponding proofs and their coordinates (row_index, column_index) in the blob
matrix. The list of all commitments is also provided in row_commitments_bytes.

This function implements the naive algorithm of checking every cell
individually; an efficient algorithm can be found here:
https://ethresear.ch/t/a-universal-verification-equation-for-data-availability-sampling/13240

This implementation does not require randomness, but for the algorithm that
requires it, `RANDOM_CHALLENGE_KZG_CELL_BATCH_DOMAIN` should be used to compute
the challenge value.
matrix. The i-th cell is in row row_indices[i] and in column column_indices[i].
jtraglia marked this conversation as resolved.
Show resolved Hide resolved
The list of all commitments is provided in row_commitments_bytes.

Public method.
"""

assert len(cells) == len(proofs_bytes) == len(row_indices) == len(column_indices)
for commitment_bytes in row_commitments_bytes:
assert len(commitment_bytes) == BYTES_PER_COMMITMENT
Expand All @@ -535,18 +685,13 @@ def verify_cell_kzg_proof_batch(row_commitments_bytes: Sequence[Bytes48],
for proof_bytes in proofs_bytes:
assert len(proof_bytes) == BYTES_PER_PROOF

# Get commitments via row indices
commitments_bytes = [row_commitments_bytes[row_index] for row_index in row_indices]

# Get objects from bytes
commitments = [bytes_to_kzg_commitment(commitment_bytes) for commitment_bytes in commitments_bytes]
row_commitments = [bytes_to_kzg_commitment(commitment_bytes) for commitment_bytes in row_commitments_bytes]
cosets_evals = [cell_to_coset_evals(cell) for cell in cells]
proofs = [bytes_to_kzg_proof(proof_bytes) for proof_bytes in proofs_bytes]

return all(
verify_kzg_proof_multi_impl(commitment, coset_for_cell(column_index), coset_evals, proof)
for commitment, column_index, coset_evals, proof in zip(commitments, column_indices, cosets_evals, proofs)
)
# Do the actual verification
return verify_cell_kzg_proof_batch_impl(row_commitments, row_indices, column_indices, cosets_evals, proofs)
```

## Reconstruction
Expand All @@ -558,7 +703,7 @@ def construct_vanishing_polynomial(missing_cell_indices: Sequence[CellIndex]) ->
"""
Given the cells indices that are missing from the data, compute the polynomial that vanishes at every point that
corresponds to a missing field element.

This method assumes that all of the cells cannot be missing. In this case the vanishing polynomial
could be computed as Z(x) = x^n - 1, where `n` is FIELD_ELEMENTS_PER_EXT_BLOB.

Expand Down Expand Up @@ -617,7 +762,7 @@ def recover_data(cell_indices: Sequence[CellIndex],
extended_evaluation_times_zero = [BLSFieldElement(int(a) * int(b) % BLS_MODULUS)
for a, b in zip(zero_poly_eval, extended_evaluation)]

# Convert (E*Z)(x) to monomial form
# Convert (E*Z)(x) to monomial form
extended_evaluation_times_zero_coeffs = fft_field(extended_evaluation_times_zero, roots_of_unity_extended, inv=True)

# Convert (E*Z)(x) to evaluation form over a coset of the FFT domain
Expand Down Expand Up @@ -688,7 +833,7 @@ def recover_cells_and_kzg_proofs(cell_indices: Sequence[CellIndex],
recovered_cells = [
coset_evals_to_cell(reconstructed_data[i * FIELD_ELEMENTS_PER_CELL:(i + 1) * FIELD_ELEMENTS_PER_CELL])
for i in range(CELLS_PER_EXT_BLOB)]

polynomial_eval = reconstructed_data[:FIELD_ELEMENTS_PER_BLOB]
polynomial_coeff = polynomial_eval_to_coeff(polynomial_eval)
recovered_proofs = [None] * CELLS_PER_EXT_BLOB
Expand All @@ -700,6 +845,6 @@ def recover_cells_and_kzg_proofs(cell_indices: Sequence[CellIndex],
proof, ys = compute_kzg_proof_multi_impl(polynomial_coeff, coset)
assert coset_evals_to_cell(ys) == recovered_cells[i]
recovered_proofs[i] = proof

return recovered_cells, recovered_proofs
```
Loading