-
Notifications
You must be signed in to change notification settings - Fork 189
perf(stark): eliminate redundant bit-reversal permutations in prover … #1187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -161,6 +161,41 @@ impl<E: IsField> Polynomial<FieldElement<E>> { | |
| Polynomial::evaluate_fft::<F>(&scaled, blowup_factor, domain_size) | ||
| } | ||
|
|
||
| /// Like [`Self::evaluate_fft`], but returns evaluations in **bit-reversed order**. | ||
| /// The NR DIT FFT naturally produces bit-reversed output; this variant skips | ||
| /// the final permutation to avoid a redundant round-trip when the output is | ||
| /// destined for a Merkle tree commitment. | ||
| pub fn evaluate_fft_bitrev<F: IsFFTField + IsSubFieldOf<E>>( | ||
| poly: &Self, | ||
| blowup_factor: usize, | ||
| domain_size: Option<usize>, | ||
| ) -> Result<Vec<FieldElement<E>>, FFTError> { | ||
| let domain_size = domain_size.unwrap_or_default(); | ||
| let len = core::cmp::max(poly.coeff_len(), domain_size).next_power_of_two() * blowup_factor; | ||
| if len.trailing_zeros() as u64 > F::TWO_ADICITY { | ||
| return Err(FFTError::DomainSizeError(len.trailing_zeros() as usize)); | ||
| } | ||
| if poly.coefficients().is_empty() { | ||
| return Ok(vec![FieldElement::zero(); len]); | ||
| } | ||
|
|
||
| let mut coeffs = poly.coefficients().to_vec(); | ||
| coeffs.resize(len, FieldElement::zero()); | ||
|
|
||
| evaluate_fft_bitrev_cpu::<F, E>(&coeffs) | ||
| } | ||
|
|
||
| /// Like [`Self::evaluate_offset_fft`], but returns evaluations in **bit-reversed order**. | ||
| pub fn evaluate_offset_fft_bitrev<F: IsFFTField + IsSubFieldOf<E>>( | ||
| poly: &Self, | ||
| blowup_factor: usize, | ||
| domain_size: Option<usize>, | ||
| offset: &FieldElement<F>, | ||
| ) -> Result<Vec<FieldElement<E>>, FFTError> { | ||
| let scaled = poly.scale(offset); | ||
| Self::evaluate_fft_bitrev::<F>(&scaled, blowup_factor, domain_size) | ||
| } | ||
|
|
||
| /// Returns a new polynomial that interpolates `(w^i, fft_evals[i])`, with `w` being a | ||
| /// Nth primitive root of unity in a subfield F of E, and `i in 0..N`, with `N = fft_evals.len()`. | ||
| /// This is considered to be the inverse operation of [Self::evaluate_fft()]. | ||
|
|
@@ -313,6 +348,19 @@ where | |
| ops::fft(coeffs, &twiddles) | ||
| } | ||
|
|
||
| /// Like [`evaluate_fft_cpu`], but returns evaluations in bit-reversed order. | ||
| pub fn evaluate_fft_bitrev_cpu<F, E>( | ||
| coeffs: &[FieldElement<E>], | ||
| ) -> Result<Vec<FieldElement<E>>, FFTError> | ||
| where | ||
| F: IsFFTField + IsSubFieldOf<E>, | ||
| E: IsField, | ||
| { | ||
| let order = coeffs.len().trailing_zeros(); | ||
| let twiddles = roots_of_unity::get_twiddles::<F>(order.into(), RootsConfig::BitReverse)?; | ||
| ops::fft_bitrev(coeffs, &twiddles) | ||
| } | ||
|
|
||
| pub fn interpolate_fft_cpu<F, E>( | ||
| fft_evals: &[FieldElement<E>], | ||
| ) -> Result<Polynomial<FieldElement<E>>, FFTError> | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Correctness
Security
Performance
Bugs & Errors
Code Simplicity
In summary, correctness around domain size check and constant-time assurance for security are the primary concerns. Some refactoring could resolve performance and DRY issues. Address these concerns before considering merging. |
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -3,11 +3,9 @@ pub mod fri_decommit; | |
| mod fri_functions; | ||
|
|
||
| use lambdaworks_crypto::fiat_shamir::is_transcript::IsStarkTranscript; | ||
| use lambdaworks_math::field::traits::IsSubFieldOf; | ||
| use lambdaworks_math::field::traits::{IsFFTField, IsField}; | ||
| use lambdaworks_math::traits::AsBytes; | ||
| use lambdaworks_math::{ | ||
| fft::cpu::bit_reversing::in_place_bit_reverse_permute, field::traits::IsSubFieldOf, | ||
| }; | ||
| pub use lambdaworks_math::{ | ||
| field::{element::FieldElement, fields::u64_prime_field::U64PrimeField}, | ||
| polynomial::Polynomial, | ||
|
|
@@ -136,9 +134,12 @@ where | |
| FieldElement<F>: AsBytes + Sync + Send, | ||
| FieldElement<E>: AsBytes + Sync + Send, | ||
| { | ||
| let mut evaluation = Polynomial::evaluate_offset_fft(poly, 1, Some(domain_size), coset_offset)?; | ||
|
|
||
| in_place_bit_reverse_permute(&mut evaluation); | ||
| // The NR DIT FFT naturally produces bit-reversed output. Use the bitrev | ||
| // variant directly to avoid the natural→bitrev round-trip that previously | ||
| // happened: evaluate_offset_fft (FFT+bitrev→natural) then | ||
| // in_place_bit_reverse_permute (natural→bitrev). | ||
| let evaluation = | ||
| Polynomial::evaluate_offset_fft_bitrev(poly, 1, Some(domain_size), coset_offset)?; | ||
|
|
||
| let mut to_commit = Vec::new(); | ||
| for chunk in evaluation.chunks(2) { | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Correctness
Security
Performance
Bugs & Errors
Code Simplicity
Overall
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -3,7 +3,7 @@ use std::marker::PhantomData; | |
| use std::time::Instant; | ||
|
|
||
| use lambdaworks_crypto::fiat_shamir::is_transcript::IsStarkTranscript; | ||
| use lambdaworks_math::fft::cpu::bit_reversing::{in_place_bit_reverse_permute, reverse_index}; | ||
| use lambdaworks_math::fft::cpu::bit_reversing::reverse_index; | ||
| use lambdaworks_math::fft::errors::FFTError; | ||
|
|
||
| use lambdaworks_math::field::traits::{IsField, IsSubFieldOf}; | ||
|
|
@@ -23,7 +23,7 @@ use crate::domain::new_domain; | |
| use crate::fri; | ||
| use crate::proof::stark::{DeepPolynomialOpenings, PolynomialOpenings}; | ||
| use crate::table::Table; | ||
| use crate::trace::{columns2rows_bit_reversed, LDETraceTable}; | ||
| use crate::trace::{columns2rows, LDETraceTable}; | ||
|
|
||
| use super::config::{BatchedMerkleTree, Commitment}; | ||
| use super::constraints::evaluator::ConstraintEvaluator; | ||
|
|
@@ -212,6 +212,28 @@ where | |
| } | ||
| } | ||
|
|
||
| /// Like [`evaluate_polynomial_on_lde_domain`], but returns evaluations in **bit-reversed order**. | ||
| /// Used for Merkle tree commitments where data must be bit-reversed, avoiding the | ||
| /// natural→bitrev round-trip. | ||
| fn evaluate_polynomial_on_lde_domain_bitrev<F, E>( | ||
| p: &Polynomial<FieldElement<E>>, | ||
| blowup_factor: usize, | ||
| domain_size: usize, | ||
| offset: &FieldElement<F>, | ||
| ) -> Result<Vec<FieldElement<E>>, FFTError> | ||
| where | ||
| F: IsFFTField + IsSubFieldOf<E>, | ||
| E: IsField, | ||
| { | ||
| let evaluations = | ||
| Polynomial::evaluate_offset_fft_bitrev(p, blowup_factor, Some(domain_size), offset)?; | ||
| let step = evaluations.len() / (domain_size * blowup_factor); | ||
| match step { | ||
| 1 => Ok(evaluations), | ||
| _ => Ok(evaluations.into_iter().step_by(step).collect()), | ||
| } | ||
| } | ||
|
|
||
| /// The functionality of a STARK prover providing methods to run the STARK Prove protocol | ||
| /// https://lambdaclass.github.io/lambdaworks/starks/protocol.html | ||
| /// The default implementation is complete and is compatible with Stone prover | ||
|
|
@@ -278,12 +300,15 @@ pub trait IsStarkProver< | |
| // Interpolate columns of `trace`. | ||
| let trace_polys = trace.compute_trace_polys_main::<Field>()?; | ||
|
|
||
| // Evaluate those polynomials t_j on the large domain D_LDE. | ||
| // Evaluate polynomials on LDE domain in natural order (for constraint evaluation). | ||
| let lde_trace_evaluations = | ||
| Self::compute_lde_trace_evaluations::<Field>(&trace_polys, domain)?; | ||
|
|
||
| // Compute commitment using fused bit-reverse + transpose (avoids cloning) | ||
| let lde_trace_permuted_rows = columns2rows_bit_reversed(&lde_trace_evaluations); | ||
| // Evaluate in bit-reversed order directly (for Merkle tree commitment). | ||
| // This avoids the natural→bitrev round-trip that columns2rows_bit_reversed did. | ||
| let lde_trace_bitrev = | ||
| Self::compute_lde_trace_evaluations_bitrev::<Field>(&trace_polys, domain)?; | ||
| let lde_trace_permuted_rows = columns2rows(lde_trace_bitrev); | ||
|
|
||
| let (lde_trace_merkle_tree, lde_trace_merkle_root) = | ||
| Self::batch_commit_main(&lde_trace_permuted_rows) | ||
|
|
@@ -329,11 +354,12 @@ pub trait IsStarkProver< | |
| // Interpolate columns of `trace`. | ||
| let trace_polys = trace.compute_trace_polys_aux::<Field>()?; | ||
|
|
||
| // Evaluate those polynomials t_j on the large domain D_LDE. | ||
| // Evaluate polynomials on LDE domain in natural order (for constraint evaluation). | ||
| let lde_trace_evaluations = Self::compute_lde_trace_evaluations(&trace_polys, domain)?; | ||
|
|
||
| // Compute commitment using fused bit-reverse + transpose (avoids cloning) | ||
| let lde_trace_permuted_rows = columns2rows_bit_reversed(&lde_trace_evaluations); | ||
| // Evaluate in bit-reversed order directly (for Merkle tree commitment). | ||
| let lde_trace_bitrev = Self::compute_lde_trace_evaluations_bitrev(&trace_polys, domain)?; | ||
| let lde_trace_permuted_rows = columns2rows(lde_trace_bitrev); | ||
|
|
||
| let (lde_trace_merkle_tree, lde_trace_merkle_root) = | ||
| Self::batch_commit_extension(&lde_trace_permuted_rows) | ||
|
|
@@ -377,6 +403,33 @@ pub trait IsStarkProver< | |
| .collect::<Result<Vec<Vec<FieldElement<E>>>, FFTError>>() | ||
| } | ||
|
|
||
| /// Like [`Self::compute_lde_trace_evaluations`], but returns columns in bit-reversed order. | ||
| /// Used for Merkle tree commitment where the leaves must be in bit-reversed order. | ||
| fn compute_lde_trace_evaluations_bitrev<E>( | ||
| trace_polys: &[Polynomial<FieldElement<E>>], | ||
| domain: &Domain<Field>, | ||
| ) -> Result<Vec<Vec<FieldElement<E>>>, FFTError> | ||
| where | ||
| E: IsSubFieldOf<FieldExtension>, | ||
| Field: IsSubFieldOf<E>, | ||
| { | ||
| #[cfg(not(feature = "parallel"))] | ||
| let trace_polys_iter = trace_polys.iter(); | ||
| #[cfg(feature = "parallel")] | ||
| let trace_polys_iter = trace_polys.par_iter(); | ||
|
|
||
| trace_polys_iter | ||
| .map(|poly| { | ||
| evaluate_polynomial_on_lde_domain_bitrev( | ||
| poly, | ||
| domain.blowup_factor, | ||
| domain.interpolation_domain_size, | ||
| &domain.coset_offset, | ||
| ) | ||
| }) | ||
| .collect::<Result<Vec<Vec<FieldElement<E>>>, FFTError>>() | ||
| } | ||
|
|
||
| /// Returns the result of the first round of the STARK Prove protocol. | ||
| fn round_1_randomized_air_with_preprocessing( | ||
| air: &dyn AIR<Field = Field, FieldExtension = FieldExtension, PublicInputs = PI>, | ||
|
|
@@ -429,54 +482,45 @@ pub trait IsStarkProver< | |
| }) | ||
| } | ||
|
|
||
| /// Returns the Merkle tree and the commitment to the evaluations of the parts of the | ||
| /// composition polynomial. | ||
| /// Commits to composition polynomial part evaluations that are already in bit-reversed order. | ||
| /// Transposes columns to rows, pairs consecutive rows, and builds the Merkle tree. | ||
| /// | ||
| /// Returns `None` if: | ||
| /// - `lde_composition_poly_parts_evaluations` is empty | ||
| /// - Any part has different length than others | ||
| /// - The LDE length is not even (required for pairing evaluations) | ||
| fn commit_composition_polynomial( | ||
| lde_composition_poly_parts_evaluations: &[Vec<FieldElement<FieldExtension>>], | ||
| /// Returns `None` if the input is empty, has zero-length parts, odd-length parts, or | ||
| /// parts of different lengths. | ||
| fn commit_composition_polynomial_bitrev( | ||
| lde_composition_poly_parts_bitrev: &[Vec<FieldElement<FieldExtension>>], | ||
| ) -> Option<(BatchedMerkleTree<FieldExtension>, Commitment)> | ||
| where | ||
| FieldElement<Field>: AsBytes + Sync + Send, | ||
| FieldElement<FieldExtension>: AsBytes + Sync + Send, | ||
| { | ||
| // Validate input: must have at least one part with at least one evaluation | ||
| let first_part = lde_composition_poly_parts_evaluations.first()?; | ||
| let first_part = lde_composition_poly_parts_bitrev.first()?; | ||
| let lde_len = first_part.len(); | ||
|
|
||
| if lde_len == 0 { | ||
| return None; | ||
| } | ||
|
|
||
| // LDE length must be even for the chunking below to work correctly | ||
| if lde_len % 2 != 0 { | ||
| if lde_len == 0 || lde_len % 2 != 0 { | ||
| return None; | ||
| } | ||
|
|
||
| // All parts must have the same length | ||
| if !lde_composition_poly_parts_evaluations | ||
| if !lde_composition_poly_parts_bitrev | ||
| .iter() | ||
| .all(|part| part.len() == lde_len) | ||
| { | ||
| return None; | ||
| } | ||
|
|
||
| let num_parts = lde_composition_poly_parts_evaluations.len(); | ||
| let mut lde_composition_poly_evaluations: Vec<Vec<_>> = (0..lde_len) | ||
| // Transpose columns to rows — data is already bit-reversed, no permutation needed. | ||
| let num_parts = lde_composition_poly_parts_bitrev.len(); | ||
| let lde_composition_poly_evaluations: Vec<Vec<_>> = (0..lde_len) | ||
| .map(|i| { | ||
| let mut row = Vec::with_capacity(num_parts); | ||
| for evaluation in lde_composition_poly_parts_evaluations.iter() { | ||
| for evaluation in lde_composition_poly_parts_bitrev.iter() { | ||
| row.push(evaluation[i].clone()); | ||
| } | ||
| row | ||
| }) | ||
| .collect(); | ||
|
|
||
| in_place_bit_reverse_permute(&mut lde_composition_poly_evaluations); | ||
|
|
||
| // Pair consecutive rows for Merkle leaves. | ||
| let mut lde_composition_poly_evaluations_merged = Vec::with_capacity(lde_len / 2); | ||
| let mut iter = lde_composition_poly_evaluations.into_iter(); | ||
| while let (Some(mut chunk0), Some(chunk1)) = (iter.next(), iter.next()) { | ||
|
|
@@ -526,6 +570,7 @@ pub trait IsStarkProver< | |
| } | ||
| let composition_poly_parts = composition_poly.break_in_parts(number_of_parts); | ||
|
|
||
| // Natural-order evaluations (used for opening queries via reverse_index). | ||
| let lde_composition_poly_parts_evaluations: Vec<_> = composition_poly_parts | ||
| .iter() | ||
| .map(|part| { | ||
|
|
@@ -538,8 +583,21 @@ pub trait IsStarkProver< | |
| }) | ||
| .collect::<Result<Vec<_>, _>>()?; | ||
|
|
||
| // Bit-reversed evaluations (for Merkle tree commitment — avoids natural→bitrev round-trip). | ||
| let lde_composition_poly_parts_bitrev: Vec<_> = composition_poly_parts | ||
| .iter() | ||
| .map(|part| { | ||
| evaluate_polynomial_on_lde_domain_bitrev( | ||
| part, | ||
| domain.blowup_factor, | ||
| domain.interpolation_domain_size, | ||
| &domain.coset_offset, | ||
| ) | ||
| }) | ||
| .collect::<Result<Vec<_>, _>>()?; | ||
|
|
||
| let Some((composition_poly_merkle_tree, composition_poly_root)) = | ||
| Self::commit_composition_polynomial(&lde_composition_poly_parts_evaluations) | ||
| Self::commit_composition_polynomial_bitrev(&lde_composition_poly_parts_bitrev) | ||
| else { | ||
| return Err(ProvingError::EmptyCommitment); | ||
| }; | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Correctness:
Security:
Bugs & Errors:
Performance:
Code Simplicity:
Overall, the proposed changes introduce modifications in FFT and polynomial evaluations while aiming to optimize bit-reversal requirements. These need careful verification in practice to ensure no edge cases or unintended errors arise during both typical and boundary-case usage. |
||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correctness
in_place_nr_2radix_fftmaintains the correctness properties of the FFT, especially the handling of edge cases like inputs of size one or power-of-two constraints. The call tois_power_of_two()should ensure this, but audit the helper function for correctness in recursion and base cases.Security
in_place_nr_2radix_fftoperates in constant time. If the algorithm's operations depend on secret data, ensure it does not lead to timing side-channel vulnerabilities.results, ensure it is zeroized to prevent unintended disclosure.Performance
input.to_vec()results in an allocation. Ensure this conversion is necessary, and consider using a different approach ifinputcan be operated on in-place.twiddlesare precomputed efficiently, and all inverse calculations (like inverses and field operations in the FFT) are optimized.Bugs & Errors
resultsandtwiddlesfor potential off-by-one errors.FFTError::InputError(input.len())is returned for non-power-of-two sizes, ensure this is a comprehensive error management strategy for other potential issues within the FFT process.Code Simplicity
fft_bitrevshould be used overfft.In summary, verify correctness in edge cases, constant-time operation of FFT logic, and efficient handling of inputs without unnecessary allocations. Enhanced documentation to aid understanding of when
fft_bitrevshould be preferred is also recommended before merging.