perf: const_eq could be ~200 cycles faster on x86_64 with counter based implementation, specifically for U256 #545

@malik672

Description

While investigating the assembly output of const_eq, I noticed that a counter-based implementation generates significantly better code on x86_64 (for U256) than the current boolean-accumulation approach.

Current Implementation

pub const fn const_eq(&self, other: &Self) -> bool {
    let a = self.as_limbs();
    let b = other.as_limbs();
    let mut i = 0;
    let mut r = true;
    // Accumulate with `&=` so every limb is compared (no early exit).
    while i < LIMBS {
        r &= a[i] == b[i];
        i += 1;
    }
    r
}

Alternative Implementation

pub const fn const_eq(&self, other: &Self) -> bool {
    let a = self.as_limbs();
    let b = other.as_limbs();
    let mut equal_count = 0usize;
    let mut i = 0;
    // Count matching limbs instead of AND-ing booleans.
    while i < LIMBS {
        equal_count += (a[i] == b[i]) as usize;
        i += 1;
    }
    // Equal only if every limb matched.
    equal_count == LIMBS
}
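
The two variants can be checked against each other outside the library. Below is a minimal sketch, assuming a U256 is stored as four u64 limbs. The free functions `eq_bool_accum` and `eq_counter`, the `LIMBS` constant, and the bare array type are hypothetical stand-ins for illustration, not the crate's API:

```rust
/// Assumption: a 256-bit integer stored as four 64-bit limbs.
pub const LIMBS: usize = 4;

/// Boolean-accumulation variant, mirroring the current implementation.
pub const fn eq_bool_accum(a: &[u64; LIMBS], b: &[u64; LIMBS]) -> bool {
    let mut i = 0;
    let mut r = true;
    while i < LIMBS {
        r &= a[i] == b[i];
        i += 1;
    }
    r
}

/// Counter-based variant, mirroring the proposed implementation.
pub const fn eq_counter(a: &[u64; LIMBS], b: &[u64; LIMBS]) -> bool {
    let mut equal_count = 0usize;
    let mut i = 0;
    while i < LIMBS {
        equal_count += (a[i] == b[i]) as usize;
        i += 1;
    }
    equal_count == LIMBS
}

/// Returns true if both variants agree on the given pair of inputs.
pub fn variants_agree(a: &[u64; LIMBS], b: &[u64; LIMBS]) -> bool {
    eq_bool_accum(a, b) == eq_counter(a, b)
}
```

Both functions visit every limb and return the same result for any input, so the change should only affect the generated code, not behavior. Pasting the two functions into a compiler explorer with optimizations enabled is one way to compare the assembly for a given target.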

Performance Impact

@prestwich I'm not really sure 200 cycles will translate to any measurable difference, but since the gain comes from vectorization, maybe it will.
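
One rough way to see whether the difference is measurable is to time many calls of each variant. This is only a sketch: `time_eq` is a hypothetical helper, the four-limb array stands in for U256, and a wall-clock loop like this is much coarser than a cycle-accurate benchmark.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: calls `f` on a fixed pair of equal inputs `iters` times.
/// Returns the elapsed wall-clock time and how many calls returned `true`.
pub fn time_eq<F>(f: F, iters: u32) -> (Duration, u32)
where
    F: Fn(&[u64; 4], &[u64; 4]) -> bool,
{
    // Assumption: four u64 limbs stand in for a U256.
    let a = [1u64, 2, 3, 4];
    let b = [1u64, 2, 3, 4];
    let mut hits = 0u32;
    let start = Instant::now();
    for _ in 0..iters {
        // black_box keeps the optimizer from hoisting or removing the call.
        hits += black_box(f(black_box(&a), black_box(&b))) as u32;
    }
    (start.elapsed(), hits)
}
```

Passing each equality variant as `f` (for example a closure wrapping it) and building in release mode gives a first-order comparison. The result depends on how the call is inlined into its surroundings, so it should be treated as a hint rather than a confirmation of the cycle counts above.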
