Performance tests Fabric

Abhijeet Jha edited this page Feb 25, 2026 · 8 revisions

Performance Testing for React Native Windows

Each test measures mount, unmount, and re-render times, plus timings for component-specific scenarios, and records baselines via snapshot matching.

How to Run

cd packages/e2e-test-app-fabric
yarn perf                                              # all tests
yarn perf -- --testPathPattern=FlatList                 # single component
yarn perf:update                                       # update all baselines
yarn perf:update --testPathPattern=TouchableHighlight   # update one baseline

CLI Generator

Running yarn perf:create -- --name=ComponentName scaffolds a new .perf-test.tsx file with the correct structure, required-prop detection, and category selection.

What's Included

Perf Testing Framework (@react-native-windows/perf-testing)

| Module | Purpose |
|---|---|
| ComponentPerfTestBase | Abstract base class: components provide props/scenarios, the framework handles measurement |
| measurePerf() | Core timing engine using React.Profiler (monotonic high-resolution clock; QueryPerformanceCounter on Windows) |
| toMatchPerfSnapshot() | Custom Jest matcher that compares results against .perf-baseline.json snapshots |
| PerfProfiler | React wrapper that captures actualDuration from Profiler callbacks |
| snapshotManager | Reads/writes/updates baseline JSON files |
| Threshold presets | strict, standard, relaxed, ci (configurable tolerance bands) |
| Scenarios | MountScenario, UnmountScenario, RerenderScenario |
| Reporters | ConsoleReporter, MarkdownReporter, PerfJsonReporter |
| CI utilities | BaselineComparator for CI regression detection |
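
To make the measurement step concrete, here is a minimal sketch of a warm-up-then-measure timing loop with the summary statistics listed for statistics.ts. It is not the framework's implementation: `timeRuns`, `summarize`, and `TimingSummary` are hypothetical names, the loop times a plain callback with `performance.now()` instead of reading `actualDuration` from React.Profiler, and the sample (n - 1) standard deviation is an assumption.

```typescript
// Hypothetical stand-in for a timing loop; names and shapes are illustrative only.
interface TimingSummary {
  durations: number[]; // one entry per measured run, in ms
  mean: number;
  median: number;
  stdDev: number;
}

function mean(xs: number[]): number {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

function median(xs: number[]): number {
  const sorted = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
}

// Sample standard deviation (n - 1 denominator); assumes at least two samples.
function stdDev(xs: number[]): number {
  const m = mean(xs);
  const variance = xs.reduce((sum, x) => sum + (x - m) ** 2, 0) / (xs.length - 1);
  return Math.sqrt(variance);
}

function summarize(durations: number[]): TimingSummary {
  return { durations, mean: mean(durations), median: median(durations), stdDev: stdDev(durations) };
}

// Runs `work` a few times untimed, then records one duration per measured run.
function timeRuns(work: () => void, runs: number, warmupRuns: number): TimingSummary {
  for (let i = 0; i < warmupRuns; i++) {
    work(); // warm-up results are discarded
  }
  const durations: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    work();
    durations.push(performance.now() - start);
  }
  return summarize(durations);
}

// Usage: time a small CPU-bound task 10 times after 3 warm-up runs.
const summary = timeRuns(() => {
  let acc = 0;
  for (let i = 0; i < 10_000; i++) acc += Math.sqrt(i);
}, 10, 3);
console.log(summary.durations.length, summary.median);
```

Keeping the raw `durations` array alongside the summary matters because the rank-based comparison described later operates on the individual samples, not on the median alone.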

Core Components (9 suites)

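| Component | Scenario | Baseline (ms) | Max Regression | Min Δ (ms) | Notes |
|---|---|---|---|---|---|
| View | mount | 0 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 0 | 10 % | 3 | |
| | nested-views-50 | 4 | 15 % | 5 | heavier DOM |
| | nested-views-100 | 7 | 15 % | 5 | heavier DOM |
| | stress-views-500 | 10 | 10 % | 10 | stress gate |
| | with-shadow | 0 | 10 % | 3 | |
| | with-border-radius | 0 | 10 % | 3 | |
| Text | mount | 0 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 0 | 10 % | 3 | |
| | long-text-1000 | 0 | 10 % | 3 | |
| | nested-text | 0 | 10 % | 3 | |
| | styled-text | 0 | 10 % | 3 | |
| | multiple-text-100 | 7 | 10 % | 10 | stress gate |
| Image | mount | 0 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 0 | 10 % | 3 | |
| | with-resize-mode | 0 | 10 % | 3 | |
| | with-border-radius | 0 | 10 % | 3 | |
| | with-tint-color | 0 | 10 % | 3 | |
| | with-blur-radius | 0 | 10 % | 3 | |
| | with-accessibility | 0 | 10 % | 3 | |
| | multiple-images-10 | 1 | 10 % | 5 | bulk noise |
| | multiple-images-50 | 4 | 15 % | 5 | bulk noise |
| | multiple-images-100 | 8 | 10 % | 10 | stress gate |
| TextInput | mount | 0 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 0 | 10 % | 3 | |
| | multiline | 0 | 10 % | 3 | |
| | with-value | 0 | 10 % | 3 | |
| | styled-input | 0 | 10 % | 3 | |
| | multiple-text-inputs-100 | 7 | 10 % | 10 | stress gate |
| Switch | mount | 0 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 0 | 10 % | 3 | |
| | disabled | 0.5 | 10 % | 3 | |
| | custom-colors | 0 | 10 % | 3 | |
| | multiple-switches-10 | 1 | 10 % | 5 | bulk noise |
| | multiple-switches-50 | 8 | 15 % | 5 | bulk noise |
| | multiple-switches-100 | 16 | 10 % | 10 | stress gate |
| Button | mount | 1 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 1 | 10 % | 3 | |
| | disabled | 1 | 10 % | 3 | |
| | with-color | 1 | 10 % | 3 | |
| | with-accessibility | 1 | 10 % | 3 | |
| | multiple-buttons-10 | 5 | 10 % | 5 | bulk noise |
| | multiple-buttons-50 | 26 | 15 % | 5 | bulk noise |
| | multiple-buttons-100 | 19 | 10 % | 10 | stress gate |
| ActivityIndicator | mount | 0 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 0 | 10 % | 3 | |
| | multiple-indicators-10 | 1 | 10 % | 5 | bulk noise |
| | multiple-indicators-50 | 4 | 15 % | 5 | bulk noise |
| | multiple-indicators-100 | 7 | 10 % | 10 | stress gate |
| ScrollView | mount | 0 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 0.5 | 10 % | 3 | |
| | with-children-20 | 3 | 10 % | 3 | |
| | with-children-100 | 15 | 15 % | 5 | heavy |
| | horizontal | 3 | 10 % | 3 | |
| | sticky-headers | 3 | 10 % | 3 | |
| | nested-scroll-views | 1 | 10 % | 5 | |
| | with-children-500 | 19 | 10 % | 10 | stress gate |
| Modal | mount | 0 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 0.5 | 10 % | 3 | |
| | with-rich-content | 2 | 10 % | 3 | |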

Interactive Components (3 suites)

| Component | Scenario | Baseline (ms) | Max Regression | Min Δ (ms) | Notes |
|---|---|---|---|---|---|
| Pressable | mount | 0.5 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 1 | 10 % | 3 | |
| | nested-pressables | 1 | 10 % | 3 | |
| | multiple-pressables-10 | 3 | 10 % | 5 | bulk noise |
| | multiple-pressables-50 | 15 | 15 % | 5 | bulk noise |
| | multiple-pressables-100 | 12 | 10 % | 10 | stress gate |
| TouchableOpacity | mount | 1 | 10 % | 3 | |
| | rerender | 1.5 | 10 % | 3 | |
| | nested-touchables | 1.5 | 10 % | 3 | |
| | multiple-touchables-10 | 6 | 10 % | 5 | bulk noise |
| | multiple-touchables-50 | 29 | 15 % | 5 | bulk noise |
| | multiple-touchables-100 | 30 | 10 % | 10 | stress gate |
| TouchableHighlight | mount | 1 | 10 % | 3 | |
| | rerender | 0.5 | 10 % | 3 | |
| | nested-touchables | 1 | 10 % | 3 | |
| | multiple-touchables-10 | 2 | 10 % | 5 | bulk noise |
| | multiple-touchables-50 | 12.5 | 15 % | 5 | bulk noise |
| | multiple-touchables-100 | 22.5 | 10 % | 10 | stress gate |

List Components (2 suites)

| Component | Scenario | Baseline (ms) | Max Regression | Min Δ (ms) | Notes |
|---|---|---|---|---|---|
| FlatList | mount | 4 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 9 | 10 % | 3 | |
| | with-10-items | 4 | 10 % | 3 | |
| | with-100-items | 5 | 10 % | 5 | |
| | with-500-items | 5 | 15 % | 10 | large list |
| | horizontal | 4.5 | 10 % | 5 | |
| | with-separator | 6 | 10 % | 5 | |
| | with-header-footer | 2 | 10 % | 5 | |
| | with-empty-list | 1 | 10 % | 3 | |
| | with-get-item-layout | 2 | 10 % | 5 | |
| | inverted | 2 | 10 % | 5 | |
| | with-1000-items | 4 | 15 % | 10 | stress gate (virtualized) |
| | with-num-columns | 3 | 10 % | 5 | |
| SectionList | mount | 5 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 11 | 10 % | 3 | |
| | 3-sections × 5-items | 5 | 10 % | 5 | |
| | 5-sections × 10-items | 6 | 10 % | 5 | |
| | 10-sections × 20-items | 5.5 | 15 % | 10 | 200 items |
| | 20-sections × 10-items | 5.5 | 15 % | 10 | 200 items |
| | with-section-separator | | 10 % | 5 | |
| | with-item-separator | | 10 % | 5 | |
| | with-header-footer | | 10 % | 5 | |
| | with-section-footer | | 10 % | 5 | |
| | sticky-section-headers | | 10 % | 5 | |
| | with-50-sections-20-items | 2 | 15 % | 10 | stress gate (virtualized) |
| | with-empty-list | 0 | 10 % | 3 | |

Test Architecture

ComponentPerfTest extends ComponentPerfTestBase
  ├── componentName()   → 'FlatList'
  ├── baseProps()       → minimal valid props
  ├── scenarios()       → [{ name, props, description }]
  └── (optional) renderChildren(), wrapComponent()

measurePerf(test, scenario)
  ├── renders via PerfProfiler
  ├── captures actualDuration from React.Profiler
  ├── runs mount / unmount / rerender cycles
  └── returns PerfMetrics

expect(metrics).toMatchPerfSnapshot()
  ├── loads/creates .perf-baseline.json
  ├── compares against thresholds
  └── updates baseline when --updateSnapshot
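
The first part of the diagram can be sketched as a small class hierarchy. This is a simplified, hypothetical stand-in: `PerfTestBaseSketch`, `Scenario`, and the `propsFor()` helper are illustrative names, and the real ComponentPerfTestBase in @react-native-windows/perf-testing may use different signatures. The scenario names are taken from the FlatList baseline table.

```typescript
// Hypothetical, simplified stand-ins; not the framework's real types.
type Props = Record<string, unknown>;

interface Scenario {
  name: string;
  props: Props;
  description: string;
}

abstract class PerfTestBaseSketch {
  abstract componentName(): string;
  abstract baseProps(): Props; // minimal valid props
  abstract scenarios(): Scenario[];

  // Illustrative helper: merge the minimal props with one scenario's overrides,
  // as a harness might do before rendering that scenario.
  propsFor(scenarioName: string): Props {
    const scenario = this.scenarios().find(s => s.name === scenarioName);
    if (!scenario) {
      throw new Error(`Unknown scenario: ${scenarioName}`);
    }
    return { ...this.baseProps(), ...scenario.props };
  }
}

class FlatListPerfTestSketch extends PerfTestBaseSketch {
  componentName(): string {
    return 'FlatList';
  }

  baseProps(): Props {
    return { data: [], horizontal: false };
  }

  scenarios(): Scenario[] {
    return [
      { name: 'horizontal', props: { horizontal: true }, description: 'Horizontal layout' },
      { name: 'inverted', props: { inverted: true }, description: 'Inverted scroll direction' },
    ];
  }
}

// Usage: list the scenarios and resolve the props for one of them.
const flatListTest = new FlatListPerfTestSketch();
console.log(flatListTest.scenarios().map(s => s.name));
console.log(flatListTest.propsFor('horizontal'));
```

Keeping each component's test down to three declarative methods is what lets the measurement and snapshot logic stay in one shared place.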

==========================================

Folder Structure

packages/e2e-test-app-fabric/
├── jest.perf.config.js                          # Jest config (maxWorkers:1, .perf-test pattern)
├── jest.perf.setup.ts                           # Test setup (registers toMatchPerfSnapshot matcher)
└── test/__perf__/
    ├── core/                                    # 9 core component tests
    │   ├── View.perf-test.tsx
    │   ├── Text.perf-test.tsx
    │   ├── TextInput.perf-test.tsx
    │   ├── Button.perf-test.tsx
    │   ├── Image.perf-test.tsx
    │   ├── ScrollView.perf-test.tsx
    │   ├── Switch.perf-test.tsx
    │   ├── Modal.perf-test.tsx
    │   ├── ActivityIndicator.perf-test.tsx
    │   └── __perf_snapshots__/                  # Baseline JSONs (one per test)
    ├── interactive/                             # 3 interactive component tests
    │   ├── Pressable.perf-test.tsx
    │   ├── TouchableOpacity.perf-test.tsx
    │   ├── TouchableHighlight.perf-test.tsx
    │   └── __perf_snapshots__/
    └── list/                                    # 2 list component tests
        ├── FlatList.perf-test.tsx
        ├── SectionList.perf-test.tsx
        └── __perf_snapshots__/

packages/@react-native-windows/perf-testing/src/
├── index.ts                                     # Public API exports
├── base/
│   └── ComponentPerfTestBase.ts                 # Abstract base class for tests
├── core/
│   ├── measurePerf.ts                           # Timing engine (React.Profiler + performance.now)
│   ├── PerfProfiler.tsx                         # React.Profiler wrapper
│   └── statistics.ts                            # mean, median, stdDev
├── interfaces/
│   ├── IComponentPerfTest.ts                    # Test interface contract
│   ├── PerfMetrics.ts                           # Result shape
│   └── PerfThreshold.ts                         # Threshold config shape
├── matchers/
│   ├── toMatchPerfSnapshot.ts                   # Custom Jest matcher
│   └── snapshotManager.ts                       # Baseline file read/write
├── scenarios/
│   ├── MountScenario.ts
│   ├── UnmountScenario.ts
│   └── RerenderScenario.ts
├── config/
│   ├── defaultConfig.ts                         # Default runs, warmup, thresholds
│   └── thresholdPresets.ts                      # strict / standard / relaxed / ci
├── reporters/
│   ├── ConsoleReporter.ts                       # Terminal output
│   └── MarkdownReporter.ts                      # .md report generation
└── ci/
    ├── PerfJsonReporter.ts                      # JSON results for CI artifacts
    └── BaselineComparator.ts                    # Regression detection

vnext/Scripts/perf/
├── create-perf-test.js                          # CLI scaffold generator (yarn perf:create)
├── compare-results.js                           # CI baseline comparison
└── post-pr-comment.js                           # GitHub PR comment poster

Approach Used: Statistical Stability Model

Simple median-vs-median comparison is unreliable at millisecond scale because system noise causes random failures. We use three gates instead; the statistical ones follow techniques used by established performance tooling (see References):

  1. CV Gate — If coefficient of variation (stdDev/mean) exceeds threshold, the measurement is too noisy to compare; warn instead of fail.

  2. Mann-Whitney U Test — Non-parametric rank-based hypothesis test on raw durations[]. Only flags a regression when the difference is statistically significant (p < 0.05), not just numerically larger. Chosen over t-test because perf data is rarely normally distributed.

  3. Gate / Track Mode — Stable tests use gate (block CI). Inherently variable bulk scenarios use track (warn only, never block).
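
The two statistical gates can be sketched in a few lines. This is an illustration under stated assumptions, not the framework's code: the function names are hypothetical, the p-value is two-sided and uses the normal approximation with tie correction (rough for very small samples), the CV uses the sample standard deviation, and a CV of 0 is returned when the mean is 0.

```typescript
// Coefficient of variation: stdDev / mean. Returns 0 for an all-zero sample
// (a convention chosen for this sketch).
function coefficientOfVariation(xs: number[]): number {
  const m = xs.reduce((s, x) => s + x, 0) / xs.length;
  if (m === 0) return 0;
  const variance = xs.reduce((s, x) => s + (x - m) ** 2, 0) / (xs.length - 1);
  return Math.sqrt(variance) / m;
}

// Error function via the Abramowitz-Stegun polynomial approximation (7.1.26).
function erf(x: number): number {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t + 0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

function normalCdf(z: number): number {
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

// Mann-Whitney U test on two samples of raw durations.
// `u` counts (baseline, current) pairs where current is larger; ties count 0.5.
function mannWhitneyU(baseline: number[], current: number[]): { u: number; z: number; pValue: number } {
  const n1 = baseline.length;
  const n2 = current.length;
  const pooled = [
    ...baseline.map(v => ({ v, fromCurrent: false })),
    ...current.map(v => ({ v, fromCurrent: true })),
  ].sort((a, b) => a.v - b.v);

  let rankSumCurrent = 0;
  let tieTerm = 0; // sum of (t^3 - t) over groups of tied values
  let i = 0;
  while (i < pooled.length) {
    let j = i;
    while (j < pooled.length && pooled[j].v === pooled[i].v) j++;
    const avgRank = (i + 1 + j) / 2; // average of ranks i+1 .. j
    const t = j - i;
    tieTerm += t ** 3 - t;
    for (let k = i; k < j; k++) {
      if (pooled[k].fromCurrent) rankSumCurrent += avgRank;
    }
    i = j;
  }

  const n = n1 + n2;
  const u = rankSumCurrent - (n2 * (n2 + 1)) / 2;
  const meanU = (n1 * n2) / 2;
  const varU = ((n1 * n2) / 12) * (n + 1 - tieTerm / (n * (n - 1)));
  if (varU === 0) return { u, z: 0, pValue: 1 }; // every value identical
  const z = (u - meanU) / Math.sqrt(varU);
  return { u, z, pValue: 2 * (1 - normalCdf(Math.abs(z))) };
}

// Usage: a clearly shifted sample is significant, an interleaved one is not.
const shifted = mannWhitneyU([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]);
const interleaved = mannWhitneyU([1, 3, 5, 7, 9], [2, 4, 6, 8, 10]);
console.log(shifted.pValue < 0.05, interleaved.pValue < 0.05); // true false
```

Because the test works on ranks, a single outlier run moves the result far less than it would move a mean-based comparison.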

References

| Who | What | Link |
|---|---|---|
| Google Chrome | Catapult uses Mann-Whitney U in getDifferenceSignificance() to detect perf regressions across histogram samples | chromium.googlesource.com/catapult (How to Write Metrics) |
| Microsoft | BenchmarkDotNet uses CV-based noise detection and statistical significance gates to suppress unstable benchmark results | benchmarkdotnet.org (How It Works) |
| Meta | React Profiler actualDuration provides component-level render timing, the same API this framework builds on | react.dev (Profiler API) |

Benchmark Gates & Regression Thresholds

How a regression is detected

Every test result passes through this decision pipeline (in toMatchPerfSnapshot and BaselineComparator):

measured durations[]
       │
       ▼
  CV > maxCV?  ──yes──▶  SKIP (too noisy to judge) → warn only
       │ no
       ▼
  Mann-Whitney U
  p ≥ 0.05?   ──yes──▶  PASS (not statistically significant)
       │ no
       ▼
  % change > maxDurationIncrease%
  AND absolute Δ > minAbsoluteDelta ms?
       │
    yes │ no
       ▼    ▼
  mode?    PASS
   │
  gate ──▶ FAIL CI
  track ─▶ WARN only

Both the percentage and absolute delta gates must trip simultaneously. This prevents a 1 ms → 2 ms jump (100 % but only 1 ms) from blocking CI.
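
The pipeline above can be written as one small function. This is a sketch, not the code in toMatchPerfSnapshot or BaselineComparator: `judge`, `Comparison`, and `Verdict` are hypothetical names, the CV and p-value are passed in precomputed, and treating any increase from a 0 ms baseline as an infinite percentage is an assumption made here to avoid dividing by zero.

```typescript
type Mode = 'gate' | 'track';
type Verdict = 'skip' | 'pass' | 'fail' | 'warn';

interface Threshold {
  maxCV: number;
  maxDurationIncrease: number; // percent
  minAbsoluteDelta: number; // ms
  mode: Mode;
}

interface Comparison {
  baselineMs: number;
  currentMs: number;
  cv: number; // coefficient of variation of the current run
  pValue: number; // from the Mann-Whitney U test
}

function judge(c: Comparison, t: Threshold): Verdict {
  // 1. Too noisy to judge: warn only.
  if (c.cv > t.maxCV) return 'skip';

  // 2. Difference is not statistically significant.
  if (c.pValue >= 0.05) return 'pass';

  // 3. Both the percentage gate and the absolute gate must trip.
  const delta = c.currentMs - c.baselineMs;
  const percent =
    c.baselineMs > 0 ? (delta / c.baselineMs) * 100 : delta > 0 ? Infinity : 0;
  const regressed = percent > t.maxDurationIncrease && delta > t.minAbsoluteDelta;
  if (!regressed) return 'pass';

  // 4. The mode decides whether CI is blocked.
  return t.mode === 'gate' ? 'fail' : 'warn';
}

// Usage with the values of the "core" preset (10 %, 3 ms, max CV 0.40, gate).
const core: Threshold = { maxCV: 0.4, maxDurationIncrease: 10, minAbsoluteDelta: 3, mode: 'gate' };

// 1 ms -> 2 ms is +100 % but only +1 ms, so the absolute gate holds it back.
console.log(judge({ baselineMs: 1, currentMs: 2, cv: 0.1, pValue: 0.01 }, core)); // pass

// 10 ms -> 15 ms is +50 % and +5 ms, so both gates trip.
console.log(judge({ baselineMs: 10, currentMs: 15, cv: 0.1, pValue: 0.01 }, core)); // fail
```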

Threshold Presets

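| Preset | Max Regression | Min Δ (ms) | Max Renders | Min Runs | Max CV | Mode |
|---|---|---|---|---|---|---|
| core | 10 % | 3 | 2 | 10 | 0.40 | gate |
| list | 15 % | 5 | 5 | 5 | 0.50 | gate |
| interactive | 20 % | 5 | 10 | 10 | 0.50 | gate |
| community | 25 % | 5 | 15 | 5 | 0.60 | track |
| default | 10 % | 3 | 5 | 10 | 0.50 | gate |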
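
Expressed as configuration, the presets might look like the sketch below. The values are copied from the table; the field names `maxDurationIncrease`, `minAbsoluteDelta`, and `maxCV` come from the pipeline diagram, while `maxRenders`, `minRuns`, `mode`, `PRESETS`, and `resolveThreshold()` are hypothetical names that may not match thresholdPresets.ts.

```typescript
// Hypothetical shape; the real PerfThreshold interface may differ.
interface PerfThresholdSketch {
  maxDurationIncrease: number; // percent
  minAbsoluteDelta: number; // ms
  maxRenders: number;
  minRuns: number;
  maxCV: number;
  mode: 'gate' | 'track';
}

type PresetName = 'core' | 'list' | 'interactive' | 'community' | 'default';

const PRESETS: Record<PresetName, PerfThresholdSketch> = {
  core: { maxDurationIncrease: 10, minAbsoluteDelta: 3, maxRenders: 2, minRuns: 10, maxCV: 0.4, mode: 'gate' },
  list: { maxDurationIncrease: 15, minAbsoluteDelta: 5, maxRenders: 5, minRuns: 5, maxCV: 0.5, mode: 'gate' },
  interactive: { maxDurationIncrease: 20, minAbsoluteDelta: 5, maxRenders: 10, minRuns: 10, maxCV: 0.5, mode: 'gate' },
  community: { maxDurationIncrease: 25, minAbsoluteDelta: 5, maxRenders: 15, minRuns: 5, maxCV: 0.6, mode: 'track' },
  default: { maxDurationIncrease: 10, minAbsoluteDelta: 3, maxRenders: 5, minRuns: 10, maxCV: 0.5, mode: 'gate' },
};

// Start from a preset and apply per-scenario overrides on top of it.
function resolveThreshold(
  preset: PresetName,
  overrides: Partial<PerfThresholdSketch> = {},
): PerfThresholdSketch {
  return { ...PRESETS[preset], ...overrides };
}

// Usage: a large-list scenario keeps the list preset but widens the absolute gate to 10 ms.
const largeList = resolveThreshold('list', { minAbsoluteDelta: 10 });
console.log(largeList.maxDurationIncrease, largeList.minAbsoluteDelta); // 15 10
```

Layering overrides on a preset keeps per-scenario exceptions visible at the call site instead of multiplying presets.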
