Performance tests Fabric
Each test measures mount, unmount, and re-render times, plus component-specific scenario times, and records baselines via snapshot matching.
cd packages/e2e-test-app-fabric
yarn perf # all tests
yarn perf -- --testPathPattern=FlatList # single component
yarn perf:update # update all baselines
yarn perf:update --testPathPattern=TouchableHighlight # update one baseline
`yarn perf:create -- --name=ComponentName` scaffolds a new `.perf-test.tsx` file with the correct structure, required-props detection, and category selection.
| Module | Purpose |
|---|---|
| `ComponentPerfTestBase` | Abstract base class; components provide props/scenarios, the framework handles measurement |
| `measurePerf()` | Core timing engine using React.Profiler (monotonic high-resolution clock, QueryPerformanceCounter on Windows) |
| `toMatchPerfSnapshot()` | Custom Jest matcher; compares results against `.perf-baseline.json` snapshots |
| `PerfProfiler` | React wrapper that captures `actualDuration` from Profiler callbacks |
| `snapshotManager` | Reads/writes/updates baseline JSON files |
| Threshold presets | strict, standard, relaxed, ci (configurable tolerance bands) |
| Scenarios | `MountScenario`, `UnmountScenario`, `RerenderScenario` |
| Reporters | `ConsoleReporter`, `MarkdownReporter`, `PerfJsonReporter` |
| CI utilities | `BaselineComparator` for CI regression detection |
| Component | Scenario | Baseline (ms) | Max Regression | Min Δ (ms) | Notes |
|---|---|---|---|---|---|
| View | mount | 0 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 0 | 10 % | 3 | |
| | nested-views-50 | 4 | 15 % | 5 | heavier DOM |
| | nested-views-100 | 7 | 15 % | 5 | heavier DOM |
| | stress-views-500 | 10 | 10 % | 10 | stress gate |
| | with-shadow | 0 | 10 % | 3 | |
| | with-border-radius | 0 | 10 % | 3 | |
| Text | mount | 0 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 0 | 10 % | 3 | |
| | long-text-1000 | 0 | 10 % | 3 | |
| | nested-text | 0 | 10 % | 3 | |
| | styled-text | 0 | 10 % | 3 | |
| | multiple-text-100 | 7 | 10 % | 10 | stress gate |
| Image | mount | 0 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 0 | 10 % | 3 | |
| | with-resize-mode | 0 | 10 % | 3 | |
| | with-border-radius | 0 | 10 % | 3 | |
| | with-tint-color | 0 | 10 % | 3 | |
| | with-blur-radius | 0 | 10 % | 3 | |
| | with-accessibility | 0 | 10 % | 3 | |
| | multiple-images-10 | 1 | 10 % | 5 | bulk noise |
| | multiple-images-50 | 4 | 15 % | 5 | bulk noise |
| | multiple-images-100 | 8 | 10 % | 10 | stress gate |
| TextInput | mount | 0 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 0 | 10 % | 3 | |
| | multiline | 0 | 10 % | 3 | |
| | with-value | 0 | 10 % | 3 | |
| | styled-input | 0 | 10 % | 3 | |
| | multiple-text-inputs-100 | 7 | 10 % | 10 | stress gate |
| Switch | mount | 0 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 0 | 10 % | 3 | |
| | disabled | 0.5 | 10 % | 3 | |
| | custom-colors | 0 | 10 % | 3 | |
| | multiple-switches-10 | 1 | 10 % | 5 | bulk noise |
| | multiple-switches-50 | 8 | 15 % | 5 | bulk noise |
| | multiple-switches-100 | 16 | 10 % | 10 | stress gate |
| Button | mount | 1 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 1 | 10 % | 3 | |
| | disabled | 1 | 10 % | 3 | |
| | with-color | 1 | 10 % | 3 | |
| | with-accessibility | 1 | 10 % | 3 | |
| | multiple-buttons-10 | 5 | 10 % | 5 | bulk noise |
| | multiple-buttons-50 | 26 | 15 % | 5 | bulk noise |
| | multiple-buttons-100 | 19 | 10 % | 10 | stress gate |
| ActivityIndicator | mount | 0 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 0 | 10 % | 3 | |
| | multiple-indicators-10 | 1 | 10 % | 5 | bulk noise |
| | multiple-indicators-50 | 4 | 15 % | 5 | bulk noise |
| | multiple-indicators-100 | 7 | 10 % | 10 | stress gate |
| ScrollView | mount | 0 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 0.5 | 10 % | 3 | |
| | with-children-20 | 3 | 10 % | 3 | |
| | with-children-100 | 15 | 15 % | 5 | heavy |
| | horizontal | 3 | 10 % | 3 | |
| | sticky-headers | 3 | 10 % | 3 | |
| | nested-scroll-views | 1 | 10 % | 5 | |
| | with-children-500 | 19 | 10 % | 10 | stress gate |
| Modal | mount | 0 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 0.5 | 10 % | 3 | |
| | with-rich-content | 2 | 10 % | 3 | |

| Component | Scenario | Baseline (ms) | Max Regression | Min Δ (ms) | Notes |
|---|---|---|---|---|---|
| Pressable | mount | 0.5 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 1 | 10 % | 3 | |
| | nested-pressables | 1 | 10 % | 3 | |
| | multiple-pressables-10 | 3 | 10 % | 5 | bulk noise |
| | multiple-pressables-50 | 15 | 15 % | 5 | bulk noise |
| | multiple-pressables-100 | 12 | 10 % | 10 | stress gate |
| TouchableOpacity | mount | 1 | 10 % | 3 | |
| | rerender | 1.5 | 10 % | 3 | |
| | nested-touchables | 1.5 | 10 % | 3 | |
| | multiple-touchables-10 | 6 | 10 % | 5 | bulk noise |
| | multiple-touchables-50 | 29 | 15 % | 5 | bulk noise |
| | multiple-touchables-100 | 30 | 10 % | 10 | stress gate |
| TouchableHighlight | mount | 1 | 10 % | 3 | |
| | rerender | 0.5 | 10 % | 3 | |
| | nested-touchables | 1 | 10 % | 3 | |
| | multiple-touchables-10 | 2 | 10 % | 5 | bulk noise |
| | multiple-touchables-50 | 12.5 | 15 % | 5 | bulk noise |
| | multiple-touchables-100 | 22.5 | 10 % | 10 | stress gate |

| Component | Scenario | Baseline (ms) | Max Regression | Min Δ (ms) | Notes |
|---|---|---|---|---|---|
| FlatList | mount | 4 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 9 | 10 % | 3 | |
| | with-10-items | 4 | 10 % | 3 | |
| | with-100-items | 5 | 10 % | 5 | |
| | with-500-items | 5 | 15 % | 10 | large list |
| | horizontal | 4.5 | 10 % | 5 | |
| | with-separator | 6 | 10 % | 5 | |
| | with-header-footer | 2 | 10 % | 5 | |
| | with-empty-list | 1 | 10 % | 3 | |
| | with-get-item-layout | 2 | 10 % | 5 | |
| | inverted | 2 | 10 % | 5 | |
| | with-1000-items | 4 | 15 % | 10 | stress gate (virtualized) |
| | with-num-columns | 3 | 10 % | 5 | |
| SectionList | mount | 5 | 10 % | 3 | |
| | unmount | 0 | 10 % | 3 | |
| | rerender | 11 | 10 % | 3 | |
| | 3-sections × 5-items | 5 | 10 % | 5 | |
| | 5-sections × 10-items | 6 | 10 % | 5 | |
| | 10-sections × 20-items | 5.5 | 15 % | 10 | 200 items |
| | 20-sections × 10-items | 5.5 | 15 % | 10 | 200 items |
| | with-section-separator | — | 10 % | 5 | |
| | with-item-separator | — | 10 % | 5 | |
| | with-header-footer | — | 10 % | 5 | |
| | with-section-footer | — | 10 % | 5 | |
| | sticky-section-headers | — | 10 % | 5 | |
| | with-50-sections-20-items | 2 | 15 % | 10 | stress gate (virtualized) |
| | with-empty-list | 0 | 10 % | 3 | |
ComponentPerfTest extends ComponentPerfTestBase
├── componentName() → 'FlatList'
├── baseProps() → minimal valid props
├── scenarios() → [{ name, props, description }]
└── (optional) renderChildren(), wrapComponent()
measurePerf(test, scenario)
├── renders via PerfProfiler
├── captures actualDuration from React.Profiler
├── runs mount / unmount / rerender cycles
└── returns PerfMetrics
expect(metrics).toMatchPerfSnapshot()
├── loads/creates .perf-baseline.json
├── compares against thresholds
└── updates baseline when --updateSnapshot
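
As a rough illustration of the contract outlined above, the sketch below defines a stand-in base class and one subclass. Only the method names `componentName()`, `baseProps()`, and `scenarios()` and the `{ name, props, description }` scenario shape come from this document; the `Scenario` interface, the `propsFor()` helper, and the `makeItems()` function are hypothetical, and the real `ComponentPerfTestBase` in `@react-native-windows/perf-testing` may differ.

```typescript
// Hypothetical stand-in types; the real package exports may have other shapes.
interface Scenario {
  name: string;
  props: Record<string, unknown>;
  description: string;
}

abstract class ComponentPerfTestBase {
  abstract componentName(): string;
  abstract baseProps(): Record<string, unknown>;
  abstract scenarios(): Scenario[];

  // Hypothetical helper: scenario props override the minimal base props.
  propsFor(scenario: Scenario): Record<string, unknown> {
    return { ...this.baseProps(), ...scenario.props };
  }
}

// Hypothetical data factory used only by this sketch.
function makeItems(count: number): { id: string }[] {
  return Array.from({ length: count }, (_, i) => ({ id: String(i) }));
}

class FlatListPerfTest extends ComponentPerfTestBase {
  componentName(): string {
    return 'FlatList';
  }

  baseProps(): Record<string, unknown> {
    return { data: [], renderItem: () => null };
  }

  scenarios(): Scenario[] {
    return [
      {
        name: 'with-10-items',
        props: { data: makeItems(10) },
        description: 'Small list',
      },
      {
        name: 'horizontal',
        props: { horizontal: true },
        description: 'Horizontal layout',
      },
    ];
  }
}
```

In the real framework each scenario would then be passed to `measurePerf(test, scenario)` and the result checked with `expect(metrics).toMatchPerfSnapshot()`, as in the flow above.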
packages/e2e-test-app-fabric/
├── jest.perf.config.js # Jest config (maxWorkers:1, .perf-test pattern)
├── jest.perf.setup.ts # Test setup (registers toMatchPerfSnapshot matcher)
└── test/__perf__/
    ├── core/ # 9 core component tests
    │   ├── View.perf-test.tsx
    │   ├── Text.perf-test.tsx
    │   ├── TextInput.perf-test.tsx
    │   ├── Button.perf-test.tsx
    │   ├── Image.perf-test.tsx
    │   ├── ScrollView.perf-test.tsx
    │   ├── Switch.perf-test.tsx
    │   ├── Modal.perf-test.tsx
    │   ├── ActivityIndicator.perf-test.tsx
    │   └── __perf_snapshots__/ # Baseline JSONs (one per test)
    ├── interactive/ # 3 interactive component tests
    │   ├── Pressable.perf-test.tsx
    │   ├── TouchableOpacity.perf-test.tsx
    │   ├── TouchableHighlight.perf-test.tsx
    │   └── __perf_snapshots__/
    └── list/ # 2 list component tests
        ├── FlatList.perf-test.tsx
        ├── SectionList.perf-test.tsx
        └── __perf_snapshots__/
packages/@react-native-windows/perf-testing/src/
├── index.ts # Public API exports
├── base/
│   └── ComponentPerfTestBase.ts # Abstract base class for tests
├── core/
│   ├── measurePerf.ts # Timing engine (React.Profiler + performance.now)
│   ├── PerfProfiler.tsx # React.Profiler wrapper
│   └── statistics.ts # mean, median, stdDev
├── interfaces/
│   ├── IComponentPerfTest.ts # Test interface contract
│   ├── PerfMetrics.ts # Result shape
│   └── PerfThreshold.ts # Threshold config shape
├── matchers/
│   ├── toMatchPerfSnapshot.ts # Custom Jest matcher
│   └── snapshotManager.ts # Baseline file read/write
├── scenarios/
│   ├── MountScenario.ts
│   ├── UnmountScenario.ts
│   └── RerenderScenario.ts
├── config/
│   ├── defaultConfig.ts # Default runs, warmup, thresholds
│   └── thresholdPresets.ts # strict / standard / relaxed / ci
├── reporters/
│   ├── ConsoleReporter.ts # Terminal output
│   └── MarkdownReporter.ts # .md report generation
└── ci/
    ├── PerfJsonReporter.ts # JSON results for CI artifacts
    └── BaselineComparator.ts # Regression detection
vnext/Scripts/perf/
├── create-perf-test.js # CLI scaffold generator (yarn perf:create)
├── compare-results.js # CI baseline comparison
└── post-pr-comment.js # GitHub PR comment poster
Simple median-vs-median comparison is unreliable at millisecond scale — system noise causes random failures. We use three gates, each proven in production at scale:
- CV Gate: If the coefficient of variation (`stdDev/mean`) exceeds the threshold, the measurement is too noisy to compare; warn instead of fail.
- Mann-Whitney U Test: Non-parametric rank-based hypothesis test on the raw `durations[]`. Only flags a regression when the difference is statistically significant (p < 0.05), not just numerically larger. Chosen over the t-test because perf data is rarely normally distributed.
- Gate / Track Mode: Stable tests use `gate` (block CI). Inherently variable bulk scenarios use `track` (warn only, never block).
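
To make the first two gates concrete, here is a minimal sketch of a coefficient-of-variation helper and a two-sided Mann-Whitney U test. It is not the package's `statistics.ts`; all function names are hypothetical. The p-value uses the normal approximation with a tie correction and no continuity correction, which is an assumption about how such a test might be implemented, and the error function is the standard Abramowitz-Stegun polynomial approximation.

```typescript
function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Sample standard deviation (n - 1 denominator).
function stdDev(xs: number[]): number {
  const m = mean(xs);
  const variance =
    xs.reduce((acc, x) => acc + (x - m) ** 2, 0) / (xs.length - 1);
  return Math.sqrt(variance);
}

// CV = stdDev / mean. Assumption: an all-zero sample is treated as CV 0.
function coefficientOfVariation(xs: number[]): number {
  const m = mean(xs);
  return m === 0 ? 0 : stdDev(xs) / m;
}

// Abramowitz-Stegun 7.1.26 approximation of erf (absolute error about 1.5e-7).
function erf(x: number): number {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) *
      t +
      0.254829592) *
    t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

function normalCdf(z: number): number {
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

// Two-sided Mann-Whitney U test via the normal approximation.
function mannWhitneyU(a: number[], b: number[]): { u: number; p: number } {
  const n1 = a.length;
  const n2 = b.length;
  const n = n1 + n2;
  const pooled = [
    ...a.map((v) => ({ v, fromA: true })),
    ...b.map((v) => ({ v, fromA: false })),
  ].sort((x, y) => x.v - y.v);

  let rankSumA = 0;
  let tieTerm = 0; // sum of (t^3 - t) over groups of tied values
  let i = 0;
  while (i < n) {
    let j = i;
    while (j + 1 < n && pooled[j + 1].v === pooled[i].v) j++;
    const avgRank = (i + j) / 2 + 1; // tied values share the average 1-based rank
    const t = j - i + 1;
    tieTerm += t ** 3 - t;
    for (let k = i; k <= j; k++) {
      if (pooled[k].fromA) rankSumA += avgRank;
    }
    i = j + 1;
  }

  const u1 = rankSumA - (n1 * (n1 + 1)) / 2;
  const u = Math.min(u1, n1 * n2 - u1);
  const mu = (n1 * n2) / 2;
  const sigma = Math.sqrt(((n1 * n2) / 12) * (n + 1 - tieTerm / (n * (n - 1))));
  if (!(sigma > 0)) return { u, p: 1 }; // every value tied: no evidence of a shift
  const z = (u - mu) / sigma;
  return { u, p: Math.min(1, 2 * normalCdf(-Math.abs(z))) };
}
```

With small run counts an exact U distribution would be more accurate than the normal approximation; the sketch favors brevity over that refinement.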
| Who | What | Link |
|---|---|---|
| Google | Chrome Catapult uses Mann-Whitney U in `getDifferenceSignificance()` to detect perf regressions across histogram samples | chromium.googlesource.com/catapult: How to Write Metrics |
| Microsoft | BenchmarkDotNet uses CV-based noise detection and statistical significance gates to suppress unstable benchmark results | benchmarkdotnet.org: How It Works |
| Meta | React Profiler `actualDuration` provides component-level render timing, the same API this framework builds on | react.dev: Profiler API |
Every test result passes through this decision pipeline (in toMatchPerfSnapshot and BaselineComparator):
measured durations[]
│
▼
CV > maxCV? ──yes──▶ SKIP (too noisy to judge) → warn only
│ no
▼
Mann-Whitney U
p ≥ 0.05? ──yes──▶ PASS (not statistically significant)
│ no
▼
% change > maxDurationIncrease%
AND absolute Δ > minAbsoluteDelta ms?
│
├── no ──▶ PASS
│ yes
▼
mode?
├── gate ──▶ FAIL CI
└── track ─▶ WARN only
Both the percentage and absolute delta gates must trip simultaneously. This prevents a 1 ms → 2 ms jump (100 % but only 1 ms) from blocking CI.
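
The pipeline above can be sketched as a single pure function. The threshold names `maxCV`, `maxDurationIncrease`, and `minAbsoluteDelta` and the `gate`/`track` modes come from the diagram; the `Comparison` shape, the `decide()` name, the verdict strings, and the handling of a 0 ms baseline are assumptions made for illustration, not the actual `toMatchPerfSnapshot` or `BaselineComparator` code.

```typescript
type Mode = 'gate' | 'track';
type Verdict = 'skip' | 'pass' | 'warn' | 'fail';

interface Threshold {
  maxDurationIncrease: number; // allowed regression, in percent
  minAbsoluteDelta: number; // allowed regression, in ms
  maxCV: number;
  mode: Mode;
}

// Hypothetical input shape: summary values computed from durations[].
interface Comparison {
  baselineMedian: number; // ms
  currentMedian: number; // ms
  cv: number; // coefficient of variation of the current run
  pValue: number; // Mann-Whitney U p-value, current vs. baseline
}

function decide(c: Comparison, t: Threshold): Verdict {
  // 1. CV gate: too noisy to judge, so warn only.
  if (c.cv > t.maxCV) return 'skip';

  // 2. Significance gate: not statistically significant.
  if (c.pValue >= 0.05) return 'pass';

  // 3. Both the percentage and the absolute delta must be exceeded.
  const delta = c.currentMedian - c.baselineMedian;
  // Assumption: with a 0 ms baseline, any increase counts as an unbounded
  // percentage, so the absolute delta alone decides.
  const percent =
    c.baselineMedian > 0
      ? (delta / c.baselineMedian) * 100
      : delta > 0
        ? Infinity
        : 0;
  const regressed =
    percent > t.maxDurationIncrease && delta > t.minAbsoluteDelta;
  if (!regressed) return 'pass';

  // 4. Mode decides whether the regression blocks CI.
  return t.mode === 'gate' ? 'fail' : 'warn';
}
```

For the 1 ms → 2 ms case mentioned above, `percent` is 100 but `delta` is only 1 ms, so with a 3 ms `minAbsoluteDelta` the function returns `'pass'`.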
| Preset | Max Regression | Min Δ (ms) | Max Renders | Min Runs | Max CV | Mode |
|---|---|---|---|---|---|---|
| core | 10 % | 3 | 2 | 10 | 0.40 | gate |
| list | 15 % | 5 | 5 | 5 | 0.50 | gate |
| interactive | 20 % | 5 | 10 | 10 | 0.50 | gate |
| community | 25 % | 5 | 15 | 5 | 0.60 | track |
| default | 10 % | 3 | 5 | 10 | 0.50 | gate |
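
One way to express the presets above in code is a typed lookup table with a fallback to `default`. The values mirror the table; the field names `maxRenderCount` and `minRuns`, the `PRESETS` constant, and the `resolveThreshold()` helper are hypothetical, so the real `thresholdPresets.ts` may be organized differently.

```typescript
type PresetName = 'core' | 'list' | 'interactive' | 'community' | 'default';

interface ThresholdPreset {
  maxDurationIncrease: number; // Max Regression, in percent
  minAbsoluteDelta: number; // Min Δ, in ms
  maxRenderCount: number; // Max Renders (hypothetical field name)
  minRuns: number; // Min Runs (hypothetical field name)
  maxCV: number; // Max CV
  mode: 'gate' | 'track';
}

const PRESETS: Record<PresetName, ThresholdPreset> = {
  core: { maxDurationIncrease: 10, minAbsoluteDelta: 3, maxRenderCount: 2, minRuns: 10, maxCV: 0.4, mode: 'gate' },
  list: { maxDurationIncrease: 15, minAbsoluteDelta: 5, maxRenderCount: 5, minRuns: 5, maxCV: 0.5, mode: 'gate' },
  interactive: { maxDurationIncrease: 20, minAbsoluteDelta: 5, maxRenderCount: 10, minRuns: 10, maxCV: 0.5, mode: 'gate' },
  community: { maxDurationIncrease: 25, minAbsoluteDelta: 5, maxRenderCount: 15, minRuns: 5, maxCV: 0.6, mode: 'track' },
  default: { maxDurationIncrease: 10, minAbsoluteDelta: 3, maxRenderCount: 5, minRuns: 10, maxCV: 0.5, mode: 'gate' },
};

// Look up a preset by name, fall back to `default` for unknown names,
// then apply any per-scenario overrides on top.
function resolveThreshold(
  name: string,
  overrides: Partial<ThresholdPreset> = {},
): ThresholdPreset {
  const known = Object.prototype.hasOwnProperty.call(PRESETS, name);
  const base = known ? PRESETS[name as PresetName] : PRESETS.default;
  return { ...base, ...overrides };
}
```

A stress scenario could then widen only its absolute delta, for example `resolveThreshold('core', { minAbsoluteDelta: 10 })`, which matches the "stress gate" rows that pair a 10 % limit with a 10 ms delta.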