Commit db38fed

Clickin and claude committed
feat: add comprehensive benchmarking infrastructure
Completed extensive optimization study with systematic benchmarking:

**Optimization Study Results:**
- Tested 6 different optimization approaches
- All manual optimizations failed or regressed performance
- Function inlining: -5% performance, +180% memory (rejected)
- Manual attribute parsing: -1% performance (rejected)
- Fast entity decoding: -0.8% performance (rejected)
- Conclusion: Baseline is already well-optimized, trust V8 JIT

**Benchmark Infrastructure Added:**
- GC pressure measurement tools (benchmark-gc-pressure.ts)
- Memory tracking and profiling (memory-tracker.ts)
- Statistical analysis helpers (statistical-analysis.ts)
- Simple baseline benchmark (benchmark-baseline.ts)
- Comprehensive documentation (BENCHMARK-GUIDE.md)

**Key Features:**
- Measures performance, GC events, and memory usage
- Tests 9 different XML patterns
- Statistical validation (t-test, effect size)
- Prevents memory regressions (caught +180% issue)

**Performance Baseline:**
- Average throughput: 125 MB/s
- GC pressure: Very low (0 major GCs typical)
- Memory efficient: <10MB per parse

All code tested and working. Benchmark infrastructure ready for future use.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent fddafd1 commit db38fed

30 files changed: +7484 -1 lines changed

packages/benchmark/.gitignore

Lines changed: 16 additions & 0 deletions
@@ -32,3 +32,19 @@ report.[0-9]*.[0-9]*.[0-9]*.[0-9]*.json

# Finder (MacOS) folder config
.DS_Store

# Benchmark test data and results
test-data/*.xml
test-data/manifest.json
results/*.json
results/*.txt
results/*.csv
results/*.md
results/*.html
profiles/**/*
!profiles/.gitkeep

# V8 profiling output
isolate-*.log
*.cpuprofile
*.clinic-*
Lines changed: 270 additions & 0 deletions
@@ -0,0 +1,270 @@
# StaxXML Benchmark Guide

Simple guide for running performance benchmarks on the StaxXML parser.

---

## Quick Start

```bash
# 1. Generate test XML files (one time)
pnpm run generate:testdata

# 2. Run baseline benchmark
pnpm run bench:baseline

# 3. View results
# Results are displayed in console and saved to results/ directory
```

---

## Available Benchmarks

### Main Benchmark
```bash
pnpm run bench:baseline
```
- Tests 5 different XML patterns
- 100 iterations per pattern
- Measures performance and GC pressure
- Outputs summary table

### Quick Benchmark (faster)
```bash
pnpm run bench:baseline:quick
```
- Same tests but only 50 iterations
- Good for rapid iteration during development

### GC Pressure Example
```bash
pnpm run example:gc
```
- Demonstrates GC monitoring capabilities
- Shows how to measure memory allocation patterns

---

## Understanding Results

### Performance Metrics

```
Average time: 10.50 ms ← Time per parse
Min time: 8.71 ms ← Best case
Max time: 16.35 ms ← Worst case
Std deviation: 2.81 ms ← Consistency (lower is better)
Throughput: 132.10 MB/s ← Processing speed
Events/sec: 298220 ← Parser throughput
```

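As a rough illustration of how these numbers relate to each other, the sketch below derives them from a list of per-iteration timings. The names `summarizeTimings` and `TimingSummary` are hypothetical and are not taken from `benchmark-baseline.ts`; it also assumes that 1 MB means 1024 × 1024 bytes and that the standard deviation is the sample (n − 1) form, neither of which this guide confirms.

```typescript
// Hypothetical helper for illustration only; not the benchmark's actual code.
interface TimingSummary {
  averageMs: number;
  minMs: number;
  maxMs: number;
  stdDevMs: number;
  throughputMBps: number;
}

function summarizeTimings(timesMs: number[], fileSizeBytes: number): TimingSummary {
  if (timesMs.length === 0) {
    throw new Error("need at least one timing sample");
  }
  const n = timesMs.length;
  const averageMs = timesMs.reduce((sum, t) => sum + t, 0) / n;
  // Sample variance (n - 1 denominator); defined as 0 for a single sample.
  const variance =
    n > 1
      ? timesMs.reduce((sum, t) => sum + (t - averageMs) * (t - averageMs), 0) / (n - 1)
      : 0;
  // Assumption: 1 MB = 1024 * 1024 bytes.
  const megabytes = fileSizeBytes / (1024 * 1024);
  return {
    averageMs,
    minMs: Math.min(...timesMs),
    maxMs: Math.max(...timesMs),
    stdDevMs: Math.sqrt(variance),
    // Throughput = data size divided by average time in seconds.
    throughputMBps: megabytes / (averageMs / 1000),
  };
}

// Four made-up timings for a 2 MB input.
const summary = summarizeTimings([10, 12, 8, 10], 2 * 1024 * 1024);
console.log(summary);
```

Throughput is computed from the average time here, so a single slow iteration lowers it; a median-based variant would be less sensitive to outliers.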
### GC Metrics

```
Major GCs: 0 ← Full garbage collections
Minor GCs: 40 ← Partial garbage collections
Total GC time: 78.72 ms ← Time spent in GC
Avg GC pause: 1.49 ms ← Average pause duration
Heap delta: 2.53 MB ← Memory allocated
```

**Good GC metrics:**
- Major GCs: 0-5 (few is better)
- Minor GCs: Moderate (40-100 is normal)
- Avg GC pause: <2ms (lower is better)
- Heap delta: <10MB per parse (lower is better)

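The thresholds above can be checked mechanically. The following is a minimal sketch under stated assumptions: the `GcMetrics` shape and the `checkGcMetrics` function are hypothetical names, and only minor-GC counts above 100 are flagged because the guide gives 40-100 as the normal range.

```typescript
// Hypothetical shape; field names are illustrative, not the benchmark's JSON schema.
interface GcMetrics {
  majorGcs: number;
  minorGcs: number;
  avgPauseMs: number;
  heapDeltaMb: number;
}

// Returns one warning per threshold from the "Good GC metrics" list that is exceeded.
function checkGcMetrics(m: GcMetrics): string[] {
  const warnings: string[] = [];
  if (m.majorGcs > 5) warnings.push(`major GCs: ${m.majorGcs} (expected 0-5)`);
  if (m.minorGcs > 100) warnings.push(`minor GCs: ${m.minorGcs} (expected up to 100)`);
  if (m.avgPauseMs >= 2) warnings.push(`avg GC pause: ${m.avgPauseMs} ms (expected <2 ms)`);
  if (m.heapDeltaMb >= 10) warnings.push(`heap delta: ${m.heapDeltaMb} MB (expected <10 MB)`);
  return warnings;
}

// The sample output shown in this guide passes every check.
const sampleWarnings = checkGcMetrics({
  majorGcs: 0,
  minorGcs: 40,
  avgPauseMs: 1.49,
  heapDeltaMb: 2.53,
});

// A made-up run that breaks every threshold.
const badWarnings = checkGcMetrics({
  majorGcs: 6,
  minorGcs: 150,
  avgPauseMs: 2.5,
  heapDeltaMb: 12,
});

console.log(sampleWarnings, badWarnings);
```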
---

## Test XML Patterns

The benchmark tests these patterns:

| Pattern | Size | Description | Tests |
|---------|------|-------------|-------|
| **small-simple** | ~500B | Minimal XML | Overhead measurement |
| **attribute-heavy** | ~2MB | Many attributes | Attribute parsing |
| **text-heavy** | ~2MB | Large text content | Text processing |
| **medium-nested** | ~27MB | Nested structure | Real-world scenario |
| **large-complex** | ~19MB | Complex mixed | Stress test |

---

## Interpreting Results

### Throughput Expectations

```
Excellent: >200 MB/s
Good: 150-200 MB/s
Acceptable: 100-150 MB/s
Poor: <100 MB/s
```

**Note:** Actual throughput varies by:
- CPU speed and architecture
- Memory speed
- Node.js version
- XML complexity (attributes, nesting, text)

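For scripted reports, the bands above can be expressed as a small function. This is only a sketch: `rateThroughput` is a hypothetical name, and the handling of the exact boundary values (150 and 200 counted as Good, 100 as Acceptable) is an assumption, since the table does not say which band owns them.

```typescript
type ThroughputRating = "Excellent" | "Good" | "Acceptable" | "Poor";

// Maps a throughput in MB/s onto the bands listed in this guide.
// Assumption: boundary values fall into the band that lists them as its lower bound,
// except 200, which is treated as the top of "Good" because "Excellent" is ">200".
function rateThroughput(mbPerSec: number): ThroughputRating {
  if (mbPerSec > 200) return "Excellent";
  if (mbPerSec >= 150) return "Good";
  if (mbPerSec >= 100) return "Acceptable";
  return "Poor";
}

// Figures quoted elsewhere in this guide.
console.log(rateThroughput(270)); // text-heavy peak
console.log(rateThroughput(125)); // average
console.log(rateThroughput(97)); // large-complex
```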
### Current Baseline Performance

Based on our testing:
```
Average throughput: 125 MB/s
Peak throughput: 270 MB/s (text-heavy)
Lowest throughput: 97 MB/s (large-complex)
GC pressure: Very low (0 major GCs)
```

By the scale above, the 125 MB/s average falls in the Acceptable band and the text-heavy peak in the Excellent band. We consider this **competitive** with the JavaScript XML parser ecosystem, although this guide includes no head-to-head comparison.

---

## When to Benchmark

### Before Making Changes
```bash
# 1. Run baseline
pnpm run bench:baseline

# 2. Save results (assumes the glob matches exactly one file;
#    cp fails if several baseline-*.json files exist)
cp results/baseline-*.json results/before-change.json
```

### After Making Changes
```bash
# 1. Run benchmark again
pnpm run bench:baseline

# 2. Compare
# Manually compare JSON files or use diff
```

### Decision Criteria

**Accept a change if:**
- ✅ Throughput improves by >5%
- ✅ GC events don't increase >50%
- ✅ Memory usage doesn't increase >10%
- ✅ No regressions on any test pattern

**Reject a change if:**
- ❌ Memory usage increases >50%
- ❌ Throughput decreases >5% (significant)
- ❌ GC events increase >50%
- ❌ Major regression on any pattern

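The criteria above leave a gap (for example, a memory increase between 10% and 50%), so a scripted comparison needs a third outcome. The sketch below is one possible reading, not the project's tooling: `BenchResult`, `percentChange`, and `decide` are hypothetical names, the fields do not reflect the real results JSON, and it judges a single pattern, so the "any test pattern" rules would mean running it once per pattern.

```typescript
// Hypothetical result shape for one test pattern.
interface BenchResult {
  throughputMBps: number;
  gcEvents: number;
  heapDeltaMb: number;
}

type Decision = "accept" | "reject" | "inconclusive";

// Relative change in percent; a rise from zero is treated as an unbounded increase.
function percentChange(before: number, after: number): number {
  if (before === 0) return after === 0 ? 0 : Infinity;
  return ((after - before) / before) * 100;
}

function decide(before: BenchResult, after: BenchResult): Decision {
  const throughput = percentChange(before.throughputMBps, after.throughputMBps);
  const gc = percentChange(before.gcEvents, after.gcEvents);
  const memory = percentChange(before.heapDeltaMb, after.heapDeltaMb);

  // Reject rules are checked first so a fast but memory-hungry change still fails.
  if (memory > 50 || throughput < -5 || gc > 50) return "reject";
  if (throughput > 5 && gc <= 50 && memory <= 10) return "accept";
  return "inconclusive";
}

// Made-up numbers shaped like the rejected inlining experiment: -5% speed, +180% memory.
const inlining = decide(
  { throughputMBps: 125, gcEvents: 40, heapDeltaMb: 2.5 },
  { throughputMBps: 118.75, gcEvents: 40, heapDeltaMb: 7.0 },
);
const faster = decide(
  { throughputMBps: 100, gcEvents: 40, heapDeltaMb: 2.5 },
  { throughputMBps: 110, gcEvents: 42, heapDeltaMb: 2.6 },
);
const marginal = decide(
  { throughputMBps: 100, gcEvents: 40, heapDeltaMb: 2.5 },
  { throughputMBps: 102, gcEvents: 40, heapDeltaMb: 2.5 },
);
console.log(inlining, faster, marginal);
```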
---

## Advanced Usage

The `node` commands below assume a Node.js version that can run `.ts` files directly; on older versions, add `--import tsx` or use `npx tsx` as shown under Troubleshooting.

### Custom Iterations
```bash
# Modify benchmark-baseline.ts
# Change: await benchmarkFile(testFile, 100)
# To: await benchmarkFile(testFile, 500)
```

### Profile with Chrome DevTools
```bash
node --inspect --expose-gc benchmark-baseline.ts

# Then open chrome://inspect
# Click "Open dedicated DevTools for Node"
# Go to Profiler tab
# Start profiling
```

### Memory Profiling
```bash
node --expose-gc --trace-gc benchmark-baseline.ts

# Watch GC events in real-time
```

### CPU Profiling (V8)
```bash
node --prof --expose-gc benchmark-baseline.ts
node --prof-process isolate-*.log > profile.txt

# Analyze profile.txt for hot functions
```

---

## Optimization Study Results

We conducted comprehensive optimization studies and found:

### What We Tested
1. **Function inlining** - Failed (-5% performance, +180% memory)
2. **Manual attribute parsing** - Failed (-1% performance)
3. **Fast entity decoding** - Failed (-0.8% performance)
4. **Object pooling** - Failed (V8 hidden class conflicts)

### Key Learnings
- **Baseline is already well-optimized**
- **V8 JIT does better optimization than manual attempts**
- **Trust the baseline implementation**

**Full report:** See `../stax-xml/OPTIMIZATION_STUDY_FINAL_REPORT.md`

---

## Troubleshooting

### "Test data not found"
```bash
# Generate test files
pnpm run generate:testdata
```

### "GC control not available"
```bash
# Always use --expose-gc flag
npx tsx --expose-gc benchmark-baseline.ts
```

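This message typically appears because V8 only defines the global `gc()` function when the process starts with `--expose-gc`. A benchmark script can guard for that, as in the minimal sketch below; `tryForceGc` is a hypothetical name, and how the real benchmark detects the flag is not shown in this guide.

```typescript
// Runs a full garbage collection if the process was started with --expose-gc.
// Returns false (instead of throwing) when gc() is not available.
function tryForceGc(): boolean {
  const maybeGc = (globalThis as unknown as { gc?: () => void }).gc;
  if (typeof maybeGc !== "function") {
    return false;
  }
  maybeGc();
  return true;
}

const gcAvailable = tryForceGc();
if (!gcAvailable) {
  console.warn("GC control not available: re-run with --expose-gc");
}
```

Forcing a collection before each measured run keeps garbage left over from earlier iterations from being charged to the next one.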
### Inconsistent Results
```bash
# 1. Close other applications
# 2. Disable CPU frequency scaling
# 3. Run multiple times and average
# 4. Consider using CPU limiting (see WSL guide in docs)
```

### Out of Memory
```bash
# Increase the Node.js old-space heap limit (8192 MB = 8 GB)
node --max-old-space-size=8192 --expose-gc benchmark-baseline.ts
```

---

## Related Documentation

- **Optimization Study:** `../stax-xml/OPTIMIZATION_STUDY_FINAL_REPORT.md`
- **Inlining Study:** `../stax-xml/INLINING_STUDY_SUMMARY.md`
- **GC Benchmarking:** `GC-BENCHMARK-README.md` (detailed GC tools)

---

## Contributing

If you improve performance:

1. Run `pnpm run bench:baseline` before changes
2. Make your changes
3. Run `pnpm run bench:baseline` after changes
4. Document improvements >5%
5. Include benchmark results in PR

**Remember:** Benchmark on multiple machines to confirm improvements are real, not noise.

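One way to separate a real improvement from noise is to compare the before and after timing samples statistically. The sketch below computes Welch's t statistic and Cohen's d (pooled standard deviation) for two samples. All function names are hypothetical and this is not the code in `statistical-analysis.ts`. Turning the t statistic into a p-value also needs the t distribution with Welch-Satterthwaite degrees of freedom, which is omitted here.

```typescript
function mean(xs: number[]): number {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Sample variance with the n - 1 denominator; requires at least two samples.
function sampleVariance(xs: number[]): number {
  const m = mean(xs);
  return xs.reduce((sum, x) => sum + (x - m) * (x - m), 0) / (xs.length - 1);
}

// Welch's t statistic: difference of means over the unpooled standard error.
function welchT(a: number[], b: number[]): number {
  const standardError = Math.sqrt(
    sampleVariance(a) / a.length + sampleVariance(b) / b.length,
  );
  return (mean(a) - mean(b)) / standardError;
}

// Cohen's d: difference of means in units of the pooled standard deviation.
function cohensD(a: number[], b: number[]): number {
  const pooledVariance =
    ((a.length - 1) * sampleVariance(a) + (b.length - 1) * sampleVariance(b)) /
    (a.length + b.length - 2);
  return (mean(a) - mean(b)) / Math.sqrt(pooledVariance);
}

// Made-up parse times in ms: "after" is 1 ms faster on average.
const beforeMs = [10, 12, 8, 10];
const afterMs = [9, 11, 7, 9];
const t = welchT(beforeMs, afterMs);
const d = cohensD(beforeMs, afterMs);
console.log({ t, d });
```

With only four samples per side and this much spread, a 1 ms difference gives a small t statistic, which is the situation the reminder above warns about.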
---

**Last Updated:** October 18, 2025
**Baseline Performance:** 125 MB/s average throughput
**GC Pressure:** Very low (0 major GCs typical)
