Commit bd88f6a

Copilot and haochengxia committed: Add comprehensive profiling documentation and cleanup
Co-authored-by: haochengxia <44400703+haochengxia@users.noreply.github.com>
1 parent 8be9730

14 files changed: +393 −1342 lines
.gitignore (7 additions, 0 deletions)

```diff
@@ -18,6 +18,13 @@ lib/
 lib64/
 parts/
 sdist/
+
+# Profiling results
+profiling_*results/
+profile_*.txt
+profile_*.json
+profile_*.csv
+performance_report_*.txt
 var/
 wheels/
 share/python-wheels/
```
docs/profiling.md (228 additions, 0 deletions)
# Performance Profiling for libCacheSim

This document explains how to use the profiling utilities in libCacheSim to analyze performance bottlenecks in cache simulations, particularly in plugin caches and trace processing operations.

## Overview

The `libcachesim.profiler` module provides comprehensive profiling capabilities using Python's built-in `cProfile` together with memory and CPU monitoring. This is particularly useful for:

- Analyzing plugin cache performance bottlenecks
- Comparing different cache implementations
- Understanding where time is spent in trace processing
- Identifying memory usage patterns
- Getting detailed function-level performance data

## Quick Start

### Basic Function Profiling

```python
from libcachesim.profiler import profile_cache_operations, profile_trace_processing

# Profile cache operations (individual requests)
result = profile_cache_operations(cache, reader, num_requests=1000)
print(f"Time: {result.execution_time:.4f}s, Memory: {result.memory_peak:.2f}MB")

# Profile complete trace processing
result = profile_trace_processing(cache, reader)
print(f"Miss ratio: {result.custom_metrics['obj_miss_ratio']:.4f}")
```

### Advanced Profiling with CacheSimProfiler

```python
from libcachesim.profiler import CacheSimProfiler

# Create a profiler instance that writes output to the given directory
profiler = CacheSimProfiler("profiling_results")

# Profile specific operations
result = profiler.profile_plugin_cache_operations(
    cache, reader, num_requests=1000, method_name="my_plugin_cache"
)

# Generate comprehensive reports
report_file = profiler.generate_performance_report()
json_file = profiler.export_results_json()
csv_file = profiler.export_results_csv()
```

### Context Manager Profiling

```python
from libcachesim.profiler import CacheSimProfiler

profiler = CacheSimProfiler()

with profiler.profile_context("custom_operation") as profile_result:
    # Your code to profile here
    request_count = 0
    for request in reader:
        cache.get(request)
        request_count += 1
        if request_count >= 1000:  # stop after a fixed number of requests
            break

    # Add custom metrics
    profile_result.custom_metrics['requests_processed'] = request_count
```

## Understanding the Results

### ProfileResult Object

Each profiling operation returns a `ProfileResult` object with:

- `execution_time`: Total execution time in seconds
- `memory_peak`: Peak memory usage during execution (MB)
- `memory_current`: Memory change from start to end (MB)
- `cpu_percent`: CPU usage percentage
- `custom_metrics`: Dictionary of custom metrics
- `profile_stats`: Detailed cProfile statistics

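To make the field meanings concrete, here is a minimal sketch of consuming these fields. The `ProfileResult` dataclass below is a stand-in that mirrors the attributes listed above, not the real class from `libcachesim.profiler` (which also carries the cProfile statistics):

```python
from dataclasses import dataclass, field

# Stand-in mirroring the ProfileResult fields documented above
@dataclass
class ProfileResult:
    execution_time: float = 0.0   # seconds
    memory_peak: float = 0.0      # MB
    memory_current: float = 0.0   # MB
    cpu_percent: float = 0.0
    custom_metrics: dict = field(default_factory=dict)

def summarize(r: ProfileResult) -> str:
    """Render a one-line summary of a profiling run."""
    return (f"{r.execution_time:.4f}s, peak {r.memory_peak:.2f}MB, "
            f"cpu {r.cpu_percent:.1f}%")

r = ProfileResult(execution_time=0.285, memory_peak=12.5, cpu_percent=87.3)
r.custom_metrics['obj_miss_ratio'] = 0.42
print(summarize(r))  # → 0.2850s, peak 12.50MB, cpu 87.3%
```
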
### Generated Reports

The profiler generates several types of output:

1. **Text Reports** (`profile_*.txt`): Detailed cProfile output showing:
   - Top functions by cumulative time
   - Top functions by total time
   - Function call counts and timing

2. **Performance Reports** (`performance_report_*.txt`): A summary comparing all profiled methods

3. **JSON Export** (`profile_results_*.json`): Machine-readable results for further analysis

4. **CSV Export** (`profile_summary_*.csv`): Spreadsheet-compatible summary data

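Because the JSON export is machine-readable, it can feed ad-hoc analysis scripts. The snippet below is a sketch only: it assumes a simple schema (a list of records with `method` and `execution_time` keys), which may differ from the actual export format:

```python
import json

# Post-processing sketch for a JSON export; the record schema here
# (method/execution_time keys) is an assumption for illustration.
def slowest_methods(records, top_n=3):
    """Return the top_n records by execution time, slowest first."""
    return sorted(records, key=lambda r: r["execution_time"], reverse=True)[:top_n]

records = json.loads(
    '[{"method": "plugin_lru", "execution_time": 0.285},'
    ' {"method": "native_lru", "execution_time": 0.007}]'
)
for r in slowest_methods(records, top_n=1):
    print(r["method"])  # → plugin_lru
```
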
### Example cProfile Output

```
Top Functions by Cumulative Time:
----------------------------------------
11007 function calls in 0.285 seconds

Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.002    0.002    0.285    0.285 profiler.py:194(run_cache_operations)
     1000    0.003    0.000    0.279    0.000 plugin_cache.py:118(get)
     2000    0.273    0.000    0.273    0.000 {built-in method time.sleep}
     1000    0.002    0.000    0.164    0.000 lru_cache.py:253(cache_hit)
```

This shows that `time.sleep` calls account for 0.273 of the 0.285 total seconds (96% of execution time).

## Plugin Cache Performance Analysis

Plugin caches can have performance issues due to Python callback overhead. The profiler helps identify:

### Common Bottlenecks

1. **Python Callback Overhead**: Each cache operation calls back into Python functions
2. **Slow Hook Functions**: Inefficient user-defined cache hooks
3. **Memory Allocation**: Excessive object creation in Python
4. **Data Structure Operations**: Inefficient cache data structure operations

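These bottlenecks surface directly in cProfile output. The sketch below uses the standard-library `cProfile`/`pstats` machinery to expose a deliberately slow callback; `slow_hook` is a made-up stand-in for an inefficient user-defined hook, not part of libCacheSim:

```python
import cProfile
import io
import pstats
import time

def slow_hook():
    """Stand-in for an inefficient user-defined cache hook."""
    time.sleep(0.001)  # simulated per-request callback overhead

pr = cProfile.Profile()
pr.enable()
for _ in range(50):
    slow_hook()
pr.disable()

# Sort by cumulative time so the hot path floats to the top
buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print("time.sleep" in report)  # the sleep built-in dominates the report
```
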
### Example Analysis

```python
# Compare a plugin cache against a native implementation
profiler = CacheSimProfiler()

# Profile native LRU
native_cache = LRU(cache_size=1024)
result1 = profiler.profile_trace_processing(native_cache, reader, "native_lru")

# Profile plugin LRU
plugin_cache = PluginCache(
    cache_size=1024,
    cache_init_hook=init_hook,
    cache_hit_hook=hit_hook,
    cache_miss_hook=miss_hook,
    # ... other hooks
)
result2 = profiler.profile_trace_processing(plugin_cache, reader, "plugin_lru")

# Compare results
comparison = profiler.compare_results()
print(f"Plugin overhead: {comparison['performance_ratio']:.2f}x slower")
```

## Best Practices

### For Plugin Cache Development

1. **Profile Early**: Use profiling during plugin development to catch performance issues
2. **Minimize Hook Complexity**: Keep cache hook functions as simple as possible
3. **Avoid Python Callbacks in Hot Paths**: Consider a C++ implementation for critical operations
4. **Use Efficient Data Structures**: Profile different Python data structures for cache storage

### For Performance Analysis

1. **Use Representative Workloads**: Profile with realistic trace data and cache sizes
2. **Run Multiple Iterations**: Performance can vary between runs
3. **Focus on Hot Functions**: Look at cumulative time to find the biggest bottlenecks
4. **Compare Implementations**: Use profiling to validate optimization efforts

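For the "run multiple iterations" advice, a lightweight pattern is to repeat the measurement and report the spread as well as the mean. `run_once` below is a placeholder for one profiling pass over your actual workload:

```python
import statistics
import time

def run_once():
    """Placeholder for one timed pass over a workload."""
    start = time.perf_counter()
    sum(i * i for i in range(10_000))  # stand-in workload
    return time.perf_counter() - start

# Repeat the measurement; a single run can be misleading
times = [run_once() for _ in range(5)]
print(f"mean {statistics.mean(times):.6f}s, stdev {statistics.stdev(times):.6f}s")
```
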
### Memory Profiling

```python
# Track memory usage patterns
profiler = CacheSimProfiler()

with profiler.profile_context("memory_analysis") as result:
    # Code that might have memory issues
    large_cache = PluginCache(cache_size=100000, ...)
    result.custom_metrics['cache_size'] = large_cache.cache_size

print(f"Memory peak: {result.memory_peak:.2f}MB")
```

## Example: Finding Plugin Cache Bottlenecks

The provided `examples/profiling_analysis.py` demonstrates:

1. Creating mock plugin caches with artificial delays
2. Comparing native and plugin implementations
3. Identifying specific bottlenecks in cProfile output
4. Generating comprehensive performance reports

Run it with:

```bash
cd libCacheSim-python
python examples/profiling_analysis.py
```

This shows a 397x performance difference between fast and slow plugin implementations, with detailed function-level analysis showing exactly where time is spent.

## Integration with libCacheSim

When the full libCacheSim package is built, the profiler integrates seamlessly:

```python
import libcachesim as lcs
from libcachesim.profiler import CacheSimProfiler

# Create an actual cache and reader
cache = lcs.S3FIFO(cache_size=1024*1024)
reader = lcs.TraceReader(trace="path/to/trace", trace_type=lcs.TraceType.ORACLE_GENERAL_TRACE)

# Profile real operations
profiler = CacheSimProfiler()
result = profiler.profile_trace_processing(cache, reader)
```

## Troubleshooting

### Common Issues

1. **ModuleNotFoundError for psutil**: Install it with `pip install psutil`
2. **Empty cProfile output**: Ensure the profiled code actually runs and takes a measurable amount of time
3. **High memory usage**: Large traces or cache sizes can consume significant memory during profiling

### Performance Tips

- For large traces, use the `num_requests` parameter to profile a subset of requests
- Clear profiler results with `profiler.clear_results()` between test runs
- Use the context manager for fine-grained profiling of specific code sections

This profiling system provides the cProfile results requested in the issue and enables comprehensive performance analysis of plugin cache and trace processing operations.
