|
| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +This is the Datadog Java Profiler Library, a specialized profiler derived from async-profiler but tailored for Datadog's needs. It's a multi-language project combining Java and C++, built with Gradle, including native library compilation. |
| 8 | + |
| 9 | +**Key Technologies:** |
| 10 | +- Java 8+ (main API and library loading) |
| 11 | +- C++17 (native profiling engine) |
| 12 | +- Gradle (build system with custom native compilation) |
| 13 | +- JNI (Java Native Interface for C++ integration) |
| 14 | +- CMake (for C++ unit tests via Google Test) |
| 15 | + |
| 16 | +## Build Commands |
| 17 | + |
| 18 | +### Main Build Tasks |
| 19 | +```bash |
| 20 | +# Build release version (primary artifact) |
| 21 | +./gradlew buildRelease |
| 22 | + |
| 23 | +# Build all configurations |
| 24 | +./gradlew assembleAll |
| 25 | + |
| 26 | +# Clean build |
| 27 | +./gradlew clean |
| 28 | +``` |
| 29 | + |
| 30 | +### Development Builds |
| 31 | +```bash |
| 32 | +# Debug build with symbols |
| 33 | +./gradlew buildDebug |
| 34 | + |
| 35 | +# ASan build (if available) |
| 36 | +./gradlew buildAsan |
| 37 | + |
| 38 | +# TSan build (if available) |
| 39 | +./gradlew buildTsan |
| 40 | +``` |
| 41 | + |
| 42 | +### Testing |
| 43 | +```bash |
| 44 | +# Run all tests |
| 45 | +./gradlew test |
| 46 | + |
| 47 | +# Run specific test configurations |
| 48 | +./gradlew testRelease |
| 49 | +./gradlew testDebug |
| 50 | +./gradlew testAsan |
| 51 | +./gradlew testTsan |
| 52 | + |
| 53 | +# Run C++ unit tests only |
| 54 | +./gradlew gtestDebug |
| 55 | +./gradlew gtestRelease |
| 56 | + |
| 57 | +# Cross-JDK testing |
| 58 | +JAVA_TEST_HOME=/path/to/test/jdk ./gradlew testDebug |
| 59 | +``` |
| 60 | + |
| 61 | +### Build Options |
| 62 | +```bash |
| 63 | +# Skip native compilation |
| 64 | +./gradlew build -Pskip-native |
| 65 | + |
| 66 | +# Skip all tests |
| 67 | +./gradlew build -Pskip-tests |
| 68 | + |
| 69 | +# Skip C++ tests |
| 70 | +./gradlew build -Pskip-gtest |
| 71 | + |
| 72 | +# Keep JFR recordings after tests |
| 73 | +./gradlew test -PkeepJFRs |
| 74 | + |
| 75 | +# Skip debug symbol extraction |
| 76 | +./gradlew buildRelease -Pskip-debug-extraction=true |
| 77 | +``` |
| 78 | + |
| 79 | +### Code Quality |
| 80 | +```bash |
| 81 | +# Format code |
| 82 | +./gradlew spotlessApply |
| 83 | + |
| 84 | +# Static analysis |
| 85 | +./gradlew scanBuild |
| 86 | + |
| 87 | +# Run stress tests |
| 88 | +./gradlew :ddprof-stresstest:runStressTests |
| 89 | + |
| 90 | +# Run benchmarks |
| 91 | +./gradlew runBenchmarks |
| 92 | +``` |
| 93 | + |
| 94 | +## Architecture |
| 95 | + |
| 96 | +### Module Structure |
| 97 | +- **ddprof-lib**: Main profiler library (Java + C++) |
| 98 | +- **ddprof-test**: Integration tests |
| 99 | +- **ddprof-test-tracer**: Tracing context tests |
| 100 | +- **ddprof-stresstest**: JMH-based performance tests |
| 101 | +- **malloc-shim**: Memory allocation interceptor (Linux only) |
| 102 | + |
| 103 | +### Build Configurations |
| 104 | +The project supports multiple build configurations per platform: |
| 105 | +- **release**: Optimized production build with stripped symbols |
| 106 | +- **debug**: Debug build with full symbols |
| 107 | +- **asan**: AddressSanitizer build for memory error detection |
| 108 | +- **tsan**: ThreadSanitizer build for thread safety validation |
| 109 | + |
| 110 | +### Upstream Integration |
| 111 | +The project maintains integration with async-profiler upstream: |
| 112 | +- `cloneAsyncProfiler`: Clones DataDog's async-profiler fork |
| 113 | +- `copyUpstreamFiles`: Copies selected upstream files to `ddprof-lib/src/main/cpp-external` |
| 114 | +- `patchStackFrame`/`patchStackWalker`: Apply the patches needed for ASan compatibility |
| 115 | +- Lock file: `gradle/ap-lock.properties` specifies branch/commit |
| 116 | + |
| 117 | +### Key Source Locations |
| 118 | +- Java API: `ddprof-lib/src/main/java/com/datadoghq/profiler/JavaProfiler.java` |
| 119 | +- C++ engine: `ddprof-lib/src/main/cpp/` |
| 120 | +- Upstream C++ code: `ddprof-lib/src/main/cpp-external/` (generated) |
| 121 | +- Native libraries: `ddprof-lib/build/lib/main/{config}/{os}/{arch}/` |
| 122 | +- Test resources: `ddprof-test/src/test/java/` |
| 123 | + |
| 124 | +### Platform Support |
| 125 | +- **Linux**: x64, arm64 (primary platforms) |
| 126 | +- **macOS**: arm64, x64 |
| 127 | +- **Architecture detection**: Automatic via `common.gradle` |
| 128 | +- **musl libc detection**: Automatic detection and handling |
| 129 | + |
| 130 | +### Debug Information Handling |
| 131 | +Release builds automatically extract debug symbols: |
| 132 | +- Stripped libraries (~1.2MB) for production |
| 133 | +- Separate debug files (~6.1MB) with full symbols |
| 134 | +- GNU debuglink sections connect stripped libraries to debug files |
| 135 | + |
| 136 | +## Development Workflow |
| 137 | + |
| 138 | +### Running Single Tests |
| 139 | +Use standard Gradle syntax: |
| 140 | +```bash |
| 141 | +./gradlew :ddprof-test:test --tests "ClassName.methodName" |
| 142 | +``` |
| 143 | + |
| 144 | +### Working with Native Code |
| 145 | +Native compilation is automatic during build. C++ code changes require: |
| 146 | +1. Full rebuild: `./gradlew clean build` |
| 147 | +2. The build system automatically handles JNI headers and platform detection |
| 148 | + |
| 149 | +### Debugging Native Issues |
| 150 | +- Use `buildDebug` for debug symbols |
| 151 | +- Use `buildAsan` for memory error detection |
| 152 | +- Check `gradle/sanitizers/*.supp` for suppressions |
| 153 | +- Run `sudo sysctl vm.mmap_rnd_bits=28` if ASan crashes occur |
| 154 | + |
| 155 | +### Cross-Platform Development |
| 156 | +- Use `osIdentifier()` and `archIdentifier()` functions for platform detection |
| 157 | +- Platform-specific code goes in `os_linux.cpp`, `os_macos.cpp`, etc. |
| 158 | +- Build configurations automatically select appropriate compiler/linker flags |
| 159 | + |
| 160 | +## Publishing and Artifacts |
| 161 | + |
| 162 | +The main artifact is `ddprof-<version>.jar` containing: |
| 163 | +- Java classes |
| 164 | +- Native libraries for all supported platforms |
| 165 | +- Metadata for library loading |
| 166 | + |
| 167 | +Build artifacts structure: |
| 168 | +``` |
| 169 | +ddprof-lib/build/ |
| 170 | +├── lib/main/{config}/{os}/{arch}/ |
| 171 | +│ ├── libjavaProfiler.{so|dylib} # Full library |
| 172 | +│ ├── stripped/ → production binary |
| 173 | +│ └── debug/ → debug symbols |
| 174 | +└── native/{config}/META-INF/native-libs/ |
| 175 | + └── {os}-{arch}/ → final packaged libraries |
| 176 | +``` |
| 177 | + |
| 178 | +## Core Architecture Components |
| 179 | + |
| 180 | +### Double-Buffered Call Trace Storage |
| 181 | +The profiler uses a sophisticated double-buffered storage system for call traces: |
| 182 | +- **Active Storage**: Currently accepting new traces from profiling events |
| 183 | +- **Standby Storage**: Background storage for JFR serialization and trace preservation |
| 184 | +- **Instance-based Trace IDs**: 64-bit IDs combining instance ID (upper 32 bits) and slot (lower 32 bits) |
| 185 | +- **Liveness Checkers**: Functions that determine which traces to preserve across storage swaps |
| 186 | +- **Atomic Swapping**: Lock-free swap operations to minimize profiling overhead |
| 187 | + |
| 188 | +### JFR Integration Architecture |
| 189 | +- **FlightRecorder**: Central JFR event recording and buffer management |
| 190 | +- **Metadata Generation**: Dynamic JFR metadata for stack traces, methods, and classes |
| 191 | +- **Constant Pools**: Efficient deduplication of strings, methods, and stack traces |
| 192 | +- **Buffer Management**: Thread-local recording buffers with configurable flush thresholds |
| 193 | + |
| 194 | +### Native Integration Patterns |
| 195 | +- **Upstream Sync**: Uses DataDog fork of async-profiler with branch `dd/master` |
| 196 | +- **Adapter Pattern**: `*_dd.h` files adapt upstream code for Datadog needs |
| 197 | +- **External Code**: Upstream files copied to `cpp-external/` with minimal patches |
| 198 | +- **Signal Handler Safety**: Careful memory management in signal handler contexts |
| 199 | + |
| 200 | +### Multi-Engine Profiling System |
| 201 | +- **CPU Profiling**: SIGPROF-based sampling with configurable intervals |
| 202 | +- **Wall Clock**: SIGALRM-based sampling for blocking I/O and sleep detection |
| 203 | +- **Allocation Profiling**: TLAB-based allocation tracking and sampling |
| 204 | +- **Live Heap**: Object liveness tracking with weak references and GC integration |
| 205 | + |
| 206 | +## Critical Implementation Details |
| 207 | + |
| 208 | +### Thread Safety and Performance |
| 209 | +- **Lock-free Hot Paths**: Signal handlers avoid blocking operations |
| 210 | +- **Thread-local Buffers**: Per-thread recording buffers minimize contention |
| 211 | +- **Atomic Operations**: Instance ID management and counter updates use atomics |
| 212 | +- **Memory Allocation**: Minimize malloc() in hot paths, use pre-allocated containers |
| 213 | + |
| 214 | +### 64-bit Trace ID System |
| 215 | +- **Collision Avoidance**: Instance-based IDs prevent collisions across storage swaps |
| 216 | +- **JFR Compatibility**: 64-bit IDs work with JFR constant pool indices |
| 217 | +- **Stability**: Trace IDs remain stable during liveness preservation |
| 218 | +- **Performance**: Bit-packing approach avoids atomic operations in hot paths |
| 219 | + |
| 220 | +### Platform-Specific Handling |
| 221 | +- **musl libc Detection**: Automatic detection and symbol resolution adjustments |
| 222 | +- **Architecture Support**: x64, arm64 with architecture-specific stack walking |
| 223 | +- **Debug Symbol Handling**: Split debug information for production deployments |
| 224 | + |
| 225 | +## Development Guidelines |
| 226 | + |
| 227 | +### Code Organization Principles |
| 228 | +- **Namespace Separation**: Use `ddprof` namespace for adapted upstream classes |
| 229 | +- **File Naming**: Datadog adaptations use `*_dd` suffix (e.g., `stackWalker_dd.h`) |
| 230 | +- **External Dependencies**: Upstream code in `cpp-external/`, local code in `cpp/` |
| 231 | + |
| 232 | +### Performance Constraints |
| 233 | +- **Algorithmic Complexity**: Linear (O(N)) or slower scans are acceptable only for small collections, roughly 256 elements at most; use an index or binary search for anything larger |
| 234 | +- **Memory Fragmentation**: Minimize allocations to avoid malloc arena issues |
| 235 | +- **Signal Handler Safety**: No blocking operations, mutex locks, or malloc() in handlers |
| 236 | + |
| 237 | +### Testing Strategy |
| 238 | +- **Multi-configuration Testing**: Test across debug, release, ASan, and TSan builds |
| 239 | +- **Cross-JDK Compatibility**: Test with Oracle JDK, OpenJDK, and OpenJ9 |
| 240 | +- **Native-Java Integration**: Both C++ unit tests (gtest) and Java integration tests |
| 241 | +- **Stress Testing**: JMH-based performance and stability testing |
| 242 | + |
| 243 | +### Debugging and Analysis |
| 244 | +- **Debug Builds**: Use `buildDebug` for full symbols and debugging information |
| 245 | +- **Sanitizer Builds**: ASan for memory errors, TSan for threading issues |
| 246 | +- **Static Analysis**: `scanBuild` for additional code quality checks |
| 247 | +- **Test Logging**: Use `TEST_LOG` macro for debug output in tests |
| 248 | + |
| 249 | +### Upstream Integration Workflow |
| 250 | +The project maintains a carefully managed relationship with async-profiler upstream: |
| 251 | +1. **Lock File**: `gradle/ap-lock.properties` specifies exact upstream commit |
| 252 | +2. **Branch Tracking**: `dd/master` branch contains safe upstream changes |
| 253 | +3. **File Copying**: `copyUpstreamFiles` task selectively imports upstream code |
| 254 | +4. **Minimal Patching**: Only essential patches for ASan compatibility |
| 255 | +5. **Cherry-pick Strategy**: Rare cherry-picks only for critical fixes |
| 256 | + |
| 257 | +## Build System Architecture |
| 258 | + |
| 259 | +### Gradle Multi-project Structure |
| 260 | +- **ddprof-lib**: Core profiler with native compilation |
| 261 | +- **ddprof-test**: Integration and Java unit tests |
| 262 | +- **ddprof-test-tracer**: Tracing context integration tests |
| 263 | +- **ddprof-stresstest**: JMH performance benchmarks |
| 264 | +- **malloc-shim**: Linux memory allocation interceptor |
| 265 | + |
| 266 | +### Native Compilation Pipeline |
| 267 | +- **Platform Detection**: Automatic OS and architecture detection via `common.gradle` |
| 268 | +- **Configuration Matrix**: Multiple build configs (release/debug/asan/tsan) per platform |
| 269 | +- **Symbol Processing**: Automatic debug symbol extraction for release builds |
| 270 | +- **Library Packaging**: Final JAR contains all platform-specific native libraries |
| 271 | + |
| 272 | +### Artifact Structure |
| 273 | +Final artifacts maintain a specific structure for deployment: |
| 274 | +``` |
| 275 | +META-INF/native-libs/{os}-{arch}/libjavaProfiler.{so|dylib} |
| 276 | +``` |
| 277 | +With separate debug symbol packages for production debugging support. |
| 278 | + |
| 279 | +## Legacy and Compatibility |
| 280 | + |
| 281 | +- Java 8 compatibility maintained throughout |
| 282 | +- JNI interface follows async-profiler conventions |
| 283 | +- Supports Oracle JDK, OpenJDK and OpenJ9 implementations |
| 284 | +- Always test with ./gradlew testDebug |
| 285 | +- Always consult the OpenJDK source code when analyzing profiler issues and looking for proposed solutions |
| 286 | +- For OpenJ9-specific issues, consult the OpenJ9 GitHub project |
| 287 | +- Don't use the `assemble` task. Use `assembleDebug` or `assembleRelease` instead |
| 288 | +- gtest tests are located in ddprof-lib/src/test/cpp |
| 289 | +- The ddprof-lib/gtest module contains only the gtest build setup |
| 290 | +- Java unit tests are in ddprof-test module |
| 291 | +- Always run ./gradlew spotlessApply before committing the changes |
| 292 | + |
| 293 | +- When adding a copyright notice (such as 'Copyright 2021, 2023 Datadog, Inc'), use the current year: 'Copyright <current year>, Datadog, Inc' |
| 294 | + When modifying a copyright notice that already mentions 'Datadog', update the 'until year' ('Copyright <from year>, <until year>') to the current year |
| 295 | +- If modifying a file that does not contain Datadog copyright, add one |
| 296 | +- When proposing solutions, try to minimize allocations. We are fighting hard to avoid fragmentation and malloc arena issues |
| 297 | +- Use O(N) or worse algorithms only for small numbers of elements. A rule-of-thumb cut-off is 256 elements. Anything larger requires either an index or binary search to get better than linear performance |
| 298 | + |
| 299 | +- Always run ./gradlew spotlessApply before committing changes |
| 300 | + |
| 301 | +- Always create a commit message based solely on the actual changes visible in the diff |
| 302 | + |
| 303 | +- You can use the TEST_LOG macro to log debug info, which ddprof-test tests can then use to assert correct execution. The macro is defined in 'common.h' |
| 304 | + |
| 305 | +- If a file contains a copyright notice, make sure it is preserved. The only exception is if it mentions Datadog - then you can update the years, if necessary |
| 306 | +- Always challenge my proposals. Use deep analysis and logic to find flaws in what I am proposing |
| 307 | + |
| 308 | +- Exclude ddprof-lib/build/async-profiler from searches of active usage |
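
The "Double-Buffered Call Trace Storage" section describes 64-bit trace IDs that combine an instance ID (upper 32 bits) with a slot (lower 32 bits). A minimal sketch of that bit layout might look like the following. The helper names (`makeTraceId`, `instanceOf`, `slotOf`, `roundTrips`) are hypothetical and are not taken from the repository; only the 32/32 split comes from the text above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: pack an instance ID (upper 32 bits) and a slot
// (lower 32 bits) into a single 64-bit trace ID.
constexpr std::uint64_t makeTraceId(std::uint32_t instanceId, std::uint32_t slot) {
    return (static_cast<std::uint64_t>(instanceId) << 32) | slot;
}

// Hypothetical helper: recover the instance ID from the upper 32 bits.
constexpr std::uint32_t instanceOf(std::uint64_t traceId) {
    return static_cast<std::uint32_t>(traceId >> 32);
}

// Hypothetical helper: recover the slot from the lower 32 bits.
constexpr std::uint32_t slotOf(std::uint64_t traceId) {
    return static_cast<std::uint32_t>(traceId & 0xFFFFFFFFULL);
}

// Packing followed by unpacking returns the original pair.
inline bool roundTrips(std::uint32_t instanceId, std::uint32_t slot) {
    const std::uint64_t id = makeTraceId(instanceId, slot);
    return instanceOf(id) == instanceId && slotOf(id) == slot;
}

// The same slot under two different instance IDs yields two different trace IDs.
static_assert(makeTraceId(1, 7) != makeTraceId(2, 7),
              "same slot, different instance => different trace ID");
```

Under this layout, packing and unpacking are plain shifts and masks, which is consistent with the note that the bit-packing approach avoids atomic operations in hot paths.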