Add ElfSymbolModule for Parsing ELF Symbol Tables#2384

Merged: brianrob merged 3 commits into microsoft:main from brianrob:brianrob/elf-parser on Mar 20, 2026

@brianrob brianrob commented Mar 19, 2026

Summary

Adds ElfSymbolModule, an ELF (Executable and Linkable Format) symbol parser that resolves RVAs to symbol names. This enables PerfView and TraceEvent to resolve native symbols from Linux ELF binaries when analyzing universal traces captured on Linux.

Key capabilities

  • Parses both .symtab and .dynsym sections
  • Supports 32-bit and 64-bit ELF, little-endian and big-endian
  • Implements ISymbolLookup for integration with TraceEvent's symbol resolution pipeline
  • Demangles Itanium C++ (_Z) and Rust v0 (_R) mangled names using the demanglers added in #2383 (Implement Symbol Demanglers for Linux Binaries).
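
For readers unfamiliar with how one parser can cover 32/64-bit and both byte orders, here is a minimal Python sketch of the detection step. It is an illustration based on the ELF header layout, not the C# code in this PR; the function names `read_elf_ident` and `read_section_header_offset` are hypothetical.

```python
import struct

ELF_MAGIC = b"\x7fELF"

def read_elf_ident(data: bytes):
    """Return (bits, struct byte-order prefix) from the e_ident bytes."""
    if len(data) < 6 or data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    bits = {1: 32, 2: 64}.get(ei_class)    # ELFCLASS32 / ELFCLASS64
    order = {1: "<", 2: ">"}.get(ei_data)  # ELFDATA2LSB / ELFDATA2MSB
    if bits is None or order is None:
        raise ValueError("unsupported ELF class or data encoding")
    return bits, order

def read_section_header_offset(data: bytes) -> int:
    """Read e_shoff, whose position and width depend on the ELF class."""
    bits, order = read_elf_ident(data)
    if bits == 64:
        # 64-bit header: e_shoff is 8 bytes at offset 0x28
        return struct.unpack_from(order + "Q", data, 0x28)[0]
    # 32-bit header: e_shoff is 4 bytes at offset 0x20
    return struct.unpack_from(order + "I", data, 0x20)[0]
```

Once the class and byte order are known, every later field read can reuse the same byte-order prefix and pick the 32-bit or 64-bit layout.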

Design

  • Lazy name resolution: Symbol names are decoded from the string table on first FindNameForRva hit, not during construction. Most symbols in a trace are never looked up — this avoids unnecessary work.
  • LOH-free strtab: String table data is retained in a SegmentedList<byte> with 64 KB segments, keeping each allocation below the .NET Large Object Heap threshold (85,000 bytes).
  • Pre-allocated structures: Two-pass section header scan measures sizes before allocating — zero resizes.
  • BitConverter fast path: 64-bit little-endian symbol entries parsed directly from byte arrays.
  • O(log n) lookups: Sorted array with binary search.
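
The combination of lazy name resolution and sorted-array binary search can be sketched as follows. This is a Python illustration of the technique only; the class `LazySymbolTable`, its entry tuple layout, and the method name `find_name_for_rva` are assumptions and do not mirror the actual ElfSymbolModule internals.

```python
from bisect import bisect_right

class LazySymbolTable:
    """Sorted symbol ranges with names decoded only on first lookup hit."""

    def __init__(self, entries, strtab: bytes):
        # entries: iterable of (start_rva, size, name_offset) tuples
        self._entries = sorted(entries)
        self._starts = [e[0] for e in self._entries]
        self._strtab = strtab
        self._names = {}  # cache: entry index -> decoded name

    def find_name_for_rva(self, rva: int):
        # Binary search for the last symbol starting at or before rva.
        i = bisect_right(self._starts, rva) - 1
        if i < 0:
            return None
        start, size, name_off = self._entries[i]
        if rva >= start + size:
            return None  # rva falls in a gap between symbols
        name = self._names.get(i)
        if name is None:
            # Decode the NUL-terminated name only now, then cache it.
            end = self._strtab.index(b"\x00", name_off)
            name = self._strtab[name_off:end].decode("utf-8", errors="replace")
            self._names[i] = name
        return name
```

Construction only sorts integers; string decoding cost is paid once per symbol that is actually looked up.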

Performance

The figures below give a sense of existing (PDB) vs. new (ELF) performance for symbol parsing and lookup.

ELF vs PDB comparison (CoreCLR, 18K symbols, net8.0)

| Operation (time / allocated) | ELF (ElfSymbolModule) | PDB (NativeSymbolModule / DIA) |
|---|---|---|
| Parse | 2.3 ms / 3.0 MB | 7.4 ms / 9.6 KB |
| Lookup (cold) | 29 ns / 0 B | 8,745 ns / 240 B |

ELF parses about 3.2x faster than PDB (2.3 ms vs 7.4 ms), and cold lookups are roughly 300x faster (29 ns vs 8,745 ns; sorted-array binary search vs COM interop). PDB's lower parse memory is due to DIA's lazy/deferred loading model.

Lookup benchmarks (zero-alloc)

| Symbols | Lookup time |
|---|---|
| 5 | 7.6 ns |
| 256 | 15.3 ns |
| 10,772 | 21.7 ns |
| 18,030 | 29.0 ns |

Testing

Unit tests (26 tests)

Synthetic ELF binaries generated by ElfBuilder (no checked-in test binaries):

  • Error handling: invalid magic, truncated, empty, bad ELF class, no section headers
  • Format coverage: 64-bit LE, 32-bit LE, 64-bit BE, 32-bit BE
  • Symbol filtering: non-function types, zero-value, zero-size, below PT_LOAD base
  • RVA adjustment with pVaddr and pOffset
  • Demangling integration: Itanium C++, Rust v0, plain passthrough
  • .dynsym section parsing alongside .symtab
  • Edge cases: empty table, RVA zero, 100-symbol binary search stress test
  • File path constructor validation
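
To make the symbol-filtering cases concrete, here is a hedged Python sketch of decoding symbol table entries and applying the filters listed above. The Elf32_Sym and Elf64_Sym field orders follow the ELF specification; the function name `iter_function_symbols` and the `load_base` parameter are hypothetical, and the real parser's filtering rules may differ in detail.

```python
import struct

STT_FUNC = 2  # symbol type code for functions (low 4 bits of st_info)

def iter_function_symbols(symtab: bytes, bits: int, order: str, load_base: int = 0):
    """Yield (name_offset, value, size) for usable function symbols.

    order is a struct byte-order prefix: "<" little-endian, ">" big-endian.
    """
    # Elf64_Sym: st_name, st_info, st_other, st_shndx, st_value, st_size (24 bytes)
    # Elf32_Sym: st_name, st_value, st_size, st_info, st_other, st_shndx (16 bytes)
    fmt = struct.Struct(order + ("IBBHQQ" if bits == 64 else "IIIBBH"))
    for off in range(0, len(symtab) - fmt.size + 1, fmt.size):
        fields = fmt.unpack_from(symtab, off)
        if bits == 64:
            name_off, info, _other, _shndx, value, size = fields
        else:
            name_off, value, size, info, _other, _shndx = fields
        if info & 0xF != STT_FUNC:
            continue  # skip non-function types (objects, sections, files)
        if value == 0 or size == 0:
            continue  # skip zero-value and zero-size symbols
        if value < load_base:
            continue  # skip symbols below the assumed PT_LOAD base
        yield name_off, value, size
```

Because the byte-order prefix disables native alignment in `struct`, the packed sizes match the on-disk entry sizes of 24 and 16 bytes.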

Offline validation against pyelftools (568 real ELF files, 109,293 symbols)

Validated the parser against a known-good reference:

  1. A Python script using pyelftools to extract the ground truth from 568 real .debug files: the executable PT_LOAD segment parameters and all STT_FUNC symbols (st_value, st_size, raw mangled name).
  2. The C# test constructs an ElfSymbolModule with demangling disabled for each file, then verifies that every reference symbol is found at the correct RVA with an exact match on start address and raw name.
  3. Multiple symbols sharing the same start RVA (GCC destructor D1/D2 aliases, LTO clones, ICF-merged functions) are handled by accepting any matching name at that address.
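
The alias-tolerant comparison in step 3 can be sketched like this. It is a Python illustration under assumptions: the actual check is a C# test, and the function name `find_mismatches` and its input shapes are hypothetical.

```python
from collections import defaultdict

def find_mismatches(reference, lookup):
    """Compare parser output against reference symbols, tolerating aliases.

    reference: iterable of (rva, raw_name) pairs from the ground truth.
    lookup:    callable mapping an rva to the parser's name, or None.
    """
    # Group every reference name by start address so that aliases
    # (for example D1/D2 destructors) form one acceptable set.
    names_at = defaultdict(set)
    for rva, name in reference:
        names_at[rva].add(name)

    mismatches = []
    for rva, names in sorted(names_at.items()):
        found = lookup(rva)
        if found not in names:
            mismatches.append((rva, found, sorted(names)))
    return mismatches
```

An empty result corresponds to the "zero mismatches" outcome; each entry otherwise records the address, what the parser returned, and the names that would have been accepted.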

Result: zero mismatches across all 568 files and 109,293 symbols.

Contributes to #2382.

@brianrob brianrob marked this pull request as ready for review March 19, 2026 18:37
@brianrob brianrob requested a review from a team as a code owner March 19, 2026 18:37
brianrob and others added 3 commits March 19, 2026 11:56
Add ElfSymbolModule, which reads ELF (Executable and Linkable Format)
files and resolves RVAs to symbol names. This enables PerfView and
TraceEvent to resolve native symbols from Linux ELF binaries when
analyzing universal traces captured on Linux.

Key capabilities:
- Parses both .symtab and .dynsym sections
- Supports 32-bit and 64-bit ELF, little-endian and big-endian
- Implements ISymbolLookup for integration with TraceEvent's symbol
  resolution pipeline
- Demangles Itanium C++ (_Z) and Rust v0 (_R) mangled names

Design for performance:
- Lazy name resolution: symbol names decoded on first FindNameForRva
  hit, not during construction
- LOH-free strtab via SegmentedList<byte> with 64KB segments
- Two-pass section scan pre-allocates all structures with zero resizes
- BitConverter fast path for 64-bit LE symbol entries
- O(log n) zero-allocation lookups via sorted array binary search

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add comprehensive test suite with 26 tests covering all ElfSymbolModule
code paths:
- Error handling: invalid magic, truncated, empty, bad class, no sections
- Format coverage: 64-bit LE, 32-bit LE, 64-bit BE, 32-bit BE
- Symbol filtering: non-function types, zero-value, zero-size, below PT_LOAD
- RVA adjustment with pVaddr and pOffset
- Demangling integration: Itanium C++, Rust v0, plain passthrough
- .dynsym section parsing alongside .symtab
- Edge cases: empty table, RVA zero, 100-symbol binary search stress
- File path constructor validation

ElfBuilder is a test helper that constructs synthetic minimal ELF
binaries in memory, avoiding the need for checked-in test binaries.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add BenchmarkDotNet benchmarks for ELF symbol parsing and lookup using
ElfBuilder-generated synthetic ELF binaries (5, 256, and 10000 symbols).
No external file dependencies.

Parse benchmarks measure construction time and memory allocation.
Lookup benchmarks measure FindNameForRva performance (zero-alloc).

Link ElfBuilder.cs from the test project to share the ELF binary
builder without duplicating code.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@brianrob brianrob force-pushed the brianrob/elf-parser branch from 3328404 to e5cf6d2 Compare March 19, 2026 18:56
@brianrob brianrob merged commit bcc0670 into microsoft:main Mar 20, 2026
5 checks passed
@brianrob brianrob deleted the brianrob/elf-parser branch March 20, 2026 20:04
mitchellvette pushed a commit to mitchellvette/zenvizor that referenced this pull request Jul 1, 2026
Updated
[Microsoft.Diagnostics.Tracing.TraceEvent](https://github.com/Microsoft/perfview)
from 3.1.16 to 3.2.4.

<details>
<summary>Release notes</summary>

_Sourced from [Microsoft.Diagnostics.Tracing.TraceEvent's
releases](https://github.com/Microsoft/perfview/releases)._

## 3.2.4

## Security
This release contains security hardening fixes for a number of
malformed-input parsing and path-traversal vulnerabilities:
- Bounds-checking for malformed event payloads in the BPerf ULZ777
decompressor and event-record parser
- Bounds-checking for malformed metadata in the GCDynamic,
RegisteredTraceEventParser (TDH), Dynamic, and EventPipe V3 parsers
- Bounds-checking for malformed PE CodeView and Resource directory
entries
- Path containment hardening for PDB extraction (zipped ETL + container
PDBs), DiagSession resource extraction, R2R perf map writes, PdbScope
module paths, and dynamic manifest writes
- Path-traversal and command-execution hardening for Source Server
lookups

## What's Changed
* Update CsWin32 Package Version by @​brianrob in
microsoft/perfview#2425
* Fix incorrect field offsets when parsing ETW events with fixed-count
array fields by @​Copilot in
microsoft/perfview#2427
* Retarget Native Profiler Builds To VS 2026 V145 Toolset by @​brianrob
in microsoft/perfview#2428
* Stabilize XamlMessageBox UI-thread dispatch test by @​brianrob in
microsoft/perfview#2430

**Full Changelog**:
microsoft/perfview@v3.2.3...v3.2.4


## 3.2.3

## What's Changed
* Upgrade Microsoft.Windows.CsWin32 to 0.3.209 (GHSA-ghhp-997w-qr28) by
@​Copilot in microsoft/perfview#2409
* Enable Spectre mitigations and linker optimizations for EtwClrProfiler
by @​danmoseley in microsoft/perfview#2410
* Fix 'unhanded' / 'occured' typos in UnhandledExceptionDialog body text
by @​SAY-5 in microsoft/perfview#2413
* Fix GCStats failures on dotnet trace gc-verbose collections (#​2414)
by @​cincuranet in microsoft/perfview#2415
* C entrypoint fixes by @​zachcmadsen in
microsoft/perfview#2421

## New Contributors
* @​SAY-5 made their first contribution in
microsoft/perfview#2413

**Full Changelog**:
microsoft/perfview@v3.2.2...v3.2.3

## 3.2.2

## What's Changed
* Fix PDB Symbol Resolution for Unmerged Windows Traces by @​brianrob in
microsoft/perfview#2407


**Full Changelog**:
microsoft/perfview@v3.2.1...v3.2.2

## 3.2.1

## Native and R2R Symbol Download and Parsing Now Available
As of this release, if you capture a trace using [`dotnet-trace
collect-linux`](https://learn.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-trace#dotnet-trace-collect-linux)
or
[`record-trace`](https://github.com/microsoft/one-collect/tree/main/record-trace),
**native and R2R symbols can now be downloaded and resolved at analysis
time**. All .NET symbols (both native and R2R) are available on the
Microsoft Symbol Server. Additionally, many Azure Linux symbol files are
available on the Microsoft Symbol Server. For those targeting other
distros, PerfView and TraceEvent are capable of pulling those symbol
files from local directories by adding a local symbol path pointing to
the files.

Most of the work for this was completed in PerfView and TraceEvent 3.2.0,
with the final required fixes present in this release.

## What's Changed
* Optimize nettrace-to-TraceLog Conversion by @​brianrob in
microsoft/perfview#2403
* Embed missing System.Text.Json transitive dependencies in PerfView by
@​brianrob in microsoft/perfview#2404


**Full Changelog**:
microsoft/perfview@v3.2.0...v3.2.1

## 3.2.0

## What's Changed
* Fix Debug.Assert failures in SpeedScope tests and
DynamicTraceEventParser by @​brianrob in
microsoft/perfview#2368
* Add TraceParserGen.Tests project and fix code generation bugs by
@​Copilot in microsoft/perfview#2308
* Update UsersGuide.htm by @​AftabAnsari10662 in
microsoft/perfview#2370
* Strip .il and .ni suffixes from TraceModuleFile.Name by @​leculver in
microsoft/perfview#2364
* Handle provider names that start with a numeric digit. by @​brianrob
in microsoft/perfview#2369
* Dispose WebView2 controls before Environment.Exit to prevent finalizer
crash by @​brianrob in microsoft/perfview#2371
* Refactor GetManifestForRegisteredProvider to use XmlWriter by
@​Copilot in microsoft/perfview#2353
* docs: Add investigation guidance for JIT-inlined missing stack frames
by @​Copilot in microsoft/perfview#2377
* Fix spurious BROKEN frame at top of Linux thread stacks in CPU Stacks
viewer by @​Copilot in microsoft/perfview#2375
* Fix NRE in AddUniversalDynamicSymbol for invalid symbol address ranges
by @​brianrob in microsoft/perfview#2376
* Add missing authority parameter to log by @​hoyosjs in
microsoft/perfview#2379
* Replace individual code owners with microsoft/perfview-reviewers group
by @​brianrob in microsoft/perfview#2381
* Fix Dynamic Symbol Resolution for Mappings Shared Across Multiple
Processes in Universal Traces by @​brianrob in
microsoft/perfview#2380
* Implement Symbol Demanglers for Linux Binaries by @​brianrob in
microsoft/perfview#2383
* Fix NullReferenceException race condition in
TraceLog.AllocLookup/FreeLookup by @​Copilot in
microsoft/perfview#2387
* Add typed schema for AllocationSampled (EventID 303, .NET 10+) in
ClrTraceEventParser by @​Copilot in
microsoft/perfview#2388
* Add ElfSymbolModule for Parsing ELF Symbol Tables by @​brianrob in
microsoft/perfview#2384
* Update BDN to latest version. by @​cincuranet in
microsoft/perfview#2389
* Fixed overflow when working with large dumps by @​remilema in
microsoft/perfview#2399
* Fix XamlMessageBox STA Threading Crash from Background Threads by
@​brianrob in microsoft/perfview#2400
* Add ELF Symbol Resolution for Linux .nettrace Traces by @​brianrob in
microsoft/perfview#2397
* Add Missing WCF Event Templates by @​brianrob in
microsoft/perfview#2390

## New Contributors
* @​AftabAnsari10662 made their first contribution in
microsoft/perfview#2370
* @​remilema made their first contribution in
microsoft/perfview#2399

**Full Changelog**:
microsoft/perfview@v3.1.30...v3.2.0

## 3.1.30

## What's Changed
* doc: fix typos by @​chinwobble in
microsoft/perfview#2359
* Fix SourceLink parsing to support both wildcard and exact path
mappings by @​ivberg in microsoft/perfview#2355
* add horizontal scrolling to eventviewer by @​logangeorge01 in
microsoft/perfview#2361
* Add SHA-384 and SHA-512 hash algorithm support for PDB checksums by
@​Copilot in microsoft/perfview#2366

## New Contributors
* @​chinwobble made their first contribution in
microsoft/perfview#2359
* @​logangeorge01 made their first contribution in
microsoft/perfview#2361

**Full Changelog**:
microsoft/perfview@v3.1.29...v3.1.30

## 3.1.29

## What's Changed
* Warn users when circular buffer overflow causes missing type info in
allocation views for selected processes by @​Copilot in
microsoft/perfview#2326
* Special-Case BitMask Parsing by @​brianrob in
microsoft/perfview#2327
* Refactor PEFile and PEHeader to use ReadOnlySpan exclusively with
zero-copy buffer sharing by @​Copilot in
microsoft/perfview#2317
* Fix cdbstack parser dropping last sample and missing metrics by
@​Copilot in microsoft/perfview#2329
* Fix unhandled ArgumentOutOfRangeException when exporting FlameGraph
with unrendered canvas by @​Copilot in
microsoft/perfview#2339
* Add guidance for capturing ETW traces in Kubernetes pods by @​Copilot
in microsoft/perfview#2344
* Fix merge command line order in kubernetes documentation by @​Copilot
in microsoft/perfview#2346
* Fix GetRegisteredOrEnabledProviders() documentation claiming list is
small by @​Copilot in microsoft/perfview#2348
* Fix duplicate stringTable elements in instrumentation manifest by
@​Copilot in microsoft/perfview#2347
* Fix Histogram.AddMetric losing values after single-bucket to array
transition by @​Copilot in
microsoft/perfview#2337
* Fix clipboard copy formatting based on selection dimensions in Stack
Viewer by @​Copilot in microsoft/perfview#2332
* Fix XML escaping in GetManifestForRegisteredProvider by @​Copilot in
microsoft/perfview#2351
* Fix race condition in ProviderNameToGuid causing
ERROR_INSUFFICIENT_BUFFER crashes by @​Copilot in
microsoft/perfview#2357


**Full Changelog**:
microsoft/perfview@v3.1.28...v3.1.29

## 3.1.28

## What's Changed
* Add support for Boolean8 to NetTrace V6. by @​noahfalk in
microsoft/perfview#2318
* Implement A Thread Time View for Universal Traces by @​brianrob in
microsoft/perfview#2320
* Remove Incorrect Argument Description by @​brianrob in
microsoft/perfview#2323

**Full Changelog**:
microsoft/perfview@v3.1.26...v3.1.28

## 3.1.26

Roll-up through 2025/10/10.

* Only dispose non-null handles in `ETWTraceEventSource`
[#​2291](microsoft/perfview#2291)
* Small cleanup in `NettraceUniversalConverter`
[#​2292](microsoft/perfview#2292)
* Fix hyperlink focus visibility in dark mode and improve keyboard
navigation [#​2295](microsoft/perfview#2295)
* Gracefully handle invalid characters in `PATH`
[#​2296](microsoft/perfview#2296)
* Fix copying First/Last columns with pipe symbols to work in time range
input [#​2304](microsoft/perfview#2304)


## 3.1.24

Roll-up through 2025/08/26.

* Implement NuGet Central Package Version Management [#​2262]
* Fix broken stacks warning for universal traces [#​2268]
* Fix jitted code symbols in universal traces to show assembly names
instead of memfd:doublemapper [#​2269]
* Use themed background brush for menu and filter [#​2272]
* Improve rendering and dark mode [#​2274]
* Implement configurable symbol server authentication with /SymbolsAuth
command line argument for PerfView and HeapDump [#​2278]
* Add a themed dialog [#​2276]
* Fix regression: "Goto Item in Callers/Callees" now accumulates across
all threads [#​2284]
* Fix parsing issues and add support for additional events to the Linux
perf text file parser [#​2286]
* Fix TraceLog live session RelatedActivityID/ContainerID corruption by
preserving ExtendedData [#​2285]
* NetTrace LabelList metadata overrides and metadata flushing [#​2281]
* Fix NullReferenceException in ProviderBrowser.LevelSelected when
deselecting level [#​2289]

## 3.1.23

Roll-up through 2025/07/11.

- Fixed TraceEvent CaptureState API to support previously unsupported
keyword configurations. [#​2222]
- Added Exception Stacks view for .nettrace files to enhance exception
diagnostics. [#​2223]
- Corrected outdated documentation references to "GC Heap Alloc Stacks".
[#​2224]
- Fixed off-by-one error in P/Invoke buffer handling for Windows volume
events. [#​2227]
- Fixed broken links in the PerfView user guide. [#​2225]
- Improved error handling by throwing when TdhEnumerateProviders fails,
enabling better diagnostics. [#​2177]
- Added AutomationProperties.Name to the Process Selection DataGrid for
improved accessibility. [#​2239]
- Fixed focus indicator visibility for hyperlinks in dark mode and high
contrast themes. [#​2235]
- Addressed NullReferenceException in Anti-Malware view. [#​2233]
- Fixed WebView2 crash on close by implementing proper disposal pattern.
[#​2230]
- Added support for native AOT gcdumps, expanding compatibility with
modern .NET workloads. [#​2242]
- Fixed NVDA screen reader issue where Theme menu items did not announce
selection state. [#​2237]
- Extended PredefinedDynamicTraceEventParser to support dynamic events
from additional sources. [#​2232]
- Implemented MSFZ symbol format support in SymbolReader. [#​2244]
- Removed usage of DefaultAzureCredential, simplifying authentication
dependencies. [#​2255]
- Added option to hide TimeStamp columns in the EventWindow View menu.
[#​2247]
- Fixed NVDA screen reader reporting incorrect list count for File menu
separators. [#​2257]
- Fixed unhandled exception when double-clicking in scroll bar area with
no content. [#​2254]
- Fixed universal symbol conversion for overlapping mappings. [#​2252]
- Fixed TraceEvent.props to respect ProcessorArchitecture when
RuntimeIdentifier is set. [#​2249]

## 3.1.22

Roll-up through 2025/06/04.

- Added GC Heap Analyzer support for .nettrace files to enhance memory
analysis workflows. [#​2216]
- Introduced PredefinedDynamicTraceEventParser for known
TraceLogging events, improving trace event parsing. [#​2220]
- Enabled selection of process trees in the process selection dialog for
multi-process analysis, allowing deeper inspection across related
processes. [#​2195]
- Implemented sorting for the Duration column in the process selection
dialog using TotalDurationSeconds, improving usability. [#​2194]
- Improved NetTrace parameter parsing for better command-line
flexibility. [#​2200]
- Fixed GetActiveSessionNames to handle ERROR_MORE_DATA, resolving
session enumeration issues. [#​2196]
- Fixed ObjectDisposedException when opening Net OS Heap Alloc Stacks,
improving stability. [#​2212]
- Fixed null reference exception in GenFragmentationPercent method,
enhancing reliability. [#​2211]
 - Fixed TreeView auto-expansion when opening trace files. [#​2218]
- Fixed StackViewer issue where "Set Time Range" reset "Goto Items by
callees". [#​2208]
- Fixed markdown table formatting when copying from the stack viewer.
[#​2203]
- Fixed TraceEvent NuGet package to exclude Windows-specific native
DLLs. [#​2215]
- Removed PDB generation for .NET Core assemblies using CrossGen,
reducing build overhead. [#​2202]
- Made symbol server timeout configurable and removed dead code in
SymbolReader. [#​2209]
- Changed help ribbons to use textblocks, enabling tab navigation.
[#​2201]

## 3.1.21

Roll-up through 2025/05/02.

- Change NetTrace format version support
- Add /OSHeapMaxMB to set a max size for OS heap sessions
- Implement Nettrace Support for Traces with Universal Providers
- Implement R2R Symbol Lookup for Linux Traces
- Fix NetTrace parsing for Start/Stop event names
- Fix IndexOutOfRangeException in ProcessGlobalHistory

## 3.1.20

Roll-up through 2025/04/01.

- Flamegraph and drill-in menu improvements
- Performance improvements around unhandled event dispatch
- Add configurable real-time delay in TraceLogEventSource
- Don't queue another flush during a real-time ETW session if one is
already in-process
- Allow configuration of rundown providers for real-time EventPipe
sessions
- Fix stack handling for NetTrace V4
- Add multi-line view for events viewer
- Misc accessibility fixes

## 3.1.19

Roll-up through 2025/01/30.

- Added missing time information in the Raw XML View for GCStats.
- Updated activity computation logic to support OpenTelemetry events.
- Changed timestamp values to use QPC time based on UTC for Relogger.
- Fixed issues with report command handling.
- Addressed various POH-related issues.
- Implemented file-size limitation for rundown using an ETW-based
approach.

## 3.1.18

Roll-up through 2024/12/11.

- Fixed `perfcollect` install script on Azure Linux 3.
- Updated `System.Text.Json` to address
dotnet/announcements#329.

## 3.1.17

Roll-up through 2024/11/08.

- Numerous accessibility fixes to PerfView. This includes switching out
the previous web browser plugin to use WebView2.
- Breaking changes to FastSerialization to ensure that only expected
types are deserialized. This addresses potential vulnerabilities during
deserialization of untrusted input. Details at
microsoft/perfview#2121.

Commits viewable in [compare
view](microsoft/perfview@v3.1.16...v3.2.4).
</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>