Description
Is your feature request related to a problem?
For the initial iteration of zPages, more specifically TraceZ, spans are collected and temporarily stored by the TraceZ span processor.
The OpenTelemetry (OT) C++ span processor interface doesn't store any spans by default, and the other current implementations at most hand ownership of a span to an exporter. The TraceZ span processor deviates from these by storing both running and completed spans: it keeps references to running spans and takes ownership of completed ones.
The processor's containers are modified whenever a snapshot getter is called or a span starts or ends. Because these functions share the same containers, calling them concurrently is not thread-safe: unsynchronized simultaneous reads and writes to the same memory are a data race, which is undefined behavior in C++ and can corrupt the containers or crash the program.
To make the span processor thread-safe, lock guards were added to these functions. At a large scale, where many spans could be processed at once, every span start, span end, and snapshot request would have to wait on these locks, which could make TraceZ slow.
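For reference, the lock-guard approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual TraceZ span processor: `SketchSpan`, `LockedSpanStore`, and its method names are hypothetical stand-ins, and a single mutex is assumed to guard both containers.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a span; not the OpenTelemetry C++ span type.
struct SketchSpan {
  std::string name;
};

// Hypothetical store that mirrors the described behavior: it keeps
// references to running spans and takes ownership of completed ones.
class LockedSpanStore {
public:
  struct Snapshot {
    std::unordered_set<SketchSpan*> running;
    std::vector<std::unique_ptr<SketchSpan>> completed;
  };

  void OnStart(SketchSpan* span) {
    assert(span != nullptr);
    std::lock_guard<std::mutex> lock(mtx_);  // serializes with OnEnd/GetSnapshot
    running_.insert(span);
  }

  void OnEnd(std::unique_ptr<SketchSpan> span) {
    assert(span != nullptr);
    std::lock_guard<std::mutex> lock(mtx_);
    running_.erase(span.get());
    completed_.push_back(std::move(span));
  }

  // Copies the running set and hands over all completed spans collected
  // since the previous call.
  Snapshot GetSnapshot() {
    std::lock_guard<std::mutex> lock(mtx_);
    Snapshot snap;
    snap.running = running_;
    snap.completed = std::move(completed_);
    completed_.clear();  // leave the moved-from vector in a known empty state
    return snap;
  }

private:
  std::mutex mtx_;
  std::unordered_set<SketchSpan*> running_;
  std::vector<std::unique_ptr<SketchSpan>> completed_;
};
```

In this sketch every call takes the same mutex, so a slow snapshot blocks all span starts and ends for its duration, which is the scaling concern raised above.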
Describe the solution you'd like
We want to consider solutions that are thread-safe but also fast. Some proposed solutions include:
- Use a proxy/shim instead of a processor, similar to codeless attach profiling tools like strace @maxgolov
- Store spans in a lock-free circular buffer @pyohannes
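To make the circular buffer proposal more concrete, here is a minimal sketch of a lock-free ring buffer. It is deliberately restricted to a single producer thread and a single consumer thread, which is an assumption made to keep the example short; spans ending on multiple threads would need a multi-producer design (for example, slot reservation with compare-and-swap). `SketchSpan` and `SpscSpanRing` are hypothetical names, not existing OpenTelemetry C++ types.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a span; not the OpenTelemetry C++ span type.
struct SketchSpan {
  std::string name;
};

// Single-producer/single-consumer ring buffer (assumption: exactly one
// thread calls TryPush and exactly one thread calls TryPop).
// One slot is always left empty, so it holds at most Capacity - 1 spans.
template <std::size_t Capacity>
class SpscSpanRing {
  static_assert(Capacity >= 2, "need at least one usable slot");

public:
  // Producer side. Returns false and leaves `span` untouched when full.
  bool TryPush(std::unique_ptr<SketchSpan>&& span) {
    assert(span != nullptr);  // nullptr is reserved to mean "empty" in TryPop
    const std::size_t head = head_.load(std::memory_order_relaxed);
    const std::size_t next = (head + 1) % Capacity;
    if (next == tail_.load(std::memory_order_acquire)) {
      return false;  // buffer is full
    }
    slots_[head] = std::move(span);
    // Release: the slot write above becomes visible before the new head.
    head_.store(next, std::memory_order_release);
    return true;
  }

  // Consumer side. Returns nullptr when the buffer is empty.
  std::unique_ptr<SketchSpan> TryPop() {
    const std::size_t tail = tail_.load(std::memory_order_relaxed);
    if (tail == head_.load(std::memory_order_acquire)) {
      return nullptr;  // buffer is empty
    }
    std::unique_ptr<SketchSpan> span = std::move(slots_[tail]);
    // Release: the slot is vacated before the producer may reuse it.
    tail_.store((tail + 1) % Capacity, std::memory_order_release);
    return span;
  }

private:
  std::array<std::unique_ptr<SketchSpan>, Capacity> slots_;
  std::atomic<std::size_t> head_{0};  // next slot to write (producer-owned)
  std::atomic<std::size_t> tail_{0};  // next slot to read (consumer-owned)
};
```

Neither side ever blocks: a full buffer makes `TryPush` fail instead of waiting, so a real design would also have to decide whether to drop the span or overwrite the oldest one.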
Describe alternatives you've considered
Ideally, we also want to reduce contention (how long threads spend waiting on each other to access the same places in memory). We attempted to do this through copy-on-write and are considering other methods as well.
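As an illustration of the copy-on-write idea (not the code from the actual attempt), the sketch below publishes an immutable list of spans behind a shared pointer. `SketchSpan` and `CowSpanList` are hypothetical names, and for portability the pointer swap is protected by a short-lived mutex rather than atomic shared pointer operations.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a span; not the OpenTelemetry C++ span type.
struct SketchSpan {
  std::string name;
};

// Copy-on-write list: the published list is never modified in place.
class CowSpanList {
public:
  using List = std::vector<std::shared_ptr<const SketchSpan>>;

  CowSpanList() : current_(std::make_shared<List>()) {}

  // Readers hold the lock only long enough to copy one shared_ptr, then
  // iterate over their snapshot without any lock.
  std::shared_ptr<const List> Snapshot() const {
    std::lock_guard<std::mutex> lock(mtx_);
    return current_;
  }

  // Writers copy the current list, modify the copy, and publish it.
  // Snapshots taken earlier keep pointing at the old, unchanged list.
  void Add(std::shared_ptr<const SketchSpan> span) {
    assert(span != nullptr);
    std::lock_guard<std::mutex> lock(mtx_);
    std::shared_ptr<List> next = std::make_shared<List>(*current_);
    next->push_back(std::move(span));
    current_ = std::move(next);
  }

private:
  mutable std::mutex mtx_;
  std::shared_ptr<const List> current_;
};
```

The trade-off in this sketch is that each `Add` copies the whole list while holding the lock, so it shifts cost from readers to writers rather than removing it.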
Additional context
- TraceZ Span Processor PR
- TraceZ Span Processor Design Doc